I'm trying to evaluate a new model with LongBench and would like to load the datasets from local files (downloaded and unzipped directly from HuggingFace). But whenever I read the data with split="test" in pred.py (say we are reading xxx.jsonl inside the loop, with the line changed to data = load_dataset("json", data_files="/some/dir/xxx.jsonl", split="test")), it raises ValueError: Unknown split "test". Should be one of ['train']. Is there any pre-processing I should perform on the downloaded data? Thanks in advance.
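For reference, here is a minimal sketch of one way around this error that should need no pre-processing of the files. It assumes the Hugging Face `datasets` library, whose "json" loader puts a file passed as a plain string into the default "train" split; passing `data_files` as a dict names the split explicitly. The helper names `split_files` and `load_local_split` are hypothetical, and the path is the placeholder from the question.

```python
def split_files(path, split="test"):
    """Build the {split_name: file_path} mapping that `data_files` accepts."""
    return {split: str(path)}


def load_local_split(path, split="test"):
    """Load one local JSON Lines file and expose it under `split`."""
    # Imported here so the mapping helper above works even without `datasets`.
    from datasets import load_dataset

    # With a dict, the file is registered as the named split, so asking for
    # split="test" no longer fails with "Unknown split".
    return load_dataset("json", data_files=split_files(path, split), split=split)


# Hypothetical usage inside the loop in pred.py:
# data = load_local_split("/some/dir/xxx.jsonl")
```

Alternatively, keeping `data_files` as a plain string and asking for split="train" should return the same rows, since a single unnamed file is loaded as the "train" split.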