TsinghuaAI / CPM-2-Finetune

Finetune CPM-2
MIT License

How should the dataset be processed? I downloaded the LCSTS dataset and the program fails with an error when run. #32

Closed Chunhui-Zou closed 2 years ago

Chunhui-Zou commented 2 years ago

Traceback (most recent call last):
  File "/amax/home/zouchunhui/CPM-2-Finetune-master/finetune_cpm2.py", line 693, in <module>
    main()
  File "/amax/home/zouchunhui/CPM-2-Finetune-master/finetune_cpm2.py", line 664, in main
    train_dataloader, train_dataset = load_data(args, data_config, 'train', tokenizer, prompt_config, ratio=args.train_ratio, num=args.train_num)
  File "/amax/home/zouchunhui/CPM-2-Finetune-master/finetune_cpm2.py", line 540, in load_data
    prompt_config=prompt_config)
  File "/amax/home/zouchunhui/CPM-2-Finetune-master/CPM2Datasets.py", line 440, in __init__
    super(LCSTSDataset, self).__init__(args, tokenizer, path, split, ratio, num, prefix, add_target_post, cache_path, do_infer, prompt_config)
  File "/amax/home/zouchunhui/CPM-2-Finetune-master/CPM2Datasets.py", line 35, in __init__
    self.data, self.max_enc_len, self.max_dec_len = self.process_data()
  File "/amax/home/zouchunhui/CPM-2-Finetune-master/CPM2Datasets.py", line 450, in process_data
    obj = json.loads(line)
  File "/amax/home/zouchunhui/anaconda3/envs/py36/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/amax/home/zouchunhui/anaconda3/envs/py36/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/amax/home/zouchunhui/anaconda3/envs/py36/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Hzfinfdu commented 2 years ago

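Perhaps the dataset you downloaded is in txt or tsv format? In CPM2Datasets, process_data calls json.loads on each line, so a line that is not JSON fails at the first character. You can change the JSON-reading code there to read your format instead. If you haven't solved this yet, this might help.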
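A minimal sketch of that suggestion, assuming each line of the downloaded file is either a JSON object or a tab-separated pair. The helper name `parse_line`, the key names `text` and `summary`, and the column order (summary first, then text) are all hypothetical; check them against the keys that `process_data` in CPM2Datasets.py actually reads and against your own file before using anything like this.

```python
import json


def parse_line(line, text_key="text", summary_key="summary"):
    """Parse one dataset line into a dict, or return None for a blank line.

    Hypothetical helper: the key names and the TSV column order are
    assumptions, not taken from the CPM-2-Finetune code.
    """
    line = line.strip()
    if not line:
        # json.loads("") raises "Expecting value: line 1 column 1 (char 0)",
        # so blank lines are skipped instead of parsed.
        return None
    try:
        obj = json.loads(line)
        if isinstance(obj, dict):
            return obj
    except json.JSONDecodeError:
        pass  # not JSON; fall through to the tab-separated case
    parts = line.split("\t")
    if len(parts) < 2:
        raise ValueError("line is neither a JSON object nor tab-separated: %r" % line)
    # Assumed column order: summary, then source text.
    return {summary_key: parts[0], text_key: parts[1]}
```

Inside `process_data`, the `obj = json.loads(line)` call could then be replaced with `obj = parse_line(line)`, followed by skipping the line when the result is `None`. Converting the file to one JSON object per line ahead of time would be an alternative that leaves the dataset class untouched.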