nlpyang / PreSumm

code for EMNLP 2019 paper Text Summarization with Pretrained Encoders
MIT License

issue for converting to bert_data #240

Closed connie-n closed 2 years ago

connie-n commented 2 years ago

I tried to convert the JSON data to BERT data (data preprocessing step 5), but it returned an empty file. Could anyone help me?

I ran this code with both the cnndm data and my own data. The cnndm data was converted to BERT data successfully, but my own data wasn't. I adjusted min_src_tokens/min_tgt_tokens, but it is still not working.


kush-2418 commented 2 years ago

In the data_builder.py file, in the format_to_bert function, use this loop:

for json_f in glob.glob(pjoin(args.raw_path, '.' + corpus_type + '.[0-9]*.json')):
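If the glob pattern matches none of your file names, the loop has nothing to convert, which could explain an empty result. Here is a minimal sketch for checking your own file names against a glob-style pattern before running the conversion. The function name files_for_split, the pattern template, and the example file names are all hypothetical and used only for illustration; they are not taken from PreSumm, so substitute the pattern your copy of data_builder.py actually uses.

```python
import fnmatch


def files_for_split(filenames, corpus_type, template="*{split}.[0-9]*.json"):
    """Return the file names that a glob-style pattern would pick up for one split.

    `template` is a hypothetical pattern for illustration only; replace it
    with the pattern used in your own script.
    """
    pattern = template.format(split=corpus_type)
    return [name for name in filenames if fnmatch.fnmatch(name, pattern)]


# Hypothetical file names: the first two follow a "<prefix>.<split>.<n>.json"
# scheme, the last one does not and is therefore skipped by the pattern.
names = ["cnndm.train.0.json", "mydata.train.12.json", "mydata_train.json"]
print(files_for_split(names, "train"))
```

If the list printed for your own files is empty, renaming the files to fit the pattern (or adjusting the pattern) is worth trying before changing the token thresholds.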

yassminSaber commented 9 months ago

@connie-n, how did you solve this problem?