Hello,
Due to external constraints, I am unable to download the preprocessed data from Google Drive. However, I do have the raw .story files on hand, so I was going through the steps to preprocess the data myself. To start, I used 250 story files to make sure the steps work. Step 3 worked like a charm. However, when running Step 4, while it generates the JSON-line files fine, it results in an error at the very end:
Traceback (most recent call last):
  File "preprocess.py", line 63, in <module>
    eval('data_builder.'+args.mode + '(args)')
  File "<string>", line 1, in <module>
  File "..../BertSum/src/prepro/data_builder.py", line 315, in format_to_lines
    with open(pt_file, 'w') as save:
FileNotFoundError: [Errno 2] No such file or directory: '../json_data/cnndm.train.0.json'
Steps to Reproduce
Follow the steps for preprocessing the data yourself in the README. Steps 1-3 complete without issue; the command given in Step 4 produces the error at the end.
- RAW_PATH is the directory containing the tokenized files (../merged_stories_tokenized)
- JSON_PATH is the target path prefix for the generated JSON files (../json_data/cnndm)
- MAP_PATH is the directory containing the urls files (../urls)
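For what it's worth, the traceback suggests that `open(pt_file, 'w')` fails because the parent directory of JSON_PATH (`../json_data`) does not exist at write time; `open` in write mode creates the file but not missing directories. A minimal sketch of a guard that avoids this (the `save_jsonl` helper and paths here are hypothetical, not BertSum's actual code):

```python
import json
import os
import tempfile

def save_jsonl(dataset, pt_file):
    # Create the parent directory first; otherwise open(..., 'w') raises
    # FileNotFoundError ([Errno 2]) when the directory is missing.
    os.makedirs(os.path.dirname(pt_file), exist_ok=True)
    with open(pt_file, 'w') as save:
        save.write(json.dumps(dataset))

# Illustrative usage with a throwaway directory standing in for JSON_PATH.
tmp = tempfile.mkdtemp()
target = os.path.join(tmp, 'json_data', 'cnndm.train.0.json')
save_jsonl([{'src': ['hello'], 'tgt': ['world']}], target)
```

Manually running `mkdir` on the target directory before Step 4 would likely work around the error as well.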
Thank you!