wasiahmad / PLBART

Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
https://arxiv.org/abs/2103.06333
MIT License

Missing all.tok file when running data/github/preprocessing/preprocess.py #53

Closed by kdwedage 1 month ago

kdwedage commented 7 months ago

I have downloaded the Java and Python GitHub data as outlined in the README, and I tried to run the preprocessing script with the following command:

python -m preprocessing.preprocess \
    path_2_github_data \
    --lang1 java \
    --lang2 python \
    --test_size 10000;

Upon completion, I get the following error: FileNotFoundError: [Errno 2] No such file or directory: '../../../path_2_github_data/java/all.tok'

The directory '../../../path_2_github_data/java/' does exist, but I can confirm that it does not contain all.tok.
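To make a report like this easier to reproduce, here is a minimal diagnostic sketch that checks each language directory for all.tok and lists what the directory actually contains. The helper name `report_missing` is hypothetical and not part of the PLBART repository. The `java` and `python` subdirectory layout is assumed from the error message above, and `path_2_github_data` is the same placeholder used in the command.

```python
from pathlib import Path


def report_missing(data_root, langs=("java", "python"), filename="all.tok"):
    """Return {lang: (file_exists, sorted directory listing)}.

    Hypothetical helper for diagnosis only; it does not create or
    modify any files, and it is not part of the PLBART code base.
    """
    report = {}
    for lang in langs:
        lang_dir = Path(data_root) / lang
        # An absent directory is reported as an empty listing.
        listing = (
            sorted(p.name for p in lang_dir.iterdir()) if lang_dir.is_dir() else []
        )
        report[lang] = ((lang_dir / filename).is_file(), listing)
    return report


if __name__ == "__main__":
    # "path_2_github_data" is the placeholder from the command above;
    # replace it with the real download location.
    for lang, (found, listing) in report_missing("path_2_github_data").items():
        status = "found" if found else "MISSING"
        print(f"{lang}: all.tok {status}; directory contains {listing}")
```

Including the printed listing in the issue shows which files the download step produced, which helps narrow down where all.tok was expected to come from.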