nlpyang / PreSumm

code for EMNLP 2019 paper Text Summarization with Pretrained Encoders
MIT License
1.29k stars 465 forks

error in step 3 #248

Open Marwan1137 opened 11 months ago

Marwan1137 commented 11 months ago

(test) PS C:\Users\marwa\Downloads\Compressed\PreSumm-master\src> python preprocess.py -mode tokenize -raw_path "C:\Users\marwa\Downloads\Compressed\PreSumm-master\cnn\stories" -save_path "C:\Users\marwa\Downloads\Compressed\PreSumm-master\merged stories"
Preparing to tokenize C:\Users\marwa\Downloads\Compressed\PreSumm-master\cnn\stories to C:\Users\marwa\Downloads\Compressed\PreSumm-master\merged stories...
Making list of files to tokenize...
Tokenizing 304356 files in C:\Users\marwa\Downloads\Compressed\PreSumm-master\cnn\stories and saving in C:\Users\marwa\Downloads\Compressed\PreSumm-master\merged stories...
Error: Could not find or load main class edu.stanford.nlp.pipeline.StanfordCoreNLP
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLP
Stanford CoreNLP Tokenizer has finished.
Traceback (most recent call last):
  File "preprocess.py", line 73, in <module>
    eval('data_builder.'+args.mode + '(args)')
  File "<string>", line 1, in <module>
  File "C:\Users\marwa\Downloads\Compressed\PreSumm-master\src\prepro\data_builder.py", line 137, in tokenize
    tokenized_stories_dir, num_tokenized, stories_dir, num_orig))
Exception: The tokenized stories directory C:\Users\marwa\Downloads\Compressed\PreSumm-master\merged stories contains 0 files, but it should contain the same number as C:\Users\marwa\Downloads\Compressed\PreSumm-master\cnn\stories (which has 304356 files). Was there an error during tokenization?

WSChange commented 8 months ago

Check your library. Use the same library as the author uses.
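
For anyone hitting the same message: the `ClassNotFoundException` for `edu.stanford.nlp.pipeline.StanfordCoreNLP` in the log above comes from Java, not from PreSumm, and it typically means the Stanford CoreNLP jar is not on the `CLASSPATH` that the `java` process sees. The snippet below is only a rough sketch of a pre-flight check you could run before the tokenize step. The helper name `corenlp_jars_on_classpath` and the jar filename pattern are my own assumptions, not part of the repo, and it does not expand wildcard entries such as `some\dir\*`.

```python
import os
import re

# Hypothetical helper, not part of PreSumm. The pattern assumes the jar is
# named like "stanford-corenlp-<version>.jar", which is an assumption.
_JAR_PATTERN = re.compile(r"^stanford-corenlp-.*\.jar$", re.IGNORECASE)


def corenlp_jars_on_classpath(classpath, sep=os.pathsep):
    """Return CLASSPATH entries that look like a CoreNLP jar and exist on disk.

    Wildcard entries (for example a directory followed by *) are not expanded.
    """
    # Split on the platform separator (";" on Windows, ":" elsewhere) and
    # drop empty entries and surrounding quotes.
    entries = [e.strip().strip('"') for e in classpath.split(sep) if e.strip()]
    return [
        e for e in entries
        if _JAR_PATTERN.match(os.path.basename(e)) and os.path.isfile(e)
    ]


if __name__ == "__main__":
    found = corenlp_jars_on_classpath(os.environ.get("CLASSPATH", ""))
    if found:
        print("CoreNLP jar(s) found on CLASSPATH:", found)
    else:
        print("No Stanford CoreNLP jar found on CLASSPATH. "
              "The tokenize step will likely fail with ClassNotFoundException.")
```

If this reports nothing, set `CLASSPATH` to the full path of the CoreNLP jar in the same shell session and rerun the tokenize command. Which CoreNLP release to use is best taken from the repo's README.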