TianlinZhang668 opened this issue 5 years ago
I am running corenlp-3.9.2.jar.
You need stanford-corenlp-3.7.0.jar. See this: https://github.com/abisee/cnn-dailymail#2-download-stanford-corenlp
Please read the README.md file.
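For reference, step 2 of the linked README amounts to pointing CLASSPATH at that jar and then sanity-checking the tokenizer. The install path below is only an example and depends on where you unpacked CoreNLP:

# Example path; use wherever you unzipped the CoreNLP 3.7.0 release
export CLASSPATH=/path/to/stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar
# Quick check that the tokenizer class is now visible to java
echo "Please tokenize this text." | java edu.stanford.nlp.process.PTBTokenizer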
Successfully finished tokenizing /home/ztl/Downloads/cnn_stories/cnn/stories to cnn_stories_tokenized.
Making bin file for URLs listed in url_lists/all_test.txt...
Traceback (most recent call last):
File "make_datafiles.py", line 239, in
I have got the tokenizing done, but the next step fails as shown above ...
Try this: https://github.com/JafferWilson/Process-Data-of-CNN-DailyMail
I guess it will solve your tokenization problem and the rest of your issues.
What if I have article content whose structure isn't the same as the structure of the CNN articles?
@quanghuynguyen1902 I guess you have already opened a new issue: https://github.com/abisee/cnn-dailymail/issues/29
Let's continue there. Could someone please close this issue?
I am facing the same issue here.
source ./.bash_profile
I ran make_datafiles.py, but it fails with this error:
Preparing to tokenize /home/ztl/Downloads/cnn_stories/cnn/stories to cnn_stories_tokenized...
Making list of files to tokenize...
Tokenizing 92579 files in /home/ztl/Downloads/cnn_stories/cnn/stories and saving in cnn_stories_tokenized...
Error: Could not find or load main class edu.stanford.nlp.process.PTBTokenizer
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.process.PTBTokenizer
Stanford CoreNLP Tokenizer has finished.
Traceback (most recent call last):
However, I can run echo "Please tokenize this text." | java edu.stanford.nlp.process.PTBTokenizer in the root, so I don't know how to deal with this. Thanks a lot.
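If the class is found from one shell but not when make_datafiles.py runs, the usual cause is that CLASSPATH isn't set in the environment of the shell (or user) that launches the script, since the script just spawns java edu.stanford.nlp.process.PTBTokenizer as a subprocess and inherits that environment. A minimal sketch of how to check and fix that, assuming a bash setup (the jar path and story paths are examples):

# See whether the shell you run the script from actually has the jar on the classpath
echo $CLASSPATH
# If it is empty, add the export to your own profile (example path) and reload it
echo 'export CLASSPATH=/path/to/stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar' >> ~/.bash_profile
source ~/.bash_profile
# Rerun the preprocessing from this same shell (example story paths)
python make_datafiles.py /path/to/cnn/stories /path/to/dailymail/stories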