c-amr / camr

Transition-based tree-to-graph AMR Parser
GNU General Public License v2.0

Problem of the input data format in preprocessing. #25

Open Rainbow0625 opened 4 years ago

Rainbow0625 commented 4 years ago

"The input data format for parsing should be raw document with one sentence per line."

I put a single sentence, ending in a period like the one above, into a file with no suffix (no file extension), but the files produced by preprocessing are all 0 bytes. Why is that?

Please help me. Thank you very much!
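For reference, the quoted requirement (a raw document with one sentence per line) can be sketched in Python as follows. This is only an illustration of the file layout: `write_sentences` is a hypothetical helper, not part of camr, and the path passed to it is whatever name the input file is given.

```python
from pathlib import Path


def write_sentences(path, sentences):
    """Write a raw input file with exactly one sentence per line.

    Hypothetical helper for illustration; it is not part of camr.
    Returns the number of sentences written.
    """
    cleaned = []
    for sentence in sentences:
        sentence = sentence.strip()
        # Reject empty entries and entries that would span several lines.
        if not sentence or "\n" in sentence or "\r" in sentence:
            raise ValueError(f"not a single-line sentence: {sentence!r}")
        cleaned.append(sentence)
    # One sentence per line, with a trailing newline after the last one.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)
```

Calling `write_sentences(path, ["The boy wants to go."])` produces a one-line file of the kind described above.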

Rainbow0625 commented 4 years ago

After I changed the suffix of the input file to 'xxx.sent', I got a new error:

Start Stanford CoreNLP...
java -Xmx2500m -cp stanfordnlp/stanford-corenlp-full-2015-04-20/stanford-corenlp-3.5.2.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/stanford-corenlp-3.5.2-models.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/joda-time.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/xom.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/jollyday.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/protobuf.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/javax.json.jar:stanfordnlp/stanford-corenlp-full-2015-04-20/ejml-0.23.jar edu.stanford.nlp.pipeline.StanfordCoreNLP -props stanfordnlp/default.properties
Loading Models: 4/4
Read token,lemma,name entity file rawData.sent.prp...

[ERROR] Timeout
Traceback (most recent call last):
  File "/Users/Rainbow/Desktop/AMR/AMRParsing/stanfordnlp/corenlp.py", line 508, in parse
    data = parse_parser_results_new(result)
  File "/Users/Rainbow/Desktop/AMR/AMRParsing/stanfordnlp/corenlp.py", line 154, in parse_parser_results_new
    seqs = re.split("\r\n", text)
  File "/anaconda3/lib/python3.7/re.py", line 213, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object

Traceback (most recent call last):
  File "amr_parsing.py", line 437, in <module>
    main()
  File "amr_parsing.py", line 170, in main
    instances = preprocess(amr_file,START_SNLP=True,INPUT_AMR=args.amrfmt, PRP_FORMAT=args.prpfmt)
  File "/Users/Rainbow/Desktop/AMR/AMRParsing/preprocessing.py", line 439, in preprocess
    instances = proc1.parse(tmp_sent_filename)
  File "/Users/Rainbow/Desktop/AMR/AMRParsing/stanfordnlp/corenlp.py", line 511, in parse
    raise e
  File "/Users/Rainbow/Desktop/AMR/AMRParsing/stanfordnlp/corenlp.py", line 508, in parse
    data = parse_parser_results_new(result)
  File "/Users/Rainbow/Desktop/AMR/AMRParsing/stanfordnlp/corenlp.py", line 154, in parse_parser_results_new
    seqs = re.split("\r\n", text)
  File "/anaconda3/lib/python3.7/re.py", line 213, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object
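
The final TypeError is what Python's `re.split` raises when the value to split is not a string. A minimal sketch that reproduces it, assuming (the log does not confirm this) that the timeout left `text` as `None`. `split_result` is a hypothetical guard written for illustration and is not a function in camr's `corenlp.py`.

```python
import re


def split_result(text):
    """Split parser output on CRLF, failing early on non-string input.

    Hypothetical guard for illustration; not part of camr.
    """
    if not isinstance(text, str):
        # Assumption: a timed-out call hands back None instead of text.
        raise RuntimeError(
            f"expected parser output as str, got {type(text).__name__}"
        )
    return re.split("\r\n", text)


if __name__ == "__main__":
    try:
        # Same call as corenlp.py line 154, with a non-string argument.
        re.split("\r\n", None)
    except TypeError as exc:
        print("TypeError:", exc)
```

With a guard like this, the failure would be reported where the missing output is first seen rather than inside the `re` module.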