microth / mateplus

Extension of the mate-tools NLP pipeline
GNU General Public License v2.0

No identified predicates #7

Closed: jspreston closed this issue 6 years ago

jspreston commented 6 years ago

Hi, I've set up mateplus as described in the README. When I run the parse.sh script on a file of raw sentences (one sentence per line), I get a correctly parsed output file, but the SRL component does not report processing any sentences, and the output contains no identified predicates or arguments. The output of running parse.sh is below:

8.29.407   is2.data.ParametersFloat -1:read ->         read parameters 4000001 not zero 275524
8.29.411   is2.data.Cluster -1:<init> ->               Read cluster with 0 words
8.29.411   is2.lemmatizer.Lemmatizer -1:readModel ->   Loading data finished.
8.29.411   is2.lemmatizer.Lemmatizer -1:readModel ->   number of params  4000001
8.29.411   is2.lemmatizer.Lemmatizer -1:readModel ->   number of classes 89
8.29.867   is2.data.ParametersFloat -1:read ->         read parameters 4000001 not zero 1321879
8.29.867   is2.data.Cluster -1:<init> ->               Read cluster with 0 words
8.29.868   is2.tag.Lexicon -1:<init> ->                Read lexicon with 0 words
8.29.868   is2.tag.Tagger -1:readModel ->              Loading data finished.
8.29.881   is2.parser.Parser -1:readModel ->           Reading data started
8.29.903   is2.data.Cluster -1:<init> ->               Read cluster with 0 words
8.31.176   is2.parser.ParametersFloat -1:read ->       read parameters 12000001 not zero 9789656
8.31.177   is2.parser.Parser -1:readModel ->           parsing -- li size 12000001
8.31.182   is2.parser.Parser -1:readModel ->           Stacking false
8.31.182   is2.parser.Extractor -1:initStat ->         mult  (d4)
Used parser   class is2.parser.Parser
Creation date 2013.11.14 01:13:28
Training data /home/bohnetb/corpora/connl_09_st_eng_train/CoNLL2009-ST-English-train.txt
Iterations    10 Used sentences 10000000
Cluster       null
8.31.184   is2.parser.Parser -1:readModel ->           Reading data finnished
8.31.184   is2.parser.Extractor -1:initStat ->         mult  (d4)
Loading pipeline from models/srl-EMNLP14+fs-eng.model
Loading reranker from models/srl-EMNLP14+fs-eng.model
Writing corpus to myout.txt...
Processing sentence 100
Processing sentence 200
Memory usage:
Allocated:          2,340,864kb
Used:               1,362,581kb
Free:               978,283kb
Free (after gc call):   820,613kb

Tokenizer: StanfordPTBTokenizer
Tokenizer time:  120
Lemmatizer time: 569
Tagger time:     1,190
MTagger time:    0
Parser time:     5,350

Time spent doing semantic role labeling (ms): 0

Semantic role labeler started at Wed Apr 18 05:08:31 EDT 2018
Time spent loading SRL models (ms)      0
Time spent parsing semantic roles (ms)      0

Number of sentences 0
Number of predicates    0
SRL speed (ms/sen)  NaN
Reranker status:
AI beam:        4
AC beam:        4
Alfa:           1.0

Reranker choices:
Rank    Frequency
1   0
2   0
3   0
4   0
5   0
6   0
7   0
8   0
9   0
10  0
11  0
12  0
13  0
14  0
15  0
16  0

Number of zero size argmaps:    0

Total parsing time (ms):  7,301
Overall speed (ms/sen):   27

Is there something I'm doing incorrectly?
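
For anyone hitting the same symptom, here is a minimal sketch of how the summary lines in the log above could be checked automatically. It assumes the summary keeps the "Number of sentences" and "Number of predicates" wording shown in this output; the function names `srl_summary` and `srl_was_skipped` are made up for illustration and are not part of mateplus.

```python
import re


def srl_summary(log_text):
    """Extract the SRL sentence and predicate counts from parse.sh output.

    Assumes summary lines of the form "Number of sentences <n>" and
    "Number of predicates <n>", as in the log quoted in this thread.
    """
    counts = {}
    for key in ("sentences", "predicates"):
        match = re.search(rf"Number of {key}\s+([\d,]+)", log_text)
        if match:
            # Counts may contain thousands separators, e.g. "1,234".
            counts[key] = int(match.group(1).replace(",", ""))
    return counts


def srl_was_skipped(log_text):
    """Return True when the summary reports zero sentences handled by SRL."""
    return srl_summary(log_text).get("sentences") == 0


sample = """Number of sentences 0
Number of predicates    0
SRL speed (ms/sen)  NaN"""

print(srl_summary(sample))
print(srl_was_skipped(sample))
```

A result of `True` from `srl_was_skipped` only says that the SRL stage reported no sentences; it does not say why.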

jspreston commented 6 years ago

Ah, I found my mistake. I had downloaded anna-3.61.jar instead of the required anna-3.3.jar. Predicates and arguments are now being labeled.
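
A quick way to catch this kind of mismatch before running parse.sh is to look at which anna jar is actually present. The sketch below is only an illustration: the directory you pass in (for example a `lib` folder) is an assumption about where the jars were placed, and `anna_versions` and `check_anna` are hypothetical helper names, not part of mateplus or mate-tools.

```python
import re
from pathlib import Path

# Version named in this thread as the one mateplus requires.
REQUIRED_ANNA = "3.3"


def anna_versions(lib_dir):
    """Return the versions of all anna-<version>.jar files in lib_dir."""
    directory = Path(lib_dir)
    if not directory.is_dir():
        return []
    pattern = re.compile(r"^anna-(\d+(?:\.\d+)*)\.jar$")
    found = []
    for path in sorted(directory.iterdir()):
        match = pattern.match(path.name)
        if match:
            found.append(match.group(1))
    return found


def check_anna(lib_dir, required=REQUIRED_ANNA):
    """Report whether exactly the required anna jar is present in lib_dir."""
    versions = anna_versions(lib_dir)
    if versions == [required]:
        return "ok"
    if not versions:
        return "missing: no anna jar found"
    return f"mismatch: found {versions}, expected anna-{required}.jar"
```

Calling `check_anna("lib")` (with `lib` replaced by wherever the jars live) returns `"ok"` only when anna-3.3.jar is the single anna jar in that directory. The comparison is on the version string, so 3.61 and 3.3 are treated as different releases rather than compared numerically.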