emorynlp / nlp4j-old

NLP tools developed by Emory University.

ArrayIndexOutOfBoundsException in NER Training #18

Closed: ShafeenaBasheer closed this issue 8 years ago

ShafeenaBasheer commented 8 years ago

Hi, we are facing a problem with NLP training in "ner" mode. The command we used is the following:

    $ java -Xmx1g -XX:+UseConcMarkSweepGC edu.emory.mathcs.nlp.bin.NLPTrain -mode ner -c config-train-sample.xml -t train.tsv -d sample-dev.tsv -m sample-dep.xz

The train.tsv that we used is the following:

    1 Peruvanthanam peruvanthanam NNP pos2=NN 0 root _ U-GPE

The exception we are getting is:

    java.lang.ArrayIndexOutOfBoundsException: 0
        at edu.emory.mathcs.nlp.learning.optimization.OnlineOptimizer.getPredictedLabelHingeLoss(OnlineOptimizer.java:239)
        at edu.emory.mathcs.nlp.learning.optimization.method.AdaGradMiniBatch.getPredictedLabel(AdaGradMiniBatch.java:50)
        at edu.emory.mathcs.nlp.learning.optimization.OnlineOptimizer.train(OnlineOptimizer.java:176)
        at edu.emory.mathcs.nlp.learning.optimization.OnlineOptimizer.train(OnlineOptimizer.java:167)
        at edu.emory.mathcs.nlp.component.template.OnlineComponent.process(OnlineComponent.java:201)
        at edu.emory.mathcs.nlp.component.template.OnlineComponent.process(OnlineComponent.java:173)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.iterate(OnlineTrainer.java:299)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:229)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:200)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:187)
        at edu.emory.mathcs.nlp.bin.NLPTrain.train(NLPTrain.java:77)
        at edu.emory.mathcs.nlp.bin.NLPTrain.main(NLPTrain.java:117)

Please help. Thanks in advance.

jdchoi77 commented 8 years ago

You need to indicate the nament field in the configuration file. Please use the following tsv configuration in your configuration file:

    <tsv>
        <column index="1" field="form"/>
        <column index="2" field="lemma"/>
        <column index="3" field="pos"/>
        <column index="4" field="feats"/>
        <column index="5" field="dhead"/>
        <column index="6" field="deprel"/>
        <column index="8" field="nament"/>
    </tsv>
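Before rerunning NLPTrain with this mapping, it can help to confirm that every token row in the training and development files really has a value at column index 8 (counting from 0, as the sample row earlier in this thread suggests) and that the value looks like a BILOU tag. The following is a minimal Python sketch, not part of NLP4J. The function name `check_nament_column`, the tab delimiter, the blank-line sentence separator, and the tag pattern (`O`, or `B-`/`I-`/`L-`/`U-` followed by a type, as in `U-GPE`) are all assumptions.

```python
import re

# Assumption: a BILOU tag is either "O" or "<B|I|L|U>-<TYPE>", e.g. "U-GPE".
BILOU_TAG = re.compile(r"^(O|[BILU]-\S+)$")


def check_nament_column(lines, index=8, delimiter="\t"):
    """Return (line_number, message) pairs for rows whose column `index`
    is missing or does not look like a BILOU tag."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        if not row.strip():
            continue  # assumed: blank lines only separate sentences
        columns = row.split(delimiter)
        if len(columns) <= index:
            problems.append((line_number, f"only {len(columns)} columns"))
        elif not BILOU_TAG.match(columns[index]):
            problems.append((line_number, f"unexpected tag {columns[index]!r}"))
    return problems


# The first row is the one quoted in this thread (tab-separated here);
# the second is the same row truncated after the dependency label.
sample = [
    "1\tPeruvanthanam\tperuvanthanam\tNNP\tpos2=NN\t0\troot\t_\tU-GPE\n",
    "1\tPeruvanthanam\tperuvanthanam\tNNP\tpos2=NN\t0\troot\n",
]
print(check_nament_column(sample))  # [(2, 'only 7 columns')]
```

To check a real file, pass an open file handle instead of the list, for example `check_nament_column(open("train.tsv", encoding="utf-8"))`.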

ShafeenaBasheer commented 8 years ago

Thanks for your support. We will check and get back soon.

brew42 commented 8 years ago

Hi

I am also trying to use NLPTrain in ner mode. I have been using the attached file but still get the error below.

Command:

    ./bin/nlptrain -c config-train-ner.xml -mode ner -t sample-trn.tsv -d sample-dev.tsv -m sample-dep.xz

Error:

    java.lang.IllegalArgumentException: No enum constant edu.emory.mathcs.nlp.component.template.util.BILOU.2
        at java.lang.Enum.valueOf(Enum.java:238)

Any ideas?

It does generate an output; see the attached sample-dep.xz (sample-dep.xz.zip). Is there any way of previewing it to check the content?

Also, once I manage to generate the output, which file would it replace in my configuration file? Does it replace this one?

edu/emory/mathcs/nlp/lexica/en-named-entity-gazetteers-simplified.xz

Tom

config-train-ner.xml.zip
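The message `No enum constant ... BILOU.2` means that Java's `Enum.valueOf` was asked for a constant named `2`. One possible reading, a guess from the message alone and not from the attached files, is that the column configured as `nament` holds numbers such as head IDs rather than BILOU tags. A minimal Python sketch for checking this follows. The helper names `field_indices` and `first_letters` are hypothetical, the two token rows are made up for illustration, and the tab delimiter is an assumption.

```python
import xml.etree.ElementTree as ET


def field_indices(config_xml):
    """Map each field name to its column index, read from <column> elements."""
    root = ET.fromstring(config_xml)
    return {col.get("field"): int(col.get("index")) for col in root.iter("column")}


def first_letters(rows, index, delimiter="\t"):
    """Collect the first character of column `index` over non-blank rows."""
    letters = set()
    for row in rows:
        columns = row.rstrip("\n").split(delimiter)
        if row.strip() and len(columns) > index and columns[index]:
            letters.add(columns[index][0])
    return letters


# Hypothetical misconfiguration: nament points at index 5 (the head ID column).
config = """
<tsv>
    <column index="1" field="form"/>
    <column index="5" field="nament"/>
</tsv>
"""

# Made-up rows in the column layout used elsewhere in this thread.
rows = [
    "1\tNew\tnew\tNNP\t_\t2\tcompound\t_\tB-GPE\n",
    "2\tYork\tyork\tNNP\t_\t0\troot\t_\tL-GPE\n",
]

nament_index = field_indices(config)["nament"]
unexpected = first_letters(rows, nament_index) - set("BILOU")
print(sorted(unexpected))  # ['0', '2']
```

An empty result means every value in the configured column starts with one of B, I, L, O, or U. Anything else points to a column index that does not match the data.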

saravanakumar1 commented 7 years ago

Hi, I am also trying to train an NER model, but I am getting a different error:

    java.lang.NullPointerException
        at edu.emory.mathcs.nlp.component.template.util.BILOU.collectEntityList(BILOU.java:86)
        at edu.emory.mathcs.nlp.component.template.util.BILOU.collectEntityMap(BILOU.java:61)
        at edu.emory.mathcs.nlp.component.ner.NERState.evaluate(NERState.java:60)
        at edu.emory.mathcs.nlp.component.template.OnlineComponent.process(OnlineComponent.java:199)
        at edu.emory.mathcs.nlp.component.template.OnlineComponent.process(OnlineComponent.java:161)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.iterate(OnlineTrainer.java:214)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.iterate(OnlineTrainer.java:195)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.evaluate(OnlineTrainer.java:187)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:162)
        at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:123)
        at edu.emory.mathcs.nlp.bin.NLPTrain.<init>(NLPTrain.java:59)
        at edu.emory.mathcs.nlp.bin.NLPTrain.main(NLPTrain.java:64)

Can someone help me with this?