brew42 closed this issue 7 years ago
So sorry for the late response. Could you send me a link to the configuration and the input dataset? I think there is a configuration issue. Thanks.
best,
Jinho
Thanks Jinho
Configuration, training & dev files zipped & attached.
Tom
Hi Jinho
Were you able to review my configuration?
Thanks Tom
I haven't found time to do it yet (sorry). I'll have some time this Wed, so I'll let you know then. Thanks for being patient.
best,
Jinho
Hey, so I ran into this same problem and from what I can tell, the problem for me was that the NER trainer expects every token in the training and development .tsv files to be labeled in BILOU notation. But if you look at sample-trn.tsv you can see that around line 63 it stops labeling tokens with the O to indicate they are outside an entity. I don't know that it was affecting the results of the NER, but adding the missing O's got rid of this error for me.
Note: I tested this on different files, but I'm guessing this might be the same problem.
Javier
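The check Javier describes above can be sketched in a few lines of Python. This is only a rough illustration, not part of nlp4j: the function names, the sample rows, and the column index ner_col are all hypothetical, and ner_col has to match whatever column your training configuration reads the NER tag from. It also assumes that an "unlabeled" token means an empty or absent column, which may not be how your files look.

```python
import re

# A BILOU tag is either "O" alone, or B/I/L/U followed by "-" and an entity type.
VALID_TAG = re.compile(r"O|[BILU]-\S+")


def find_bad_labels(lines, ner_col):
    """Return (line_number, value) pairs for token lines whose NER column
    is missing or is not a BILOU tag. ner_col is a 0-based column index
    (an assumption: set it to match your own configuration)."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            continue  # blank lines separate sentences
        fields = text.split("\t")
        value = fields[ner_col] if ner_col < len(fields) else ""
        if not VALID_TAG.fullmatch(value):
            bad.append((number, value))
    return bad


def fill_missing_outside(lines, ner_col):
    """Return the lines (without trailing newlines) with an empty or absent
    NER column set to "O". Values that are present but invalid are left
    untouched so they can be inspected by hand."""
    fixed = []
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip():
            fixed.append(text)
            continue
        fields = text.split("\t")
        while len(fields) <= ner_col:
            fields.append("")
        if fields[ner_col] == "":
            fields[ner_col] = "O"
        fixed.append("\t".join(fields))
    return fixed


# Hypothetical rows: id, form, NER tag (real files will have more columns).
sample = ["1\tJohn\tU-PERSON\n", "2\tsleeps\n", "\n", "1\tHi\t2\n"]
for number, value in find_bad_labels(sample, ner_col=2):
    print(f"line {number}: unexpected NER value {value!r}")
```

Running find_bad_labels on the training and development files before calling nlptrain should point at the offending lines. A value such as "2" in the report would suggest the trainer is reading a different column than the one holding the tags, which is a configuration problem rather than a missing "O".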
Thanks Javier, but it looks like we are looking into spaCy now.
Regards Tom
From: Javier Lores notifications@github.com Sent: 20 September 2016 01:11 To: emorynlp/nlp4j Cc: brew42; Author Subject: Re: [emorynlp/nlp4j] Nlp Training for NER (#11)
I guess the lack of development time at the moment is hurting :( Sorry for not being more prompt.
best,
Jinho
Hi, I am trying to reduce the size of the dependency model. For that I need the training data set. Could I get the training and development data sets that you used to create the dependency model? My email ID: shaileshtayde10@gmail.com
Hi
I am trying to use NLPTrain in ner mode. I have been using the attached files but get the error below.
Command: ./bin/nlptrain -c config-train-ner.xml -mode ner -t sample-trn.tsv -d sample-dev.tsv -m sample-dep.xz
Error: java.lang.IllegalArgumentException: No enum constant edu.emory.mathcs.nlp.component.template.util.BILOU.2 at java.lang.Enum.valueOf(Enum.java:238)
Any ideas?
It does generate an output. See sample-dep.xz attached. Is there any way of previewing this to check the content?
Also, once I manage to generate the output, which file would it replace in my configuration file? Does it replace this one?
edu/emory/mathcs/nlp/lexica/en-named-entity-gazetteers-simplified.xz
Tom
Attachments config-train-ner.xml.zip sample-dep.xz.zip