-
Currently, the Averaged Perceptron POS model added in #131 gives ~60% accuracy [on CoNLL 2003](https://github.com/Ayushk4/POS.jl/blob/master/valid/CoNLL.ipynb), which is decent for the 30+ classes i…
-
File format conversion: CoNLL to our annotations:
* basic CoNLL 2003 format: token / whitespace / IOB code
* the IOB code is something like I-PER, B-LOC, O
* use heuristics/rules for whitespace in…
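
A minimal sketch of the reading side of this conversion, assuming whitespace-separated columns with the token first and the IOB code last. `read_conll` and `to_spans` are hypothetical names, and since the target annotation format is not shown here, entities are returned as plain `(start, end, label)` token spans:

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]


def read_conll(text: str) -> List[Sentence]:
    """Split CoNLL-style text into sentences of (token, tag) pairs."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and -DOCSTART- markers separate sentences.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        # Assumption: first column is the token, last column is the IOB code.
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


def to_spans(sentence: Sentence) -> List[Tuple[int, int, str]]:
    """Group IOB codes into (start, end, label) token spans, end exclusive."""
    spans = []
    start, label = None, None
    for i, (_, tag) in enumerate(sentence):
        prefix, _, kind = tag.partition("-")
        # Close the open span on O, on an explicit B-, or on a type change.
        if start is not None and (tag == "O" or prefix == "B" or kind != label):
            spans.append((start, i, label))
            start, label = None, None
        if tag != "O" and start is None:
            start, label = i, kind
    if start is not None:
        spans.append((start, len(sentence), label))
    return spans
```

Token spans still have to be mapped to character offsets, which is where the whitespace heuristics come in.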
-
The data used for POS tagging and dependency parsing follows the CoNLL-X format. The following is an example:
1 No _ RB RB _ 7 discourse _ _
2 , _ , , _ 7 punct _ _
3 it _ PR PRP _…
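
As a rough illustration of how such a row can be read, here is a sketch that maps the ten CoNLL-X columns to names. The column names follow the CoNLL-X shared task description; `parse_conllx_line` is a hypothetical helper, and splitting on any whitespace (rather than tabs only) is an assumption based on the example above:

```python
CONLLX_FIELDS = (
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
)


def parse_conllx_line(line: str) -> dict:
    """Parse one CoNLL-X token row into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(CONLLX_FIELDS):
        raise ValueError(
            f"expected {len(CONLLX_FIELDS)} columns, got {len(values)}: {line!r}"
        )
    row = dict(zip(CONLLX_FIELDS, values))
    # ID and HEAD are token indices; "_" marks an unspecified value elsewhere.
    row["id"] = int(row["id"])
    row["head"] = int(row["head"])
    return row


row = parse_conllx_line("1 No _ RB RB _ 7 discourse _ _")
print(row["form"], row["postag"], row["head"], row["deprel"])
```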
-
When verifying this notebook in #21, I got a division-by-zero error here https://github.com/CODAIT/Identifying-Incorrect-Labels-In-CoNLL-2003/pull/21#discussion_r516238485 and could not get the correct outpu…
-
Hi,
Which evaluation metric do you use here: entity level or token level? Do you use an exact-matching scheme or partial matching?
-
Hi,
Thank you for your excellent work.
As described in the paper, you adopted the BIOES tagging scheme in the experiments. However, it seems that the CoNLL 2003 NER dataset is annotated by BIO…
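
For reference, this is the kind of conversion I have in mind. It is only a sketch, assuming the input is already in BIO form where every entity starts with `B-`; `bio_to_bioes` is a hypothetical name, not a function from this repository:

```python
from typing import List


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Convert one sentence of BIO tags to BIOES tags."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, kind = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- of the same type.
        continues = nxt == f"I-{kind}"
        if prefix == "B":
            out.append(("B-" if continues else "S-") + kind)
        else:  # prefix == "I"
            out.append(("I-" if continues else "E-") + kind)
    return out


print(bio_to_bioes(["B-PER", "I-PER", "O", "B-LOC"]))
```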
-
There appear to be some annotation errors. The annotation in the weiboNER_2nd_conll.* files is consistent with weiboNER.conll.*. It seems that there are tens of incorrectly annotated instances. Can you…
-
I tried running train.py on the CoNLL-2003 dataset, but ran into the following problem:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogen…
-
Hi,
I have inspected the data set here.
https://github.com/synalp/NER/blob/master/corpus/CoNLL-2003/eng.testa
I found that none of the entity tags use the "B-" prefix. Is something wrong?
Thanks
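
For context: the original CoNLL-2003 release uses the IOB1 scheme, where `B-` appears only when an entity directly follows another entity of the same type, so a file with few or no `B-` tags may simply be in that scheme. If so, a conversion to the more common IOB2 form could look like the sketch below (`iob1_to_iob2` is a hypothetical name, and I have not verified that this particular file is IOB1):

```python
from typing import List


def iob1_to_iob2(tags: List[str]) -> List[str]:
    """Convert one sentence of IOB1 tags to IOB2, where every entity starts with B-."""
    out = []
    prev = "O"
    for tag in tags:
        # An I- tag opens a new entity if it follows O or a different type.
        if tag.startswith("I-") and (prev == "O" or prev[2:] != tag[2:]):
            out.append("B-" + tag[2:])
        else:
            out.append(tag)
        prev = tag
    return out


print(iob1_to_iob2(["I-ORG", "O", "I-PER", "I-PER"]))
```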
-
Does a newly added data set have to be in either CoNLL-2003 or BRAT format? Will a simple Amazon review data set file work? If not, kindly share a method for conversion into…