Closed LiujunWang closed 4 years ago
Thank you for having an interest!
There is nothing wrong.
These studies identify p21ras as a target of the same cells .
This is a sentence included in the GENIA dataset, and the span "p21ras" belongs to two categories, "protein" and "DNA". This comes from the original annotation.
In the original corpus (GENIAcorpus3.02.merged.xml), the above span "p21ras" is labeled as follows:
<cons lex="p21ras" sem="G#DNA_domain_or_region"><cons lex="p21ras" sem="G#protein_molecule"><w c="NN">p21ras</w></cons></cons>
This means that "p21ras" belongs to both categories, at least in this context.
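If it helps, here is a minimal sketch of how such a doubly labeled span can be detected when reading the annotation. It is not the parsing code used in this repository; the function name `labels_by_span` is made up for illustration, and keying spans by their surface text (rather than token offsets) is a simplification that only works for a single fragment like the one above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The annotation fragment quoted above from GENIAcorpus3.02.merged.xml.
FRAGMENT = (
    '<cons lex="p21ras" sem="G#DNA_domain_or_region">'
    '<cons lex="p21ras" sem="G#protein_molecule">'
    '<w c="NN">p21ras</w></cons></cons>'
)


def labels_by_span(xml_fragment):
    """Map each annotated span text to the set of its `sem` labels.

    Hypothetical helper: spans are keyed by surface text, which is an
    assumption made for brevity. Real code would key by token offsets.
    """
    root = ET.fromstring(xml_fragment)
    labels = defaultdict(set)
    # Element.iter() yields the root itself as well as nested <cons> elements.
    for cons in root.iter("cons"):
        text = "".join(cons.itertext()).strip()
        sem = cons.get("sem")
        if sem:
            labels[text].add(sem)
    return dict(labels)


if __name__ == "__main__":
    for span, sems in labels_by_span(FRAGMENT).items():
        print(span, sorted(sems))
```

Because the two `<cons>` elements wrap exactly the same token, both labels end up attached to the same span, which is why the parsed data shows one span with two categories.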
Thanks for your reply. I have read a lot of papers about nested NER, and almost all of them assume that a span belongs to exactly one category, which seems natural. But now it seems that I overlooked something. If I may ask, did you consider this problem (a span may belong to two or more categories) in this paper?
Yes, our paper considers this problem. To my understanding, the following papers take it into account, too.
Some of the other papers our paper refers to might also handle spans with two or more categories, but I cannot say exactly which ones do.
Thank you very much. I understand the nested NER task better now.
You're welcome!
When parsing the GENIA dataset used in the code, some spans belong to two or more categories in the same sentence. Is there something wrong?