AlexPoint / OpenNlp

Open source NLP tools (sentence splitter, tokenizer, chunker, coref, NER, parse trees, etc.) in C#
MIT License

Bug in WordNetDictionary.cs Line 78 #37

Open plqplq opened 1 year ago

plqplq commented 1 year ago

The "N" on line 78 should be a "V":

            string partOfSpeech;
            if (tag.StartsWith("N") || tag.StartsWith("n"))
            {
                partOfSpeech = "noun";
            }
            else if (tag.StartsWith("N") || tag.StartsWith("v"))   // <-- line 78: "N" should be "V"
            {
                partOfSpeech = "verb";
            }
            else if (tag.StartsWith("J") || tag.StartsWith("a"))
            {
                partOfSpeech = "adjective";
            }
            else if (tag.StartsWith("R") || tag.StartsWith("r"))
            {
                partOfSpeech = "adverb";
            }
            else
            {
                partOfSpeech = "noun";
            }

AlexPoint commented 1 year ago

Thanks for the feedback, but could you elaborate on which file it is and why you think so? If possible, you can also submit a pull request to fix this bug.

plqplq commented 1 year ago

No problem. If you look at the first if, it already tests for "N", so the "N" in the second condition can never match there. A tag starting with "V" (or lowercase "v") is meant to mark a word as a verb, just as "N" marks a noun. So it's just a typo that we have an "N" in there. Because "V" is never checked, tags starting with an uppercase "V" fall through to the final else and are treated as nouns, which means verbs don't get processed.
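
For reference, here is a minimal sketch of the mapping with that one character changed. The class and method names (PosTagMapper, ToWordNetPos) are made up for illustration and are not part of OpenNlp; only the branch conditions mirror the snippet quoted above, and the default to "noun" is carried over unchanged.

```csharp
using System;

// Hypothetical helper for illustration only; not the library's actual API.
public static class PosTagMapper
{
    // Maps a POS tag (e.g. "NN", "VBD", "JJ", "RB") or a lowercase
    // WordNet-style tag ("n", "v", "a", "r") to a part-of-speech name.
    public static string ToWordNetPos(string tag)
    {
        if (tag.StartsWith("N", StringComparison.Ordinal) || tag.StartsWith("n", StringComparison.Ordinal))
        {
            return "noun";
        }
        // The fix: test for "V" here instead of repeating "N".
        if (tag.StartsWith("V", StringComparison.Ordinal) || tag.StartsWith("v", StringComparison.Ordinal))
        {
            return "verb";
        }
        if (tag.StartsWith("J", StringComparison.Ordinal) || tag.StartsWith("a", StringComparison.Ordinal))
        {
            return "adjective";
        }
        if (tag.StartsWith("R", StringComparison.Ordinal) || tag.StartsWith("r", StringComparison.Ordinal))
        {
            return "adverb";
        }
        // Same fallback as the original snippet: unknown tags count as nouns.
        return "noun";
    }
}
```

With this change a tag such as "VBD" maps to "verb" instead of falling through to the default "noun".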

As per the title, the file is .\OpenNlp-master\OpenNlp-master\OpenNLP\Tools\Coreference\Mention\WordNetDictionary.cs and the error is on line 78.

I did find other problems with the library. I had to hack it a fair bit to get synonyms for verbs, and the above was one part of that. In the end I gave up and moved to Catalyst.

I won't do a pull request, as I've removed the project and am now using a different C# NLP library.

Thanks, Paul