NaturalNode / natural

general natural language facilities for node
MIT License

Named entity recognition #107

Open robdefeo opened 11 years ago

robdefeo commented 11 years ago

Do you have any plans for named entity recognition? I have seen that it would require a sequential classifier. It would be useful to be able to train it with your own data set (a JSON document) of POS tags and other key attributes.

sabatinim commented 10 years ago

This is very interesting! Is there a plan?

kkoch986 commented 10 years ago

We don't currently have a plan. If anyone wants to tackle this, it would be great though!

mbc1990 commented 9 years ago

Does anyone have more thoughts on which algorithm might be best to implement for this?

liwenzhu commented 9 years ago

@mbc1990 I think CRF is the best model for NER. The pipeline is tokenize -> POS tag -> NER. The challenge is that you need to find NER training data, which is hard work.
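To make the tokenize -> POS tag -> NER pipeline concrete, here is a minimal sketch in plain JavaScript. It does not use natural's API and it is not a CRF: the tiny lexicon tagger and the "merge consecutive proper nouns" rule are stand-ins chosen only to show how the three stages feed each other. All function names are hypothetical.

```javascript
// Toy pipeline: tokenize -> POS tag -> NER.
// Every name below is hypothetical; none of this is natural's API.

// Step 1: split text into word tokens and single punctuation tokens.
function tokenize(text) {
  return text.match(/[A-Za-z0-9]+|[^\sA-Za-z0-9]/g) || [];
}

// Step 2: a stand-in POS tagger. A tiny lexicon covers a few common words;
// unknown capitalized words are guessed to be proper nouns (NNP).
const POS_LEXICON = new Map([
  ['the', 'DT'],
  ['in', 'IN'],
  ['at', 'IN'],
  ['works', 'VBZ'],
  ['.', '.'],
]);

function posTag(tokens) {
  return tokens.map((token) => {
    const known = POS_LEXICON.get(token.toLowerCase());
    if (known) return { token, tag: known };
    if (/^[A-Z]/.test(token)) return { token, tag: 'NNP' };
    return { token, tag: 'NN' };
  });
}

// Step 3: a stand-in NER step that merges runs of consecutive NNP tokens
// into one candidate entity. A trained sequence model (for example a CRF)
// would replace this rule.
function recognizeEntities(tagged) {
  const entities = [];
  let current = [];
  for (const { token, tag } of tagged) {
    if (tag === 'NNP') {
      current.push(token);
    } else if (current.length > 0) {
      entities.push(current.join(' '));
      current = [];
    }
  }
  if (current.length > 0) entities.push(current.join(' '));
  return entities;
}

function extractEntities(text) {
  return recognizeEntities(posTag(tokenize(text)));
}

console.log(extractEntities('Ada Lovelace works at Acme in London.'));
```

The rule-based last stage only finds spans; assigning categories such as person or location is where the labelled training data mentioned above becomes necessary.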

hbakhtiyor commented 8 years ago

Any news on this feature?

gagan-bansal commented 7 years ago

+1

gagan-bansal commented 7 years ago

A detailed approach to NER extraction is given in the NLTK documentation.

diegodorgam commented 6 years ago

Hi there everyone, I was just studying this subject and found some really interesting stuff about NER that I want to share:

There are several ways of implementing this feature. CharWNN seems to be the one with the best results, but not by far. The others seem to need a specific training corpus. To me it looks pretty similar to the PoS tagger. I'm still not able to reproduce the algorithm detailed in those papers, and I haven't found anything in JavaScript, only a few examples in Python. I hope this helps inspire any talented developer here =)

Hugo-ter-Doest commented 6 years ago

I'm working on named entity recognition for natural. I'm working on three ways of recognition:

It will be possible to combine these approaches into a hybrid approach. The methods return a list of edges of the form (recognised string, start index, end index, category, score). The score only makes sense for the trained model. I'm thinking of using a maximum entropy model. Is that a viable route? Any ideas on useful feature functions?

Hugo
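As an illustration of the edge format and of the kind of feature functions a maximum entropy model could use, here is a small sketch in plain JavaScript. It is not code from natural or from any branch of it. The field names, the label set (`PERSON`, `ORG`, `O`), the use of token indices with an exclusive end, and the three features are all assumptions made for the example.

```javascript
// Hypothetical edge shape mirroring the tuple described above:
// (recognised string, start index, end index, category, score).
// Assumption: indices are token positions and `end` is exclusive.
function makeEdge(text, start, end, category, score = 1.0) {
  return { text, start, end, category, score };
}

// Candidate binary feature functions f(tokens, i, label) -> 0 or 1.
// In a maximum entropy model each feature pairs an observation about the
// context with a candidate label; training assigns a weight to each one.
const featureFunctions = [
  // The token is capitalized and is labelled as some entity.
  (tokens, i, label) =>
    /^[A-Z]/.test(tokens[i]) && label !== 'O' ? 1 : 0,
  // The previous token is a personal title and the label is PERSON.
  (tokens, i, label) =>
    i > 0 && /^(Mr|Mrs|Ms|Dr)$/.test(tokens[i - 1]) && label === 'PERSON' ? 1 : 0,
  // The next token is a company suffix and the label is ORG.
  (tokens, i, label) =>
    i + 1 < tokens.length && /^(Inc|Ltd|Corp)$/.test(tokens[i + 1]) && label === 'ORG' ? 1 : 0,
];

// Evaluate every feature for one (position, label) pair.
function featureVector(tokens, i, label) {
  return featureFunctions.map((f) => f(tokens, i, label));
}

const tokens = ['Dr', 'Smith', 'joined', 'Acme', 'Inc'];
console.log(featureVector(tokens, 1, 'PERSON'));
console.log(featureVector(tokens, 3, 'ORG'));
console.log(makeEdge('Acme Inc', 3, 5, 'ORG', 0.9));
```

Other commonly used observations include word suffixes, the POS tag of the current and neighbouring tokens, and whether the token appears in a gazetteer.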

diegodorgam commented 6 years ago

How is this going, @Hugo-ter-Doest? Have you made any progress on this?

Hugo-ter-Doest commented 6 years ago

Yes, I did some work on this: https://github.com/Hugo-ter-Doest/natural/tree/NER

I am trying to build a hybrid approach: first find the entities that are easy to define and match with regular expressions and lexicons, then apply a statistical model for more advanced detection.

Hugo
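The hybrid idea described above can be sketched roughly as follows. This is an illustration in plain JavaScript, not the code in the linked branch: the pattern list, the lexicon entries, the character-offset convention, and the `statisticalModel` hook are all hypothetical.

```javascript
// Stage 1a: entities that are easy to describe with regular expressions.
// The patterns are deliberately simple examples, not production-grade.
const PATTERNS = [
  { category: 'EMAIL', regex: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g },
  { category: 'DATE', regex: /\b\d{4}-\d{2}-\d{2}\b/g },
];

// Stage 1b: entities that can simply be listed in a lexicon.
const ENTITY_LEXICON = [
  { category: 'CITY', entries: ['Amsterdam', 'New York'] },
];

// Assumption: start/end are character offsets and `end` is exclusive.
function matchPatterns(text) {
  const edges = [];
  for (const { category, regex } of PATTERNS) {
    for (const m of text.matchAll(regex)) {
      edges.push({ text: m[0], start: m.index, end: m.index + m[0].length, category, score: 1 });
    }
  }
  return edges;
}

function matchLexicon(text) {
  const edges = [];
  for (const { category, entries } of ENTITY_LEXICON) {
    for (const entry of entries) {
      let from = 0;
      let index;
      // Find every occurrence of the entry, not just the first one.
      while ((index = text.indexOf(entry, from)) !== -1) {
        edges.push({ text: entry, start: index, end: index + entry.length, category, score: 1 });
        from = index + entry.length;
      }
    }
  }
  return edges;
}

// Stage 2: a statistical model would contribute further edges with real
// scores. Here it is a pluggable function that defaults to finding nothing.
function recognize(text, statisticalModel = () => []) {
  const edges = [...matchPatterns(text), ...matchLexicon(text), ...statisticalModel(text)];
  return edges.sort((a, b) => a.start - b.start);
}

console.log(recognize('Mail hugo@example.com before 2018-05-01 in Amsterdam.'));
```

Keeping the rule-based matchers and the model behind the same edge format means overlapping or conflicting edges can be resolved in one later step, for instance by preferring the longest or highest-scoring edge.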

jseijas commented 6 years ago

Hi! Perhaps I can help with that. I built an NER, but only for "enumerateds", using similarity search, and my next step was to add regular expression entities (I see that you already have them! Great job!).

GeorgeNance commented 4 years ago

Are there any plans to incorporate this?

dorgan commented 4 years ago

Any update on this?