glample / tagger

Named Entity Recognition Tool
Apache License 2.0

Support for using additional gazetteers as features #8

Closed (metpallyv closed this issue 6 years ago)

metpallyv commented 8 years ago

Hi,

Nice work on the implementation. I have a question. I am trying to train my LSTM-CRF model with external word2vec embeddings + char bi-LSTM features + word-LSTM features + a few gazetteer features, but adding the gazetteer features does not change the accuracy on the test set. So I wanted to know whether the code currently supports additional gazetteer features, or only word2vec embeddings + char LSTM + word LSTM as features?
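For readers new to the term, here is a minimal sketch of what per-token gazetteer features can look like. It is not code from this repository: the function names `build_gazetteer` and `gazetteer_features`, the `max_len` parameter, and the choice of one binary flag per entity type are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_gazetteer(entries: Dict[str, List[str]]) -> Dict[str, Set[Tuple[str, ...]]]:
    """Map each entity type to a set of lowercased token tuples (hypothetical helper)."""
    return {
        label: {tuple(name.lower().split()) for name in names}
        for label, names in entries.items()
    }


def gazetteer_features(
    tokens: List[str],
    gazetteer: Dict[str, Set[Tuple[str, ...]]],
    max_len: int = 4,
) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted entity types and one binary vector per token.

    features[i][j] is 1 if token i lies inside any span of up to
    max_len tokens that matches an entry of entity type labels[j].
    """
    lowered = [t.lower() for t in tokens]
    labels = sorted(gazetteer)
    features = [[0] * len(labels) for _ in tokens]
    for col, label in enumerate(labels):
        phrases = gazetteer[label]
        for start in range(len(tokens)):
            for length in range(1, max_len + 1):
                end = start + length
                if end > len(tokens):
                    break
                if tuple(lowered[start:end]) in phrases:
                    # Mark every token covered by the matched span.
                    for i in range(start, end):
                        features[i][col] = 1
    return labels, features


# Toy gazetteer, invented for this example.
gaz = build_gazetteer({"LOC": ["New York", "Paris"], "PER": ["John Smith"]})
labels, feats = gazetteer_features("John Smith flew to New York".split(), gaz)
# labels == ["LOC", "PER"]
# feats  == [[0, 1], [0, 1], [0, 0], [0, 0], [1, 0], [1, 0]]
```

One possible way to use such vectors is to concatenate them with each word's representation before the word-level LSTM. The thread does not say how the original gazetteer experiments did this, so treat that as one option rather than the method used here.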

glample commented 8 years ago

Hi,

I once tried adding gazetteers, and it improved the score a little (not much in English, but I think it was significant for some of the 4 CoNLL languages). However, I tried to simplify the code here, so I didn't include this optional feature...

metpallyv commented 8 years ago

Thanks for the reply, @glample. For my task, however, I am planning to add the gazetteer features as well, as I believe they could improve my results. If you still have the gazetteer code, let me know if you could provide it (at least something to start from). Otherwise, I plan to make changes to the repo to add gazetteer features and send a PR, so that anybody else who needs them can use them.

glample commented 8 years ago

Unfortunately I won't have much time to work on this over the summer, but if you want to update the code and send a PR, that would be great, and I can help you with that. I can also send you the version of the code I used when I was working with gazetteers. It's kind of dirty, but the gazetteer-related part is pretty short and should be easy to add to the code of this repo.

metpallyv commented 8 years ago

@glample, if you could send me that dirty bit of code for gazetteers, that would be helpful. I can make changes to it and send a PR. That should not be a problem on my end at all.

Janiknoah commented 7 years ago

Dear @glample, I guess the update for the gazetteers didn't happen. Could you send me the code as well, as a starting point? Thanks.

metpallyv commented 7 years ago

Hey JuliMinou,

I had made the code changes, but I guess I never checked them in. Let me check in the code soon. Give me a day or two.

Thanks, Vardhaman


Janiknoah commented 7 years ago

Hi @metpallyv, nice, thanks. Since I am very new to the whole topic, could you add a brief description of what kind of file is needed for the gazetteers?