tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

[Question]: NER with the Transformer #1103

Open mabergerx opened 5 years ago

mabergerx commented 5 years ago

Description

After successfully using the Transformer for my own translation task, I was wondering whether this powerful model would also perform well on a NER (Named Entity Recognition) task. I was thinking of modelling it as a seq2seq problem, with sequence pairs like:

input:  This car is a Volvo
output: O    O   O  O ORGANISATION
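For concreteness, here is a minimal sketch of how that framing could be expressed as a custom tensor2tensor text2text problem (the class name and the inline example pair are hypothetical, untested code):

```python
# Sketch of NER framed as a text2text problem: the source is the sentence,
# the target is the aligned tag sequence. Illustrative only.
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


@registry.register_problem
class NerTagging(text_problems.Text2TextProblem):
  """One output tag per input token."""

  @property
  def is_generate_per_split(self):
    return False

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    # Replace with a real corpus reader; this pair is illustrative only.
    examples = [("This car is a Volvo", "O O O O ORGANISATION")]
    for sentence, tags in examples:
      yield {"inputs": sentence, "targets": tags}
```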

Ideally, I also want to be able to recognise entities regardless of case. So, two questions: 1) Would this kind of approach intuitively work with the Transformer architecture? Is my thinking correct here? 2) Would it make sense to additionally feed a lowercased copy of my dataset into the model to account for lowercase input, or would this data duplication be harmful/useless?

Just wanting to hear some opinions!

lkluo commented 5 years ago

The Transformer can be used for most seq2seq problems, including named entity recognition. I think it would be better not to lowercase the sentences: capitalised words have a high chance of being named entities.

martinpopel commented 5 years ago

In NER, you need one output tag (e.g. in BIO encoding) for each input word. This usually means that your input is already tokenized. If this is the case, you have three options:

You can also try character-based models (it makes sense for NER).
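As a side note, the BIO encoding mentioned above marks entity boundaries with B-/I- prefixes; a small illustrative sketch (not from the thread):

```python
# Illustration of BIO encoding: one tag per pre-tokenized word, with B-
# marking the beginning of an entity span and I- its continuation.
def bio_tags(tokens, entities):
  """entities: list of (start, end_exclusive, label) spans over tokens."""
  tags = ["O"] * len(tokens)
  for start, end, label in entities:
    tags[start] = "B-" + label
    for i in range(start + 1, end):
      tags[i] = "I-" + label
  return tags


print(bio_tags(["Jim", "works", "at", "Volvo", "Cars"],
               [(0, 1, "PER"), (3, 5, "ORG")]))
# -> ['B-PER', 'O', 'O', 'B-ORG', 'I-ORG']
```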

Capitalization is one of the most important features for NER, so lowercasing everything is definitely a bad idea. Character-based models will surely learn the difference between lowercase and uppercase automatically (and, I think, so will subword-based models).
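To try a character-based model in tensor2tensor, one route (a sketch building on the hypothetical NerTagging problem above) is overriding the problem's vocabulary type:

```python
# Sketch: switch the hypothetical NerTagging problem to a character-level
# vocabulary, so the model sees raw casing directly.
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


@registry.register_problem
class NerTaggingCharacters(NerTagging):  # NerTagging from the sketch above

  @property
  def vocab_type(self):
    return text_problems.VocabType.CHARACTER
```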

yumath commented 5 years ago

However, when I use only the Transformer encoder with a softmax layer on top, the model tends to predict O for all labels.
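For reference, a setup like the one described might look as follows (a minimal Keras sketch under assumed hyperparameters, not the commenter's actual code):

```python
# Minimal sketch: a single Transformer encoder block with a per-token
# softmax for sequence labelling. Vocabulary size, tag count, and model
# dimensions are hypothetical; positional encodings omitted for brevity.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 10000  # hypothetical
NUM_TAGS = 9        # hypothetical: 4 entity types in BIO plus O
D_MODEL = 128
SEQ_LEN = 64

inputs = layers.Input(shape=(SEQ_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, D_MODEL)(inputs)
# Encoder block: self-attention + feed-forward, each with a residual.
attn = layers.MultiHeadAttention(num_heads=4, key_dim=D_MODEL // 4)(x, x)
x = layers.LayerNormalization()(x + attn)
ff = layers.Dense(4 * D_MODEL, activation="relu")(x)
ff = layers.Dense(D_MODEL)(ff)
x = layers.LayerNormalization()(x + ff)
# Per-token softmax over the tag set.
outputs = layers.Dense(NUM_TAGS, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

One common explanation for the all-O behaviour is class imbalance: O dominates the tag distribution, so an unweighted cross-entropy loss can be minimised reasonably well by predicting O everywhere. Class weights or a CRF output layer are common remedies.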

wenhrui commented 5 years ago

> However, when I use only the Transformer encoder with a softmax layer on top, the model tends to predict O for all labels.

Have you solved your problem? I have the same problem.

yumath commented 5 years ago

@wenhrui no, I haven't.

Saichethan commented 5 years ago

How and where can I use the Transformer for the NER task? (I have implemented it using CNN+Bi-LSTM+CRF.)

niranjan8129 commented 5 years ago

I am having the same issue. @Saichethan, can you share the GitHub repo if you solved it?

qq547276542 commented 4 years ago

TENER: Adapting Transformer Encoder for Named Entity Recognition (https://arxiv.org/pdf/1911.04474.pdf)