eg-nlp-community / nlp-reading-group


[29/03/2020] 4pm GMT+2 - Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition #5

Closed Omarito2412 closed 4 years ago

Omarito2412 commented 4 years ago

Join us for our discussion of "Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition". URL: https://www.aclweb.org/anthology/N19-1117/

On Sunday, 29th of March at 4 PM (GMT+2). Hangout: https://hangouts.google.com/group/kUxBAunjGittAkBUA

Omarito2412 commented 4 years ago

Incorporating world knowledge into models is a neat trick in my opinion. Maybe the edge here is not the unsupervised NER aspect but the language model leveraging data from KBs, which might help it understand rare words/concepts that are under-represented in the training data.

hadyelsahar commented 4 years ago

A NAACL 2019 accepted paper on incorporating named-entity types into the training of LSTM language models: the model first predicts a type given the context, then the target word given the predicted type and the context.

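A minimal sketch of that factorization, assuming a PyTorch LSTM LM (my own construction, not the authors' code): a type head gives P(type | context), per-type output heads give P(word | type, context), and the next-word distribution marginalizes over the latent type.

```python
import torch
import torch.nn as nn

class TypedLM(nn.Module):
    """Type-factored LSTM LM sketch:
    P(w_t | context) = sum_k P(type=k | context) * P(w_t | type=k, context)."""
    def __init__(self, vocab_size, num_types, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.type_head = nn.Linear(hid_dim, num_types)            # P(type | context)
        # One output projection per type: P(word | type, context).
        self.word_heads = nn.ModuleList(
            [nn.Linear(hid_dim, vocab_size) for _ in range(num_types)]
        )

    def forward(self, tokens):                                    # tokens: (B, T)
        h, _ = self.lstm(self.embed(tokens))                      # (B, T, H)
        type_logp = torch.log_softmax(self.type_head(h), dim=-1)  # (B, T, K)
        word_logp = torch.stack(
            [torch.log_softmax(head(h), dim=-1) for head in self.word_heads],
            dim=2,
        )                                                         # (B, T, K, V)
        # Marginalize over the latent type to get log P(next word | context).
        return torch.logsumexp(type_logp.unsqueeze(-1) + word_logp, dim=2)
```

Training just minimizes the next-token NLL under this marginal, so the type stays latent and no type-annotated corpus is required.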

Dictionaries of named-entity types from KBs can be incorporated into this model; in the paper this is called the "type prior", and one way of incorporating it is by fixing P(type|token) during decoding.

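One way to picture that type prior, as a sketch under my own assumptions (the gazetteer format and function name are hypothetical, not from the paper): a KB-derived dictionary maps tokens to their allowed types, and during decoding the model's P(type | context) is overridden for tokens found in the dictionary.

```python
import torch

def apply_type_prior(type_logp, tokens, gazetteer, num_types, smooth=1e-6):
    """Fix P(type | token) from a KB-derived dictionary during decoding.

    type_logp: (T, K) log P(type | context) from the LM
    tokens:    list of T surface strings
    gazetteer: dict token -> set of allowed type ids (built offline from
               entity-type dictionaries; hypothetical format)
    """
    out = type_logp.clone()
    for i, tok in enumerate(tokens):
        allowed = gazetteer.get(tok.lower())
        if allowed:
            # Put almost all type mass on the dictionary-listed types.
            prior = torch.full((num_types,), smooth)
            prior[list(allowed)] = 1.0
            out[i] = torch.log(prior / prior.sum())
    return out
```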

I find this approach useful for unsupervised training of NER taggers in domains where the only resource you have is a dictionary of named entities and there is no good NER tagger available.
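Concretely, a trained model of this shape could be read off as a tagger with no labeled NER data: assign each token the type maximizing the posterior P(type | w_t, context) ∝ P(type | context) · P(w_t | type, context). A sketch against the hypothetical TypedLM above (the strict LM context/word offset is glossed over for brevity):

```python
import torch

@torch.no_grad()
def tag(model, token_ids):
    """Unsupervised NER with the TypedLM sketch: argmax over the type
    posterior P(type | w_t, context). token_ids: (1, T) LongTensor."""
    h, _ = model.lstm(model.embed(token_ids))                      # (1, T, H)
    type_logp = torch.log_softmax(model.type_head(h), dim=-1)      # (1, T, K)
    word_logp = torch.stack(
        [torch.log_softmax(head(h), dim=-1) for head in model.word_heads],
        dim=2,
    )                                                              # (1, T, K, V)
    # Log-likelihood of the observed token under each type.
    obs = word_logp.gather(
        -1,
        token_ids.view(1, -1, 1, 1).expand(-1, -1, word_logp.size(2), -1),
    ).squeeze(-1)                                                  # (1, T, K)
    return (type_logp + obs).argmax(dim=-1)                        # type id per token
```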

Drawback: they should have compared with a multi-task model that is asked to predict both P(type|latent) and P(token|latent), where the training signal for P(type|latent) could come from a pseudo dataset created with the dictionary.
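A rough sketch of the baseline I have in mind (my own construction, not from the paper): a shared LSTM with two independent heads, where the type head is trained on pseudo-labels produced by matching tokens against the dictionary.

```python
import torch.nn as nn

class MultiTaskLM(nn.Module):
    """Multi-task baseline sketch: shared latent state, two heads for
    P(token | latent) and P(type | latent)."""
    def __init__(self, vocab_size, num_types, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.word_head = nn.Linear(hid_dim, vocab_size)
        self.type_head = nn.Linear(hid_dim, num_types)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.word_head(h), self.type_head(h)

# loss = CE(word_logits, next_tokens) + lambda * CE(type_logits, pseudo_types),
# where pseudo_types come from dictionary lookup over the training corpus.
```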

Drawback 2: the use of the term "knowledge bases" is an overstatement in this paper, but I assume this is a common thing in the NLP community.

We had a good discussion of this paper, including its potential for language modeling in the medical domain.