allanj / pytorch_neural_crf

Pytorch implementation of LSTM/BERT-CRF for named entity recognition

Learning on Heterogeneous Tag Sets using Tag Hierarchy #38

NiteshMethani closed this issue 2 years ago

NiteshMethani commented 2 years ago

Hi, could you suggest how to extend this repository to do NER on disjoint or heterogeneous tag sets, as described in this paper: https://aclanthology.org/P19-1014/

The basic idea is to create a tag hierarchy and train the NER architecture with the CRF replaced by a marginal CRF (https://aclanthology.org/D18-1306/). Any ideas about how to implement this would be highly appreciated.

Thanks!

allanj commented 2 years ago

I think these are pretty straightforward to implement.

  1. You can create a separate linear layer for each tag set.

  2. The marginal CRF is not complicated either; you just need to use the forward unlabeled function: https://github.com/allanj/pytorch_neural_crf/blob/a27c9c08c290ffeb1884428b9fc0f70c92c234f2/src/model/module/linear_crf_inferencer.py#L78

  3. You will probably want a mask that marks which tags are valid and which are invalid at each position.
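
To make points 2 and 3 concrete, here is a minimal, framework-free sketch of a marginal CRF loss with a validity mask. It is not the repository's code: the names `build_mask`, `masked_forward`, `marginal_crf_nll`, and the `COMPATIBLE` table are hypothetical, the tag set ignores BIO prefixes for brevity, and a real implementation would do the same computation with batched tensors. The idea is to run the forward algorithm twice, once over all tag paths and once over only the paths the mask allows, and take the difference.

```python
import math

NEG_INF = float("-inf")

# Hypothetical unified tag set, and a dataset that annotates PER only:
# a token labelled "O" there may really be "O" or "LOC" in the unified set.
UNIFIED_TAGS = ["O", "PER", "LOC"]
COMPATIBLE = {"O": {"O", "LOC"}, "PER": {"PER"}}


def build_mask(observed, compatible, unified_tags):
    """allowed[t][k] is True if unified tag k is consistent with observed[t]."""
    return [[tag in compatible[obs] for tag in unified_tags] for obs in observed]


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def masked_forward(emissions, transitions, allowed):
    """Log-sum of scores over all tag paths that use only allowed tags.

    emissions[t][k]: score of tag k at position t
    transitions[i][j]: score of moving from tag i to tag j
    allowed[t][k]: whether tag k is valid at position t
    """
    num_tags = len(transitions)
    alpha = [emissions[0][k] if allowed[0][k] else NEG_INF
             for k in range(num_tags)]
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][k] for i in range(num_tags)])
            + emissions[t][k]
            if allowed[t][k] else NEG_INF
            for k in range(num_tags)
        ]
    return logsumexp(alpha)


def marginal_crf_nll(emissions, transitions, allowed):
    """Negative log of the total probability of all allowed paths."""
    num_tags = len(transitions)
    everything = [[True] * num_tags for _ in emissions]
    log_z = masked_forward(emissions, transitions, everything)
    log_allowed = masked_forward(emissions, transitions, allowed)
    return log_z - log_allowed


# Example: two tokens annotated as PER, O in the PER-only dataset.
mask = build_mask(["PER", "O"], COMPATIBLE, UNIFIED_TAGS)
emissions = [[0.1, 2.0, 0.3], [1.5, 0.2, 1.0]]
transitions = [[0.0] * 3 for _ in range(3)]
loss = marginal_crf_nll(emissions, transitions, mask)
```

When the mask allows exactly one tag per position, this reduces to the usual CRF negative log-likelihood; when it allows every tag, the loss is zero, so positions that should carry no supervision can simply be left fully unmasked.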