huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Add CRF layer after Transformer model #5017

Closed yuyan-z closed 4 years ago

yuyan-z commented 4 years ago

I've read a paper titled "Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF". It takes the Transformer's output as the CRF's input, as shown in the figure. Which function could I use to implement it? model.add() doesn't work.

[screenshot: Transformer-CRF architecture figure from the paper]

JetRunner commented 4 years ago

You can feed a transformer's hidden states directly into a CRF implementation such as https://github.com/s14t284/TorchCRF

Hope it helps!
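
To make that concrete, here is a minimal sketch of the wiring (not part of the transformers API; the class name is made up, and it assumes the pytorch-crf package, imported as torchcrf, which exposes an interface similar to the linked TorchCRF repo):

```python
# Rough sketch, not an official transformers API: BERT hidden states -> linear -> CRF.
# Assumes the pytorch-crf package (pip install pytorch-crf, imported as torchcrf);
# the linked TorchCRF repo exposes a similar interface.
import torch.nn as nn
from torchcrf import CRF
from transformers import AutoModel


class BertCrfForTokenClassification(nn.Module):  # hypothetical name
    def __init__(self, model_name, num_labels):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_labels)
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(self.dropout(hidden))  # (batch, seq_len, num_labels)
        mask = attention_mask.bool()
        if labels is not None:
            # The CRF expects valid tag ids everywhere (no -100), even at masked positions.
            # It returns a log-likelihood, so negate it to get a loss to minimize.
            return -self.crf(emissions, labels, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # Viterbi-decoded tag ids per sequence
```

At training time the negative log-likelihood replaces the usual per-token cross-entropy loss; at inference, crf.decode runs Viterbi decoding over the positions the mask leaves on.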

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

aixuedegege commented 4 years ago

I've read a paper titled "Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF". It takes the Transformer's output as the CRF's input, as shown in the figure. Which function could I use to implement it? model.add() doesn't work.

Hey, did you manage to implement it?

shushanxingzhe commented 2 years ago

This repository shows how to add a CRF layer on top of transformers to get better performance on token classification tasks: https://github.com/shushanxingzhe/transformers_ner

shaked571 commented 2 years ago

I don't think this implementation is good. First, it doesn't take into account that WP tokens get the padding label index (usually -100), which torchcrf does not expect. It also runs over all the tags, including the ones you won't use, such as the non-first WP of a token (token == space-separated string).
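
For what it's worth, a small sketch of the preprocessing that critique implies (the helper name is made up): replace the -100 ignore index with a valid tag id and build a mask so the CRF only scores the first word piece of each word:

```python
# Assumed helper (not from the linked repo): torchcrf-style CRFs can't digest the usual
# -100 ignore index, so swap it for a valid tag id and keep a mask that marks only the
# positions the CRF should score (first word piece of each word, no special tokens).
import torch


def prepare_crf_inputs(labels, attention_mask, pad_tag_id=0):
    """labels: (batch, seq_len) with -100 on special tokens and non-first word pieces."""
    crf_mask = (labels != -100) & attention_mask.bool()
    crf_labels = labels.clone()
    crf_labels[crf_labels == -100] = pad_tag_id  # any valid id; these positions are masked out
    # Caveat: some CRF implementations (e.g. pytorch-crf) require the first timestep of the
    # mask to be on, so you may need to drop special tokens or shift sequences accordingly.
    return crf_labels, crf_mask
```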

venti07 commented 2 years ago

Does anyone have a clean implementation of a BERTCRF? Preferably in a Jupyter notebook?

chansonzhang commented 1 month ago

You can feed a transformer's hidden states directly into a CRF implementation such as https://github.com/s14t284/TorchCRF

Hope it helps!

@JetRunner Is there a counterpart implementation in TensorFlow?
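
Not something covered elsewhere in this thread, but one TensorFlow option is the CRF ops in TensorFlow Addons (tfa.text.crf_log_likelihood / tfa.text.crf_decode). A rough, unofficial sketch under that assumption (class and variable names are illustrative only):

```python
# Unofficial sketch of a TensorFlow counterpart, assuming TensorFlow Addons' CRF ops
# (tfa.text.crf_log_likelihood / tfa.text.crf_decode); class and names are illustrative.
import tensorflow as tf
import tensorflow_addons as tfa
from transformers import TFAutoModel


class TFBertCrf(tf.keras.Model):
    def __init__(self, model_name, num_labels):
        super().__init__()
        self.encoder = TFAutoModel.from_pretrained(model_name)
        self.classifier = tf.keras.layers.Dense(num_labels)
        # Learned transition matrix between the num_labels tags.
        self.transitions = tf.Variable(tf.random.uniform((num_labels, num_labels)))

    def call(self, input_ids, attention_mask, labels=None):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(hidden)                # (batch, seq_len, num_labels)
        seq_lens = tf.reduce_sum(attention_mask, axis=-1)  # true lengths from the mask
        if labels is not None:
            ll, _ = tfa.text.crf_log_likelihood(
                emissions, labels, seq_lens, transition_params=self.transitions
            )
            return -tf.reduce_mean(ll)  # negative log-likelihood as the training loss
        decoded, _ = tfa.text.crf_decode(emissions, self.transitions, seq_lens)
        return decoded
```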

chansonzhang commented 1 month ago

I don't think this implementation is good. First, it doesn't take into account that WP tokens get the padding label index (usually -100), which torchcrf does not expect. It also runs over all the tags, including the ones you won't use, such as the non-first WP of a token (token == space-separated string).

@shaked571 Which implementation are you referring to? And what is WP?

chansonzhang commented 1 month ago

@yuyan-z Why is the Decoder necessary in the architecture illustrated in your figure?