Hi,
What is the significance of the reproject_words argument in DocumentRNNEmbeddings and is there some recommended usage?
What effect will this reprojection have in the case of contextual embeddings (e.g. BERT) versus fixed embeddings (e.g. word2vec)?
I haven't seen much information in the docs/code or other issues about this.
Thanks for your help.
Hello @bhavikm thanks for asking this question. If reproject_words is set to True, a fully connected layer is added after the words are embedded and before the representations are passed into the document RNN. That is, without reprojection the sequence is: embed words -> document RNN; with reprojection it is: embed words -> linear map -> document RNN. We do this for the reason illustrated in #690.
As for recommended usage, I would probably reproject by default, but try out both options to make sure. We are still experimenting a lot with different parameters - I'll share our experience once we have good recommendations. We'd also appreciate it if you and others share your experience on whether reprojection makes sense for your use cases!
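For concreteness, here is a minimal sketch of constructing the embeddings both ways in Flair; the hidden_size and reproject_words_dimension values below are arbitrary placeholders, not recommendations:

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, DocumentRNNEmbeddings

glove = WordEmbeddings('glove')

# with reprojection: embed words -> linear map -> document RNN
with_reprojection = DocumentRNNEmbeddings(
    [glove],
    hidden_size=256,
    reproject_words=True,
    reproject_words_dimension=256,  # output size of the linear map
)

# without reprojection: embed words -> document RNN
without_reprojection = DocumentRNNEmbeddings(
    [glove],
    hidden_size=256,
    reproject_words=False,
)

# either variant produces one document-level embedding per sentence
sentence = Sentence('the quick brown fox jumps over the lazy dog')
with_reprojection.embed(sentence)
print(sentence.embedding.shape)
```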
Thanks @alanakbik , I'll try it out and report back my results.
@bhavikm did you find a significant difference between the two options?
@alanakbik I found that using reprojection performs worse than simply fine-tuning BERT for text classification. I think this makes sense because the extra layer introduces a lot of newly initialized parameters that have to be learned from scratch. If we could include fine-tuning of BERT, it would probably improve the performance of some models further.
@emoryjianghang yes I agree, definitely something we should add!
@alanakbik Thank you guys for the great library. We look forward to the updates!
Fine-tuning transformers was added in #1492 and will be part of the next release.
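For anyone finding this thread later, a rough sketch of what the fine-tuning path looks like with the TransformerDocumentEmbeddings class; the model name is just an example, so check the release notes for the exact API:

```python
from flair.embeddings import TransformerDocumentEmbeddings

# with fine_tune=True, BERT's own weights are updated during training,
# so no reprojection layer or document RNN on top is needed
document_embeddings = TransformerDocumentEmbeddings(
    'bert-base-uncased',
    fine_tune=True,
)
```

This would typically be passed to a TextClassifier and trained with a small learning rate, as is usual for transformer fine-tuning.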