flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

Colab freezes with PooledFlairEmbeddings #1435

Closed: igormis closed this issue 4 years ago

igormis commented 4 years ago

Describe the bug: training in Colab freezes during the first epoch.

To Reproduce

```python
from typing import List

from flair.embeddings import (
    TokenEmbeddings,
    WordEmbeddings,
    BertEmbeddings,
    PooledFlairEmbeddings,
)

embedding_types: List[TokenEmbeddings] = [
    WordEmbeddings('glove'),
    BertEmbeddings('distilbert-base-uncased'),
    # pooled contextual string embeddings, forward and backward
    PooledFlairEmbeddings('news-forward', pooling='min'),
    PooledFlairEmbeddings('news-backward', pooling='min'),
    # alternatives left commented out while debugging:
    # CharacterEmbeddings(),
    # FlairEmbeddings('news-forward'),
    # FlairEmbeddings('news-backward'),
    # RoBERTaEmbeddings(pretrained_model_name_or_path='roberta-base',
    #                   layers='0,1,2,3,4,5,6,7,8,9,10,11,12',
    #                   pooling_operation='first', use_scalar_mix=True),
]
```
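For completeness, the embedding list is presumably wired into a tagger along the lines of the standard flair training setup; here is a sketch of that wiring, where `corpus`, `tag_type`, and the output path are placeholders for the reporter's actual data (`max_epochs=150` matches the 150 iterations mentioned below):

```python
from flair.embeddings import StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Combine the individual embeddings into one stacked embedding
embeddings = StackedEmbeddings(embeddings=embedding_types)

# `corpus` and `tag_type` stand in for the reporter's actual data
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=corpus.make_tag_dictionary(tag_type=tag_type),
    tag_type=tag_type,
)

trainer = ModelTrainer(tagger, corpus)
trainer.train('resources/taggers/example', max_epochs=150)
```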

Expected behavior: training is expected to complete all 150 iterations.

Screenshots

```
2020-02-18 19:05:50,886 epoch 1 - iter 33/335 - loss 50.98124013 - samples/sec: 38.50
2020-02-18 19:06:14,781 epoch 1 - iter 66/335 - loss 44.30798718 - samples/sec: 44.34
2020-02-18 19:06:41,271 epoch 1 - iter 99/335 - loss 40.18037058 - samples/sec: 40.00
2020-02-18 19:07:06,270 epoch 1 - iter 132/335 - loss 37.21505756 - samples/sec: 42.38
2020-02-18 19:07:30,189 epoch 1 - iter 165/335 - loss 34.58314475 - samples/sec: 44.28
2020-02-18 19:07:57,314 epoch 1 - iter 198/335 - loss 32.73812305 - samples/sec: 39.04
```


EngSalem commented 4 years ago

PooledFlairEmbeddings works by pooling aggregated embeddings that are stored in memory as batches are processed. Colab only provides about 12 GB of RAM, which won't be sufficient.
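To illustrate why this exhausts RAM, here is a toy sketch of what min-pooling conceptually does (a model of the mechanism, not flair's actual code): one aggregated vector is kept per distinct word ever seen, so the memory grows with the vocabulary and is never freed during training.

```python
from typing import Dict

import torch

# One aggregated vector per distinct word ever seen, updated with the
# pooling operation on each new occurrence; this dict only ever grows.
memory: Dict[str, torch.Tensor] = {}

def pool_min(word: str, contextual_embedding: torch.Tensor) -> torch.Tensor:
    """Element-wise min-pool the new contextual embedding into the memory."""
    if word in memory:
        memory[word] = torch.min(memory[word], contextual_embedding)
    else:
        memory[word] = contextual_embedding.clone()
    return memory[word]
```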

igormis commented 4 years ago

Thanks, is there any workaround?

alanakbik commented 4 years ago

I don't believe so; the pooled embeddings are very memory-hungry.
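Short of dropping the pooling entirely, that is: the plain FlairEmbeddings already commented out in the snippet above keep no per-word memory. A sketch of that lighter stack (trading away the pooled representations):

```python
from flair.embeddings import WordEmbeddings, BertEmbeddings, FlairEmbeddings

# Memory-lighter variant of the same stack: plain contextual string
# embeddings instead of the pooled ones, so no per-word memory accumulates.
embedding_types = [
    WordEmbeddings('glove'),
    BertEmbeddings('distilbert-base-uncased'),
    FlairEmbeddings('news-forward'),
    FlairEmbeddings('news-backward'),
]
```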

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.