mim-solutions / bert_for_longer_texts

BERT classification model for processing texts longer than 512 tokens. The text is first divided into smaller chunks; after each chunk is fed to BERT, the intermediate results are pooled. The implementation allows fine-tuning.
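The chunk-and-pool idea described above can be sketched in plain Python. This is a minimal illustration, not the repo's actual API: the function names, the 510-token chunk size (leaving room for `[CLS]`/`[SEP]`), and the stride value are assumptions for the example.

```python
def split_into_chunks(token_ids, chunk_size=510, stride=255):
    """Split a long token-id sequence into overlapping chunks.

    chunk_size=510 is a common choice: it leaves room for the [CLS]
    and [SEP] special tokens within BERT's 512-token limit.
    An overlapping stride keeps context across chunk boundaries.
    """
    chunks = []
    start = 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + chunk_size])
        if start + chunk_size >= len(token_ids):
            break  # last chunk already covers the tail of the sequence
        start += stride
    return chunks


def pool_predictions(chunk_probs, method="mean"):
    """Aggregate per-chunk probabilities into one document-level score."""
    if method == "mean":
        return sum(chunk_probs) / len(chunk_probs)
    if method == "max":
        return max(chunk_probs)
    raise ValueError(f"unknown pooling method: {method}")
```

In practice each chunk would be passed through BERT to obtain a probability, and `pool_predictions` would then combine the per-chunk outputs into a single document-level prediction.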

Any example colab scripts to fine tune BERT variations for text multi-class classification tasks? #19

Closed sjinjala23 closed 10 months ago

sjinjala23 commented 1 year ago

My data has documents with more than 512 tokens, and I need to train a BERT model (ALBERT base v2) for a multiclass text classification task. I can't seem to find any example Colab scripts in the repo. Kindly provide some links or articles.

mwachnicki commented 1 year ago

Currently, we only support binary classification. We may add multiclass classification in the future. You can also fork the repo and modify the code yourself.
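For anyone forking the repo as suggested, one way to adapt a binary chunk-pooling setup to multiclass is to swap the single-logit sigmoid head for an N-logit softmax head and pool the per-chunk class distributions. This is a hypothetical sketch, not code from this repo; the class name, `hidden_size`, and pooling choice are all assumptions.

```python
import torch
import torch.nn as nn


class MulticlassPooledHead(nn.Module):
    """Hypothetical multiclass head: N logits + softmax instead of one sigmoid,
    with mean-pooling over the per-chunk class distributions."""

    def __init__(self, hidden_size=768, num_classes=4):
        super().__init__()
        self.linear = nn.Linear(hidden_size, num_classes)

    def forward(self, chunk_embeddings):
        # chunk_embeddings: (num_chunks, hidden_size), e.g. BERT [CLS] vectors
        logits = self.linear(chunk_embeddings)        # (num_chunks, num_classes)
        probs = torch.softmax(logits, dim=-1)         # per-chunk class distributions
        return probs.mean(dim=0)                      # (num_classes,) document-level
```

Training would then use `nn.CrossEntropyLoss` on the pooled logits (or log-probabilities) instead of the binary `BCEWithLogitsLoss`.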