utterworks / fast-bert

Super easy library for BERT based NLP models
Apache License 2.0

Multilabel fine-tuning fixes #269

Open v-ko opened 3 years ago

v-ko commented 3 years ago

Sorry for the lazy "commit", but I'm short on time to figure out how to push a local commit to the repo as a pull request (I guess I need to fork, etc).

The multilabel BERT fine-tuning doesn't work at the moment. I tested with sample_notebooks/new-toxic-multilabel.ipynb. The fixes I applied are as follows:

In sample_notebooks/new-toxic-multilabel.ipynb, fixed DATA_PATH and LABEL_PATH:

DATA_PATH = Path('../sample_data/multi_label_toxic_comments/data')
LABEL_PATH = Path('../sample_data/multi_label_toxic_comments/label')
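Since these are relative paths, they only resolve when the notebook is run from inside sample_notebooks/. A quick sanity check before building the data bunch might look like the sketch below. `missing_dirs` is a hypothetical helper written for this comment, not part of fast-bert.

```python
from pathlib import Path

# Paths as set in the notebook, relative to sample_notebooks/
DATA_PATH = Path('../sample_data/multi_label_toxic_comments/data')
LABEL_PATH = Path('../sample_data/multi_label_toxic_comments/label')


def missing_dirs(paths):
    """Return the paths that are not existing directories (hypothetical helper)."""
    return [p for p in paths if not p.is_dir()]


# Report any path that does not resolve from the current working directory
for p in missing_dirs([DATA_PATH, LABEL_PATH]):
    print(f"Not found: {p.resolve()}")
```

If anything is printed, the notebook is probably being run from a different working directory than the paths assume.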

In fast_bert/learner_cls.py fixed lines 146 and 147 (pos_weight and weight were undefined, should be taken from the dataBunch object):

model_class[1].pos_weight = dataBunch.pos_weight
model_class[1].weight = dataBunch.weight
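For context on why these two values matter: assuming the multilabel model passes them to a binary cross-entropy-with-logits loss (PyTorch's BCEWithLogitsLoss accepts `weight` and `pos_weight` arguments with these meanings), they rescale each label's contribution to the loss. Below is a minimal pure-Python sketch of that formula for a single sample. The function name `weighted_bce_with_logits` is made up for illustration, and this is not the code in learner_cls.py.

```python
import math


def weighted_bce_with_logits(logits, targets, pos_weight=None, weight=None):
    """Mean BCE-with-logits over the labels of one sample (illustrative sketch).

    pos_weight[c] scales the positive term of label c,
    weight[c] scales the whole loss term of label c.
    """
    n = len(logits)
    pos_weight = pos_weight if pos_weight is not None else [1.0] * n
    weight = weight if weight is not None else [1.0] * n

    total = 0.0
    for x, y, p, w in zip(logits, targets, pos_weight, weight):
        # Numerically stable log(sigmoid(x))
        if x >= 0:
            log_sig = -math.log1p(math.exp(-x))
        else:
            log_sig = x - math.log1p(math.exp(x))
        # log(1 - sigmoid(x)) = log(sigmoid(x)) - x
        log_one_minus_sig = log_sig - x
        total += -w * (p * y * log_sig + (1 - y) * log_one_minus_sig)
    return total / n


# Two labels, first one positive; up-weighting positives of label 0 raises the loss
plain = weighted_bce_with_logits([0.0, 0.0], [1, 0])
boosted = weighted_bce_with_logits([0.0, 0.0], [1, 0], pos_weight=[3.0, 1.0])
print(plain, boosted)
```

With both left as None the loss is the plain unweighted one, so the attributes are only needed when the data bunch actually defines per-label weights.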

Optional: I had a problem loading Spicy, so I removed it from fast_bert/data_lm.py, where it is an unneeded dependency.