JonasGeiping / cramming

Cramming the training of a (BERT-type) language model into limited compute.
MIT License
1.3k stars, 100 forks

Fix automodels #18

Closed by Randl 1 year ago

Randl commented 1 year ago

This fixes the AutoModelForSequenceClassification constructor and also adds AutoModelForTokenClassification. Fixes minor bugs/typos too.

JonasGeiping commented 1 year ago

Hi, it's not clear what this fixes for AutoModelForSequenceClassification. Can you provide some more details there? Otherwise, thanks, this looks great!

Randl commented 1 year ago

The current version crashes because AutoModel populates num_labels in a different place than the constructor reads it from. Also, the init function is expected to take a parameter naming the specific module to initialize. This PR fixes both.
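To illustrate the two points, here is a minimal, dependency-free sketch of the pattern. It is not the actual code from this PR. `ToyConfig`, `ToyLinear`, `ToyClassifier`, `load_for_classification`, and the `arch` dictionary are hypothetical stand-ins. The sketch assumes the loader writes `num_labels` to the top level of the config rather than into the nested architecture settings, and that the init function is called once per submodule with that submodule as its argument.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a model config object."""
    # Custom, nested architecture settings (assumed layout, for illustration only).
    arch: dict = field(default_factory=dict)
    # Top-level attribute: assumed to be where an AutoModel-style loader writes it.
    num_labels: int = 2


class ToyLinear:
    """Hypothetical stand-in for a layer that owns a weight matrix."""

    def __init__(self, in_features, out_features):
        self.weight = [[None] * in_features for _ in range(out_features)]
        self.initialized = False


class ToyClassifier:
    def __init__(self, config):
        hidden = config.arch["hidden_size"]
        self.encoder = ToyLinear(hidden, hidden)
        # Read num_labels from the top level of the config. A lookup such as
        # config.arch["num_labels"] would raise KeyError here, because the
        # loader below never writes to that nested location.
        self.head = ToyLinear(hidden, config.num_labels)
        # The init function receives the specific module to initialize.
        for module in (self.encoder, self.head):
            self._init_weights(module)

    def _init_weights(self, module):
        """Initialize one module; called once per submodule."""
        if isinstance(module, ToyLinear):
            module.weight = [[0.0] * len(row) for row in module.weight]
            module.initialized = True


def load_for_classification(config, num_labels):
    """Hypothetical loader: overrides num_labels at the top level of the config."""
    config.num_labels = num_labels
    return ToyClassifier(config)


model = load_for_classification(ToyConfig(arch={"hidden_size": 4}), num_labels=3)
print(len(model.head.weight))  # number of output rows in the classification head
```

The head size follows whatever the loader sets, and a per-module `_init_weights(module)` can be reused to initialize only a newly added head without touching the rest of the model.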