JonasGeiping / cramming

Cramming the training of a (BERT-type) language model into limited compute.
MIT License

Finetuning for SQuAD task #35

Closed kisacats closed 11 months ago

kisacats commented 11 months ago

Hello,

First of all, thank you for the work you put into this project. I have pretrained a crammed BERT model on custom data and would like to know whether it can be used for a question-answering (QA) task. I tried registering it as a modified architecture of ScriptableLMForTokenClassification, but that did not work. Do you have any suggestions for fine-tuning it for QA, especially when using it as a HF model?

JonasGeiping commented 11 months ago

Declaring it as AutoModelForTokenClassification should work. What's the error?
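
For reference, a minimal sketch of what that declaration could look like. This is not an official recipe from the repo: it assumes a hypothetical local checkpoint directory (`./crammed-bert`) and that `import cramming` registers the crammed architectures with the HF Auto classes.

```python
# Minimal sketch, under the assumptions above.
import cramming  # noqa: F401  -- assumed side effect: registers the crammed architectures
from transformers import AutoModelForTokenClassification

# Hypothetical path to the pretrained crammed checkpoint (config + weights).
model = AutoModelForTokenClassification.from_pretrained("./crammed-bert")
```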

kisacats commented 11 months ago

> Declaring it as AutoModelForTokenClassification should work. What's the error?

When I declare it as AutoModelForTokenClassification, it throws `AttributeError: 'ScriptableLMForTokenClassification' object has no attribute 'num_labels'`. After setting `num_labels` to two manually, the error becomes `RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x768 and 1024x1)`.
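
For context, the `(512x768 and 1024x1)` mismatch suggests the classification head was built from default values (an input size of 1024 and a single label) rather than from the checkpoint's config, which has hidden size 768. One possible route, sketched below under the same assumptions as above (hypothetical checkpoint path, architectures registered by `import cramming`), is to set `num_labels` on the config before the model and its head are instantiated, instead of patching the model object afterwards:

```python
# Sketch only: pass num_labels through the config *before* the head is built,
# so the classifier comes out as (hidden_size x num_labels) rather than defaults.
import cramming  # noqa: F401  -- assumed side effect: registers the crammed architectures
from transformers import AutoConfig, AutoModelForTokenClassification

checkpoint = "./crammed-bert"  # hypothetical local checkpoint
config = AutoConfig.from_pretrained(checkpoint, num_labels=2)  # two labels for the downstream task
model = AutoModelForTokenClassification.from_pretrained(checkpoint, config=config)
```

If the head still comes out with the wrong input size after this, the head is likely not reading `hidden_size` from the loaded config, which would need to be fixed in the architecture code itself.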