bdzyubak / torch-control

A top-level repo for evaluating natively available models
MIT License

Evaluate fine-tuning only the sentiment head of sentiment analysis models #16

Closed: bdzyubak closed this 2 months ago

bdzyubak commented 3 months ago

LLMs mostly consist of pretrained language-interpretation weights. A small head of a couple of layers customizes the model for a given task: sentiment analysis, sentence completion, information extraction. If all weights are tuned, there is a substantial risk of overfitting on the small amount of training data and losing the model's general language understanding. Implement code to freeze all layers except the newly initialized detector head and fine-tune only the head.
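
For illustration, a minimal sketch of the idea with DistilBERT, assuming the stock `DistilBertForSequenceClassification` from `transformers` (whose head modules are named `pre_classifier` and `classifier`); the parameter count shows how small the trainable head is relative to the backbone:

```python
from transformers import AutoModelForSequenceClassification

# Minimal sketch: freeze the pretrained backbone, train only the head.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Freeze everything first...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze the head modules. DistilBertForSequenceClassification
# names them `pre_classifier` and `classifier`.
for module in (model.pre_classifier, model.classifier):
    for param in module.parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} / {total:,} parameters")  # head is <1% of the model
```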

Start with DistilBERT, where I have already fine-tuned all layers (and did see substantial overfitting after the first epoch). Extend to the other models that will be implemented in https://github.com/bdzyubak/torch-control/issues/15.

bdzyubak commented 2 months ago

Added code to freeze all layers except the classification head of the LLMs. When importing from transformers, different models create differently named head modules, so the layers to keep trainable need to be specified per model.
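
A hedged sketch of that per-model lookup (the dict and helper name are illustrative, not the repo's actual code; the attribute names match the `transformers` implementations of these models):

```python
# Illustrative mapping from `model.config.model_type` to the head
# modules that should remain trainable; names follow transformers.
HEAD_MODULES = {
    "distilbert": ("pre_classifier", "classifier"),
    "bert": ("classifier",),
    "roberta": ("classifier",),
}

def freeze_all_but_head(model):
    model_type = model.config.model_type
    if model_type not in HEAD_MODULES:
        raise NotImplementedError(f"No head mapping for '{model_type}'")
    for param in model.parameters():
        param.requires_grad = False
    for name in HEAD_MODULES[model_type]:
        for param in getattr(model, name).parameters():
            param.requires_grad = True
    return model
```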

When the model is allowed to tune fully, training accuracy climbs to 0.9, but validation accuracy falls quickly after an initial spike as the model overfits and loses pretrained knowledge.

(Figure: training vs. validation accuracy for full fine-tuning.)

Training just the classification head is much better behaved, with validation accuracy increasing to a plateau. However, training accuracy is only 0.65, indicating underfitting. TODO: implement the ability to add a custom classification head of specified complexity, and try training with more weights; see the sketch below.

(Figure: training vs. validation accuracy for head-only fine-tuning.)
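
One possible shape for that TODO, assuming the new head is swapped in by assigning to the model's `classifier` attribute (the helper and its defaults are hypothetical):

```python
import torch.nn as nn

def build_custom_head(in_features, num_labels, depth=2, width=512, dropout=0.1):
    """Hypothetical configurable head: `depth` hidden blocks, then a projection."""
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(in_features, width), nn.ReLU(), nn.Dropout(dropout)]
        in_features = width
    layers.append(nn.Linear(in_features, num_labels))
    return nn.Sequential(*layers)

# e.g. for DistilBERT (hidden size model.config.dim == 768, binary sentiment):
# model.classifier = build_custom_head(model.config.dim, num_labels=2, depth=2)
```

After replacing the head, its fresh parameters still need `requires_grad = True` if the freeze step ran first.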

BERT-base was implemented in this branch, together with WIP code for RoBERTa and Llama-2. Turn the latter two off as not implemented until their code is finalized and merged. End scope creep here, and raise a separate issue to finalize them.