Digital-Defiance / nlp-metaformer

An ablation study on the transformer network for Natural Language Processing

sentiment analysis: data set too small, model easily overfits #8

Closed by RuiFilipeCampos 5 months ago

RuiFilipeCampos commented 5 months ago

The current dataset has only 50k entries.

RuiFilipeCampos commented 5 months ago

https://www.kaggle.com/datasets/kazanova/sentiment140

RuiFilipeCampos commented 5 months ago

https://www.kaggle.com/datasets/bittlingmayer/amazonreviews/data

RuiFilipeCampos commented 5 months ago

I suspect that using a pre-trained tokenizer and a pre-calculated positional encoding would help a lot.

A simple feed-forward layer on top of the learnable tokenizer is already enough to overfit.
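
As a rough sketch of what a pre-calculated positional encoding could look like: the snippet below assumes the fixed sinusoidal scheme from the original transformer paper, which this issue does not actually specify. The function name and the sizes are illustrative only, not taken from this repo.

```python
import math


def sinusoidal_positional_encoding(seq_len, dim):
    """Return a seq_len x dim table of fixed (non-learned) position vectors.

    Assumption: sinusoidal scheme; the issue does not say which one is meant.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, dim, 2):
            # The frequency decreases geometrically with the channel index.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))  # even channel
            row.append(math.cos(angle))  # odd channel
        table.append(row)
    return table


# Computed once and added to the token embeddings as a constant,
# so it contributes no trainable parameters that could overfit.
pe = sinusoidal_positional_encoding(seq_len=4, dim=8)
```

Because the table is constant, it removes one set of learnable weights from the model, which is the point of the suggestion above.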

RuiFilipeCampos commented 5 months ago

https://github.com/Digital-Defiance/llm-voice-chat/releases/tag/dataset-release-amazon-reviews