sentiment analysis: data set too small, model easily overfits

Digital-Defiance / nlp-metaformer

An ablation study on the transformer network for Natural Language Processing


Closed RuiFilipeCampos closed 5 months ago

RuiFilipeCampos commented 5 months ago

The current dataset has 50k entries.

RuiFilipeCampos commented 5 months ago

I suspect that using a pre-trained tokenizer and a pre-calculated positional encoding would help a lot, since both remove learnable parameters from the model.

A simple feed-forward layer on top of the learnable tokenizer is already enough to overfit.
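
For reference, one common reading of "pre-calculated positional encoding" is the fixed sinusoidal table from the original Transformer architecture. The sketch below assumes that scheme; it is not taken from this repository, and the function name and the `context_len` / `d_model` parameters are hypothetical. Because the table is computed once and never trained, it contributes no parameters that could overfit.

```python
import math


def sinusoidal_positional_encoding(context_len: int, d_model: int) -> list[list[float]]:
    """Build a fixed (non-learnable) context_len x d_model encoding table.

    Hypothetical helper: assumes the sinusoidal scheme, where even columns
    hold sin(pos / 10000^(i / d_model)) and odd columns hold the matching cos.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")

    table = []
    for pos in range(context_len):
        row = []
        # i takes the even column indices 0, 2, 4, ...
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # column i
            row.append(math.cos(angle))  # column i + 1
        table.append(row)
    return table


# Computed once up front and added to the token embeddings; never updated.
pe = sinusoidal_positional_encoding(context_len=4, d_model=8)
```

The table would then be added to the token embeddings as a constant, leaving only the layers above it trainable.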