Digital-Defiance / nlp-metaformer

An ablation study on the transformer network for Natural Language Processing

Context window causes bloated GPU memory footprint #24

Closed RuiFilipeCampos closed 8 months ago

RuiFilipeCampos commented 8 months ago

Currently stuck at a batch size of 16.

This might not be acceptable given the large size of this dataset.

RuiFilipeCampos commented 8 months ago

I might need to remove the longest data points so that the larger context window isn't needed.
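
Before dropping anything, it may help to check how much data a given context window would actually keep. A rough sketch of that check follows; `fraction_kept` and the example token counts are hypothetical and not taken from this repo.

```python
from typing import Sequence


def fraction_kept(lengths: Sequence[int], context_window: int) -> float:
    """Return the share of data points whose token count fits the window."""
    if not lengths:
        return 0.0
    kept = sum(1 for n in lengths if n <= context_window)
    return kept / len(lengths)


# Hypothetical token counts, for illustration only (not real dataset stats).
example_lengths = [12, 40, 75, 130, 512, 900]
for window in (64, 128, 512):
    print(window, fraction_kept(example_lengths, window))
```

If a small window already keeps most of the data, removing the few long data points costs little.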

RuiFilipeCampos commented 8 months ago

I'm just going to slice the batches in the Celery process.

That step can also exclude the data points that exceed the context window.

There is plenty of time for this in between slices being trained.
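
A minimal sketch of the slicing idea, assuming each data point is a sequence of token ids. `slice_batches`, `context_window` and `slice_size` are hypothetical names, and this is plain Python rather than the actual Celery task in this repo; the same generator could be called from inside a worker.

```python
from typing import Iterable, Iterator, List, Sequence


def slice_batches(
    samples: Iterable[Sequence[int]],
    context_window: int,
    slice_size: int,
) -> Iterator[List[Sequence[int]]]:
    """Yield slices of at most `slice_size` samples, skipping oversized ones."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    current: List[Sequence[int]] = []
    for tokens in samples:
        if len(tokens) > context_window:
            continue  # drop data points that exceed the context window
        current.append(tokens)
        if len(current) == slice_size:
            yield current
            current = []
    if current:
        yield current  # final, possibly shorter slice


# Hypothetical usage: token id lists of varying length.
data = [[1], [1, 2, 3], [1, 2], [4], [5, 6]]
for batch in slice_batches(data, context_window=2, slice_size=2):
    print(batch)
```

Because it is a generator, only one slice is held in memory at a time, so the next slice can be prepared while the previous one is being trained.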