Digital-Defiance / nlp-metaformer

An ablation study on the transformer network for Natural Language Processing

redis buffer limit #31

Closed RuiFilipeCampos closed 8 months ago

RuiFilipeCampos commented 8 months ago

might need to patch this by passing files via a shared volume

will see if that can be avoided
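For reference, the shared-volume workaround mentioned above could look roughly like the sketch below: the large payload is written to a directory that both processes mount, and only the short file path travels through redis. The helper names `stash_payload` and `load_payload` are hypothetical, and the temporary directory stands in for the shared volume; none of this is taken from the repository.

```python
import tempfile
import uuid
from pathlib import Path


def stash_payload(shared_dir: Path, payload: bytes) -> str:
    """Write the payload to the shared directory and return its path as a string."""
    path = shared_dir / f"{uuid.uuid4().hex}.bin"
    path.write_bytes(payload)
    return str(path)


def load_payload(path: str, delete: bool = True) -> bytes:
    """Read the payload back; by default remove the file once it has been read."""
    file_path = Path(path)
    data = file_path.read_bytes()
    if delete:
        file_path.unlink()
    return data


# Demo: a temporary directory plays the role of the shared volume (assumption).
with tempfile.TemporaryDirectory() as shared:
    reference = stash_payload(Path(shared), b"one slice of training data")
    # Only `reference` (a short string) would go through the broker or result backend.
    restored = load_payload(reference)
    print(len(restored), "bytes restored")
```

The trade-off is the one raised in the comment: the payload no longer counts against any redis buffer, but it does touch the disk.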

RuiFilipeCampos commented 8 months ago

I've increased the memory available to the redis process

all three of these processes should actually have the same amount of memory available to them, since they all hold the whole slice at some point

RuiFilipeCampos commented 8 months ago

[screenshot: 2024-02-07-201737_831x808_scrot]

seems to be performing quite well. I'm just gonna wait and see if it breaks again; hopefully the memory increase was enough

RuiFilipeCampos commented 8 months ago

I changed the result backend to sqlite

not too happy about it, since I wanted to avoid using the disk

will look into this redis issue, but it's not critical at the moment as long as the sqlite result backend works as intended
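A minimal sketch of what a sqlite result backend looks like in Celery settings, assuming the SQLAlchemy-based database backend (`db+sqlite:///...` URLs) is what was used. The file name `results.sqlite`, the broker URL, and the helper `sqlite_result_backend_url` are hypothetical and not taken from the repository.

```python
from pathlib import Path


def sqlite_result_backend_url(db_path: str) -> str:
    """Build a Celery result-backend URL pointing at a SQLite file.

    Celery's database backend takes a SQLAlchemy URL prefixed with "db+".
    A relative path yields three slashes, an absolute POSIX path yields four.
    """
    return f"db+sqlite:///{Path(db_path).as_posix()}"


# Hypothetical settings: redis stays as the broker, results go to disk.
CELERY_SETTINGS = {
    "broker_url": "redis://localhost:6379/0",
    "result_backend": sqlite_result_backend_url("results.sqlite"),
}

print(CELERY_SETTINGS["result_backend"])

# With Celery and SQLAlchemy installed, the settings could be applied with:
#   from celery import Celery
#   app = Celery("tasks")
#   app.conf.update(CELERY_SETTINGS)
```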

RuiFilipeCampos commented 8 months ago

didn't work. Tomorrow I'll set up the redis.conf file to handle the buffer limit thing
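Assuming "the buffer limit thing" refers to Redis's `client-output-buffer-limit` directive (the thread does not say so explicitly), the redis.conf entry has the form `client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>`. The sketch below only renders such a line. The client class and sizes are placeholder values, not the ones used in this project, and `buffer_limit_line` is a hypothetical helper.

```python
CLIENT_CLASSES = {"normal", "replica", "pubsub"}


def buffer_limit_line(client_class: str, hard: str, soft: str, soft_seconds: int) -> str:
    """Render one client-output-buffer-limit directive for redis.conf.

    A client is disconnected once its output buffer reaches the hard limit,
    or stays above the soft limit for soft_seconds in a row.
    """
    if client_class not in CLIENT_CLASSES:
        raise ValueError(f"unknown client class: {client_class!r}")
    return f"client-output-buffer-limit {client_class} {hard} {soft} {soft_seconds}"


# Placeholder values (assumption): raise the limits for pub/sub clients.
line = buffer_limit_line("pubsub", "256mb", "64mb", 60)
print(line)
```

Setting all three numbers to 0 for a class disables the limit for that class, at the cost of letting a slow client grow the server's memory use without bound.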

the model is converging quite well tho

[plot: newplot(4)]

this is train loss, but they're all unseen batches, so it still indicates that the model is generalizing. The number of steps in an epoch goes up to about 12.5k (there are 3M data points in the train split)
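As a quick sanity check on those figures: if every data point is consumed exactly once per epoch (an assumption, since the thread gives only rounded numbers), 3M points over about 12.5k steps implies roughly 240 data points per step.

```python
import math


def steps_per_epoch(num_points: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every data point once."""
    return math.ceil(num_points / batch_size)


NUM_POINTS = 3_000_000  # "3M data points in the train split" (rounded)
NUM_STEPS = 12_500      # "about 12.5k" steps per epoch (rounded)

# Batch size implied by the two rounded figures above.
implied_batch_size = NUM_POINTS / NUM_STEPS
print(implied_batch_size)

# Going the other way recovers the step count.
print(steps_per_epoch(NUM_POINTS, int(implied_batch_size)))
```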

RuiFilipeCampos commented 8 months ago

https://github.com/redis/redis/issues/11621

It seems that the client is not emptying db2 fast enough.

RuiFilipeCampos commented 8 months ago

Seems like an open issue in celery

https://github.com/celery/celery/issues/4983#issuecomment-518302708

RuiFilipeCampos commented 8 months ago

this has been resolved

it was also a symptom of a deeper issue