Closed gpantaz closed 1 year ago
I'm running into the same problem. Have you managed to solve this issue? If so, could you please tell me how you overcame it?
Hello, sadly no. I was splitting my resources across different experiments to speed up the process, so I ended up running one experiment at a time :/
Try adding `tokenizer_pool.close()` at the end of the `train_scst` function.
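For context on why this helps: assuming `tokenizer_pool` is a `multiprocessing.Pool` (e.g. used to tokenize captions for reward computation during SCST), a pool that is created inside the training function and never closed leaves its worker processes alive, and their memory accumulates across epochs. A minimal sketch of the fix, using made-up `tokenize` and `train_scst_epoch` names (the real `train_scst` in the repo will differ):

```python
import multiprocessing as mp


def tokenize(caption):
    # Placeholder for the real tokenization used in reward computation.
    return caption.split()


def train_scst_epoch(captions):
    # Hypothetical per-epoch helper; illustrates the pool lifecycle only.
    tokenizer_pool = mp.Pool(processes=2)
    try:
        tokens = tokenizer_pool.map(tokenize, captions)
    finally:
        # Without close()/join(), the worker processes outlive the epoch
        # and keep accumulating memory across epochs.
        tokenizer_pool.close()
        tokenizer_pool.join()
    return tokens


if __name__ == "__main__":
    tokens = train_scst_epoch(["a dog runs", "a cat sleeps"])
    print(tokens)
```

Equivalently, `with mp.Pool(processes=2) as tokenizer_pool:` closes the pool automatically when the block exits.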
Hello,
Many thanks for releasing the repo. I am trying to train a model on a custom variation of MSCOCO, keeping the train/test/valid sizes equal to the Karpathy split. I have no issues training a model without RL optimization. However, I have noticed that during RL optimization the required memory increases with each epoch. I am training the model on an RTX 2080; each epoch lasts approximately 3-4 hours and occasionally runs out of memory. I tried to check whether any additional allocations accumulate from epoch to epoch. Is this expected?
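One stdlib way to check for allocations accumulating from epoch to epoch is `tracemalloc` (for GPU memory, `torch.cuda.memory_allocated()` plays the same role). A minimal CPU-side sketch, where `leaky_epoch` is a made-up stand-in for one RL training epoch that accidentally retains references:

```python
import tracemalloc

cache = []  # simulates state that accidentally grows each epoch


def leaky_epoch():
    # Hypothetical stand-in for one RL epoch that leaks references.
    cache.append([0] * 100_000)


tracemalloc.start()
snapshots = []
for epoch in range(3):
    leaky_epoch()
    current, peak = tracemalloc.get_traced_memory()
    snapshots.append(current)
    print(f"epoch {epoch}: {current / 1e6:.1f} MB currently allocated")
tracemalloc.stop()

# A steadily growing series indicates accumulated allocations.
assert snapshots[2] > snapshots[1] > snapshots[0]
```

If the per-epoch numbers keep climbing, something (a pool, a list of losses kept with their computation graphs, a cache) is holding references between epochs.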
Thank you :)