Open topwasu opened 4 years ago
Have we tried actually calling del instead of setting the model variable to None? i.e.,
del model
Yes, I've tried that. Strangely, it didn't work.
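One possible explanation, offered as a guess and not as a diagnosis: del only removes one name, so the object stays alive as long as anything else still refers to it. The sketch below shows that behaviour in plain Python. DummyModel and the lingering dict are hypothetical stand-ins, not names from this repository or from labelmodels.

```python
import gc
import weakref


class DummyModel:
    """Hypothetical stand-in for the label model; holds a large buffer."""

    def __init__(self):
        self.weights = bytearray(10_000_000)


def model_is_alive(ref):
    """Return True if the object behind the weak reference still exists."""
    return ref() is not None


model = DummyModel()
probe = weakref.ref(model)  # lets us observe the object without keeping it alive

# Assumed second reference, e.g. one kept by a trainer, cache, or closure.
lingering = {"model": model}

del model
gc.collect()
alive_after_del = model_is_alive(probe)  # still alive: `lingering` holds it

lingering.clear()
gc.collect()
alive_after_clear = model_is_alive(probe)  # no references left, so it is freed
```

If something similar is happening here, the thing to look for would be whatever else still holds the model (or its tensors) after the del.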
Perhaps this is related: https://github.com/BatsResearch/labelmodels/issues/4
That could be the cause of the OOM error!
During training, labelmodel takes up a lot of memory, causing an out-of-memory error, and reducing the batch size doesn't help. In addition, labelmodel seems to keep holding memory even at the start of a new checkpoint. For now, in the master branch, labelmodel is replaced with majority vote. The memory-debug branch still uses labelmodel and contains several memory profilers to help debug this bug, so debugging should probably happen in that branch.
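For anyone picking this up, here is a minimal sketch of one way to check whether memory survives a checkpoint, using the standard-library tracemalloc module. This is not necessarily one of the profilers in the memory-debug branch. train_one_checkpoint and leak_store are hypothetical stand-ins that simulate a leak. Note that tracemalloc only tracks allocations made through Python's memory allocators, so GPU memory would not show up here.

```python
import tracemalloc


def train_one_checkpoint(leak_store):
    """Hypothetical stand-in for one checkpoint of training."""
    buffer = [bytes(1000) for _ in range(1000)]
    leak_store.append(buffer)  # simulated leak: buffer outlives the checkpoint


tracemalloc.start()
leak_store = []

before = tracemalloc.take_snapshot()
train_one_checkpoint(leak_store)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Compare the two snapshots, grouped by source line.
top_stats = after.compare_to(before, "lineno")
growth_bytes = sum(stat.size_diff for stat in top_stats)

# Print the source lines responsible for the largest differences.
for stat in top_stats[:3]:
    print(stat)
```

If growth_bytes stays large across checkpoints, the printed lines point at where the surviving allocations were made.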