Resuming a network uses more memory than training from scratch

cavalleria / cavaface

Face recognition training project (PyTorch)

MIT License

456 stars, 88 forks

Closed: xsacha closed this issue 4 years ago

xsacha commented 4 years ago

I haven't seen this issue before in similar PyTorch training setups. I can normally train with a batch size of 256, but when resuming from a checkpoint I have to drop to 224.

It seems like some memory from loading the resumed model is never freed.

Edit: I resolved the issue by adding in: del(checkpoint)
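For anyone hitting the same thing, here is a minimal sketch of a resume routine with that fix applied. It is not cavaface's actual code: the function name `resume` and the checkpoint keys `"model"`, `"optimizer"` and `"epoch"` are assumptions made up for illustration. The likely cause is that the loaded checkpoint dict keeps its own copy of every tensor alive until the last reference to it is dropped. Loading with `map_location="cpu"` and calling `torch.cuda.empty_cache()` are extra steps beyond the fix described above.

```python
import os
import tempfile

import torch
import torch.nn as nn


def resume(model, optimizer, path, device):
    """Restore model/optimizer state, then drop the checkpoint dict.

    Hypothetical helper; the key names below are assumed, not cavaface's.
    """
    # Load onto the CPU so the temporary copy of the weights does not
    # occupy GPU memory next to the live model.
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])
    start_epoch = checkpoint["epoch"] + 1
    # Drop the last reference to the loaded tensors (the fix from this issue).
    del checkpoint
    if device.type == "cuda":
        # Release cached blocks that are no longer in use.
        torch.cuda.empty_cache()
    return start_epoch


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(8, 4).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Write a small checkpoint to a temporary file, then resume from it.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict(), "epoch": 4},
    path,
)

start_epoch = resume(model, optimizer, path, device)
print(start_epoch)
```

Keeping the load inside a function also helps, since any other local references to the loaded tensors go away when the function returns.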

cavalleria commented 4 years ago

Thanks for your contribution!