XiYe20 / VPTR

The repository for the paper "VPTR: Efficient Transformers for Video Prediction"
MIT License

Checkpoint Size #6

Closed Asma-Alkhaldi closed 1 year ago

Asma-Alkhaldi commented 1 year ago

I'm trying to reproduce the results, and I noticed in Stage 1 (train_AutoEncoder.py) that the output checkpoint size grows with every epoch:

Epoch 1 >> 615 MB
Epoch 2 >> 1.2 GB
Epoch 3 >> 2.5 GB
Epoch 4 >> 4.9 GB
Epoch 10 >> 160.3 GB

I couldn't complete the training because it consumed all the available storage. I'm wondering why the checkpoint size keeps changing and why it is so huge.

XiYe20 commented 1 year ago

Hi, the size of the checkpoint should be constant. Please see the first issue, https://github.com/XiYe20/VPTR/issues/1, where I showed that each checkpoint is the same size. It is hard for me to help you find the problem without more detailed information.
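(Not from the VPTR code itself, just a common PyTorch pitfall that matches these symptoms: if a checkpoint dict stores loss *tensors* that are still attached to the autograd graph, `torch.save` serializes the entire accumulated graph, so each epoch's checkpoint grows. A minimal sketch, with a hypothetical model standing in for the autoencoder:)

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                      # hypothetical stand-in for the autoencoder
opt = torch.optim.Adam(model.parameters())

losses = []
for step in range(3):
    loss = model(torch.randn(8, 4)).pow(2).mean()
    loss.backward()
    opt.step()
    opt.zero_grad()
    # BUG (growing checkpoints): losses.append(loss)
    #   appending the raw tensor keeps its whole autograd graph alive,
    #   and torch.save() would serialize that growing graph every epoch.
    # FIX: detach to a plain Python float so the checkpoint stays constant-size.
    losses.append(loss.item())

ckpt = {
    "model": model.state_dict(),             # save state_dicts, not whole objects
    "optimizer": opt.state_dict(),
    "losses": losses,                        # plain floats only
}
torch.save(ckpt, "ckpt.pt")
```

If the checkpoint only contains `state_dict()`s and plain Python scalars like this, its size on disk should be identical every epoch.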

XiYe20 commented 1 year ago

https://github.com/XiYe20/VPTR/issues/1#issuecomment-1380294936