XiYe20 / VPTR

The repository for paper VPTR: Efficient Transformers for Video Prediction
MIT License

hello! one function is missing #1

Closed xvxv1024 closed 2 years ago

xvxv1024 commented 2 years ago

image

XiYe20 commented 2 years ago

Hi, for train_AutoEncoder.py, it is defined in Line 143: gan_loss = GANLoss('vanilla', target_real_label=1.0, target_fake_label=0.0).to(device)

For train_FAR and train_NAR, the default settings do not use the GAN loss, so we commented it out.

xvxv1024 commented 2 years ago

Thank you for your help! I am a new graduate student and I have another question: why does the weight file "data.pkl" keep getting bigger? When I used PredRNN before, the size of the file did not change.

XiYe20 commented 2 years ago

You are welcome. I am sorry, but I don't know the "data.pkl" file you are talking about. The trained models or checkpoints should be saved as ".tar" files.

xvxv1024 commented 2 years ago

Thank you for your answer. I'm really sorry to keep bothering you; your paper is very helpful to me, so I want to study it carefully. I found that the ".tar" file was getting bigger, so I decompressed it and found that the file "data.pkl" inside it was getting bigger. The ".tar" file is generated when I execute "python train_AutoEncoder.py". image image

XiYe20 commented 2 years ago

Hi, I've never encountered this problem. The checkpoint files should be the same size across epochs (see the attached screenshot). The ".tar" files are saved automatically by PyTorch; please read the official documentation for any information about the decompressed files. Screen Shot 2022-05-14 at 10 43 10 PM
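
For anyone comparing checkpoints from different epochs, here is a minimal sketch for seeing which entries inside a checkpoint take up the space. It assumes the checkpoint was written by a recent version of torch.save, which, despite the ".tar" extension used here, normally produces a zip-format archive (that is where "data.pkl" comes from). The function name archive_entry_sizes is hypothetical and not part of this repository.

```python
import sys
import zipfile


def archive_entry_sizes(ckpt_path):
    """Return (name, uncompressed_size) pairs for a zip-format checkpoint,
    sorted from largest to smallest entry."""
    if not zipfile.is_zipfile(ckpt_path):
        # Older PyTorch versions used a different on-disk format.
        raise ValueError(f"{ckpt_path} is not a zip-format archive")
    with zipfile.ZipFile(ckpt_path) as zf:
        entries = [(info.filename, info.file_size) for info in zf.infolist()]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    # Pass one or more checkpoint paths on the command line.
    for path in sys.argv[1:]:
        print(path)
        for name, size in archive_entry_sizes(path)[:10]:
            print(f"  {size:>12d}  {name}")
```

Running this on the checkpoints of two different epochs and comparing the largest entries should show whether the growth is in "data.pkl" itself or in the tensor data stored next to it.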

Asma-Alkhaldi commented 1 year ago

Could you please send me your train_AutoEncoder.py class? It's possible that a slight change in the code may have caused the issue we encountered.

XiYe20 commented 1 year ago

Hi, to ensure reproducibility, the checkpoint save function automatically saves the code files at each epoch: https://github.com/XiYe20/VPTR/blob/b876364ee19100dccde35ef402bcc2fb1930fdf1/utils/train_summary.py#L135. I suspect that your "ckpt_save_dir" is in the same directory as the train_AutoEncoder.py file, which means all previous checkpoints are also saved again in each following epoch, so the checkpoint files grow larger and larger. Could you please check this? Thank you very much.
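
A quick way to check the suspicion above is to test whether the checkpoint directory lies inside the code directory. The following is only a sketch: the helper name checkpoint_dir_is_nested and the example paths are hypothetical, and it does not inspect what the repository's save function actually copies.

```python
from pathlib import Path


def checkpoint_dir_is_nested(ckpt_save_dir, code_dir):
    """Return True if ckpt_save_dir is code_dir itself or lies anywhere beneath it."""
    ckpt = Path(ckpt_save_dir).resolve()
    code = Path(code_dir).resolve()
    return ckpt == code or code in ckpt.parents


if __name__ == "__main__":
    # Hypothetical paths: replace with your own ckpt_save_dir and the
    # directory that contains train_AutoEncoder.py.
    if checkpoint_dir_is_nested("./checkpoints", "."):
        print("ckpt_save_dir is inside the code directory; "
              "earlier checkpoints may be copied again at every epoch.")
    else:
        print("ckpt_save_dir is outside the code directory.")
```

If the check reports that the directory is nested, pointing "ckpt_save_dir" at a location outside the repository folder is a reasonable thing to try.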