gnosisyuw / CrevNet-Traffic4cast


About MovingMnist #4

Closed · Yeoninm closed this 3 years ago

Yeoninm commented 3 years ago

Hi

I think this is a great paper. I used the code provided on the ICLR page for Moving MNIST with the same parameters as in the paper, but the performance is not very good. Is there something wrong on my side?

Thanks!


gnosisyuw commented 3 years ago

Can you share the code you are using? BTW, did you make any changes to the code from ICLR? Please try the default settings. The Conv3D version is sensitive to changes in the learning rate.

Yeoninm commented 3 years ago

> Can you share the code you are using? BTW, did you make any changes to the code from ICLR? Please try the default settings. The Conv3D version is sensitive to changes in the learning rate.

I modified the learning rate to 0.0002... OK, thank you very much! I will try the default settings.

Yeoninm commented 3 years ago

> Can you share the code you are using? BTW, did you make any changes to the code from ICLR? Please try the default settings. The Conv3D version is sensitive to changes in the learning rate.

But it is too slow: each iteration takes 110 minutes on a Quadro RTX 6000...

gnosisyuw commented 3 years ago

> > Can you share the code you are using? [...]
>
> But it is too slow: each iteration takes 110 minutes on a Quadro RTX 6000...

I tried to rerun the code and found that using a newer version of PyTorch significantly changes the results. I don't know why, but please use PyTorch 1.0 or 1.1. If you still cannot replicate the results on Moving MNIST, I will share the weights with you privately.
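For reference, a rough version guard at the top of the training script can catch this early. This snippet is just an illustration, not part of the repo:

```python
import torch

# Rough guard: the Moving MNIST results were only reproduced on PyTorch 1.0/1.1.
# Note: a prefix check like this would also match e.g. "1.10"; it is only
# meant as a quick sanity check.
if not torch.__version__.startswith(("1.0", "1.1")):
    raise RuntimeError(
        "Use PyTorch 1.0 or 1.1 to reproduce the Moving MNIST results; "
        "found " + torch.__version__
    )
```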

The generated sequences after the first epoch should look like the image below.

[image: generated sequences after the first epoch]

Yeoninm commented 3 years ago

> I tried to rerun the code and found that using a newer version of PyTorch significantly changes the results. I don't know why, but please use PyTorch 1.0 or 1.1. If you still cannot replicate the results on Moving MNIST, I will share the weights with you privately.

I tried many times but still can't replicate the results. Can I get your weights? Thanks for your time.

email: m01025190524@gmail.com

gnosisyuw commented 3 years ago

> I tried many times but still can't replicate the results. Can I get your weights? Thanks for your time.

Can you be more specific about what happened after you tried the default settings on PyTorch 1.1 or 1.0? Although it is OK for me to share the weights, it makes no sense to me that you are still having replication issues now.

[images]

As I said, I re-ran the code on Ubuntu 16 on both a TITAN V and a 2080 Ti; SSIM can reach 0.865 after 3 epochs, and now SSIM reaches 0.928 after 35 epochs... It would be appreciated if you could help me find any unexpected reproduction issue.
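For reference, the SSIM reported here is the standard per-frame structural similarity between predicted and ground-truth frames. A minimal sketch of the check, assuming scikit-image (illustrative, not the repo's exact evaluation code):

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

# pred and target stand in for a predicted and a ground-truth frame,
# as float arrays in [0, 1] with shape (H, W).
pred = np.random.rand(64, 64)
target = np.random.rand(64, 64)

score = ssim(target, pred, data_range=1.0)
print(f"SSIM: {score:.4f}")
```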

mquanbui commented 3 years ago

Hi @gnosisyuw, I want to say that I appreciate the paper and your effort to maintain this repo and help with reproducing the results. Originally I had a similar issue to @Yeoninm, but now I'm able to reproduce the results more or less (with PyTorch 1.1). I haven't completed the training yet, since it indeed takes quite some time. I got the following results after 4 epochs:

```
[00] mse loss: 0.0044574 | ssim loss: 0.7772918 (0)
[01] mse loss: 0.0015796 | ssim loss: 0.8402989 (80000)
[02] mse loss: 0.0012285 | ssim loss: 0.8477553 (160000)
[03] mse loss: 0.0010553 | ssim loss: 0.8458577 (240000)
```

using the default options of your code submitted to ICLR, i.e.

```
Namespace(batch_size=16, beta=0.0001, beta1=0.9, channels=1, data_root='data',
          data_threads=5, data_type='sequence', dataset='smmnist', epoch_size=5000,
          image_width=64, log_dir='logs/smmnist-2/model_mnist=layers_8=seq_len_18=batch_size_16',
          lr=0.0005, max_step=20, model='crevnet', model_dir='', n_eval=18,
          n_future=10, n_past=8, name='', niter=60, num_digits=2, optimizer='adam',
          predictor_rnn_layers=8, rnnsize=32, seed=1)
```

The SSIM for epoch 03 isn't quite as high as in your run above, but the training loss is similar, so it gives me some confidence that the result is correct.

However, I do have a question regarding images with only 1 channel. I notice that you generate the Moving MNIST dataset on the fly and that the images have 3 channels. I tried the original dataset downloaded directly from here with the options above, but the result isn't as good: I was only able to get an SSIM of around 0.81 even after 60 epochs. Given that the method splits the images channelwise for the two-way autoencoder, can you comment on how I should use this approach for single-channel images?

Again, thanks a lot for your time. I really look forward to using this method for my science application.

gnosisyuw commented 3 years ago

> However, I do have a question regarding images with only 1 channel. [...] Given that the method splits the images channelwise for the two-way autoencoder, can you comment on how I should use this approach for single-channel images?

Thank you for your interest. I will answer your questions one by one.

  1. Although the Moving MNIST dataset we generate is 3-channel, during training and testing we actually use only one channel of each frame. There is an additional temporal dimension (T=3), and 3D convolutional filters are applied to tensors of size (C×T×H×W); see the sketch after this list. Note that it is also OK to use the Conv2D counterpart, as reported in this repo, on the Moving MNIST dataset; the Conv2D version is more stable than Conv3D w.r.t. changes of hyperparameters, but it is more computationally intensive.
  2. I am not sure whether you made any architectural modifications when you used the original small Moving MNIST dataset. But since PredRNN and Video Pixel Network, it has become routine to generate the data on the fly, especially considering that all follow-up works, including PredRNN++ and E3D-LSTM, were done by the same group as PredRNN.
  3. When dealing with 1-channel data, the first recommended way is not to split it but to feed the same frame into each of the two branches. Remember to apply pixel shuffling before doing any convolutions. For example, if each input frame is 128x128x1, first reshape it to 32x32x16 using pixel shuffle and copy it as the input to both branches (see the sketch below). Alternatively, you can stack frames: for t=1, the input will be 128x128x4, stacking the 1st-4th frames.
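To make points 1 and 3 concrete, here is a minimal sketch. The `space_to_depth` helper and all shapes are illustrative (PyTorch 1.1 predates `nn.PixelUnshuffle`, so the inverse pixel shuffle is written out by hand):

```python
import torch
import torch.nn as nn

def space_to_depth(x, r):
    """Inverse pixel shuffle: (B, C, H, W) -> (B, C*r*r, H//r, W//r)."""
    b, c, h, w = x.shape
    x = x.view(b, c, h // r, r, w // r, r)
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
    return x.view(b, c * r * r, h // r, w // r)

# Point 3: a 128x128 single-channel frame becomes 16 channels of 32x32,
# and the same tensor is fed into both branches of the two-way autoencoder.
frame = torch.randn(1, 1, 128, 128)            # (B, C, H, W)
reshaped = space_to_depth(frame, r=4)          # -> (1, 16, 32, 32)
branch_a, branch_b = reshaped, reshaped.clone()

# Point 1: for the Conv3D variant, stack T=3 consecutive frames along a
# temporal axis so 3D filters see tensors of size (B, C, T, H, W).
frames = [torch.randn(1, 16, 32, 32) for _ in range(3)]
clip = torch.stack(frames, dim=2)              # -> (1, 16, 3, 32, 32)
out = nn.Conv3d(16, 16, kernel_size=3, padding=1)(clip)
```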
Mareeta26 commented 2 years ago

@gnosisyuw Hi, thanks for the repo. I was getting blurry results for Moving MNIST on PyTorch 1.9, so I tried PyTorch 1.1, but it gave me the following error: "RuntimeError: data/MNIST/processed/training.pt is a zip archive (did you mean to use torch.jit.load()?)". Can you please advise how to proceed? @mquanbui, can you please tell me how you were able to run on PyTorch 1.1?
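One workaround I am considering (untested): PyTorch 1.6+ saves .pt files in a zip-based format that PyTorch 1.1 cannot read, so the dataset file could first be re-saved in the legacy format from a newer PyTorch installation:

```python
import torch

# Run once under PyTorch >= 1.6: load the zip-format file, then re-save
# it with the legacy serializer so that PyTorch 1.1 can read it.
data = torch.load("data/MNIST/processed/training.pt")
torch.save(
    data,
    "data/MNIST/processed/training.pt",
    _use_new_zipfile_serialization=False,
)
```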