sangmin-git / LMC-Memory

Official PyTorch implementation of "Video Prediction Recalling Long-term Motion Context via Memory Alignment Learning" (CVPR 2021 Oral)

question about generating training dataset #4

Closed zzwei1 closed 3 years ago

zzwei1 commented 3 years ago

Hi, nice work! I want to use your LMC-Memory for video prediction, and I'm now generating my own dataset.

I first checked the test set with your pretrained model on Moving MNIST, setting short_len to 10 and out_len to 10 (I didn't use the "long_len" parameter since I was using your pretrained model). Everything works fine there. However, when I try to train the model on my own dataset (my goal is to take 5 frames as input and predict the following 10 frames), I get the error "num_samples should be a positive integer value, but got num_samples=0". I have learned that this error means my dataset isn't being loaded correctly. After checking the code in dataloader.py, I think the bug comes from the following line:

    self.clips += [(video_i, t) for t in range(video_frame_num - seq_len)] if train else [(video_i, t * seq_len) for t in range(video_frame_num // seq_len)]

During training (dataloader.py line 58), the parameter "t" varies in range(video_frame_num - seq_len). However, as I stated, I want to use 5 frames to predict the following 10, so each of my videos has only 15 (5+10) frames, i.e. "video_frame_num" is 15. At the same time, "seq_len" is also 15, so "t" can't take any value from range(0). My current workaround is to change for t in range(video_frame_num - seq_len) to for t in range(video_frame_num // seq_len), as you do when validating or testing. Is this solution correct for my task?

By the way, regarding the parameter "long_len" in my task: does it mean I can set "long_len" to any number from 5 to 15, since I have 5 input frames (i.e., short_len=5) and 15 frames in total? Thanks in advance!

sangmin-git commented 3 years ago

(1) Your point is right. I suggest simply replacing range(video_frame_num - seq_len) with range(video_frame_num - seq_len + 1) for generalizability.

(2) Yes, you can choose long_len from 5 to 15 in such a case.

Thanks!
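For reference, the fixed clip-index logic can be sketched as follows (a minimal, standalone sketch: make_clips is a hypothetical helper name, while the (video_i, t) tuple layout and the range expressions follow the dataloader.py line quoted above):

```python
# Sketch of the clip-index generation with the suggested "+ 1" fix applied.
# make_clips is a hypothetical name; the tuple layout follows dataloader.py.
def make_clips(video_frame_num, seq_len, video_i=0, train=True):
    if train:
        # With the fix, a 15-frame video and seq_len=15 yields exactly one
        # clip starting at t=0, instead of none (range(0) is empty).
        return [(video_i, t) for t in range(video_frame_num - seq_len + 1)]
    # Validation/test: non-overlapping clips, as in the original code.
    return [(video_i, t * seq_len) for t in range(video_frame_num // seq_len)]

print(make_clips(15, 15))                # [(0, 0)]  -> dataset is no longer empty
print(make_clips(30, 15, train=False))   # [(0, 0), (0, 15)]
```

Without the "+ 1", the 15-frame case produces an empty clip list, which is exactly what triggers the "num_samples=0" sampler error.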

zzwei1 commented 3 years ago

Thanks very much !

By the way, I have another 3 small questions,

  1. Are there any considerations for choosing the memory slot size "s"? In the original paper, "s" is set to 100. Is it correct that, in theory, the larger "s" is, the better the results?

  2. How do you calculate the "Inference Time" in the paper? Is there any code for it?

  3. Why are the LPIPS and Inference Time metrics not calculated for the TrajGRU, CDNA, VPN, and PredRNN models? I'm a little confused.

Thanks in advance !

sangmin-git commented 3 years ago
  1. It is recommended to set s considering the complexity of the dataset. Setting an unconditionally large s is not always better.

  2. We used the time.time() function to evaluate the inference time (batch_size=1). Here is a code example.

    time_start = time.time()
    out_pred = pred_model(short_data, None, args.out_len, phase=2)
    time_end = time.time()
    inference_time = time_end - time_start 

    We ran the inference 1000 times and averaged the timings.

  3. Such methods are somewhat outdated, and their LPIPS scores are not reported in papers. We calculated inference time on the recent methods that offer official implementations.
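The averaging procedure in (2) can be sketched as below (a minimal sketch: model_fn stands in for the pred_model(...) call from the snippet above, and the repeat-and-average loop follows the description in the answer):

```python
import time

def average_inference_time(model_fn, n_runs=1000):
    # Time model_fn n_runs times with time.time(), as in the snippet above,
    # and return the mean wall-clock duration in seconds.
    total = 0.0
    for _ in range(n_runs):
        time_start = time.time()
        model_fn()
        time_end = time.time()
        total += time_end - time_start
    return total / n_runs

# Usage with a trivial stand-in for the prediction model:
avg_sec = average_inference_time(lambda: sum(range(10_000)), n_runs=100)
```

Note (my addition, not from the thread): for GPU inference, CUDA kernels launch asynchronously, so calling torch.cuda.synchronize() before reading each timestamp is generally needed for accurate wall-clock timing.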

zzwei1 commented 3 years ago

Thanks very much !