Mukosame / Zooming-Slow-Mo-CVPR-2020

Fast and Accurate One-Stage Space-Time Video Super-Resolution (accepted in CVPR 2020)
GNU General Public License v3.0

about the training phase #51

Closed: yuinnLIU closed this issue 3 years ago

yuinnLIU commented 3 years ago

When I was training the model, I got the following results.

20-12-14 05:05:07.043 - INFO: Model [VideoSRBaseModel] is created.
20-12-14 05:05:07.043 - INFO: Start training from epoch: 0, iter: 0
20-12-14 05:05:16.606 - INFO: Saving the final model.
20-12-14 05:05:16.793 - INFO: End of training.

It seems that the training ended without starting. Does anyone know why? Is it related to the data set? In order to test the code, I reduced the data set a lot.

And the output:

20-12-14 05:05:01.562 - INFO: Random seed: 0
20-12-14 05:05:01.570 - INFO: Temporal augmentation interval list: [1], with random reverse is True.
20-12-14 05:05:01.570 - INFO: Using cache keys: Vimeo7_train_keys.pkl
20-12-14 05:05:01.570 - INFO: Using cache keys - Vimeo7_train_keys.pkl.
20-12-14 05:05:01.571 - INFO: Dataset [Vimeo7Dataset - Vimeo7] is created.
20-12-14 05:05:01.571 - INFO: Number of train images: 3, iters: 1
20-12-14 05:05:01.572 - INFO: Total epochs needed: 600000 for iters 600,000
20-12-14 05:05:06.963 - INFO: Network G structure: DataParallel - LunaTokis, with parameters: 11,102,771

Is there anything wrong with my current output?

I would be grateful if anyone knows a solution.

Mukosame commented 3 years ago

Hi, since it can read the number of images in the training set, the problem probably lies somewhere else, but for some reason no error is printed. So I have two suggestions: try training on 1 GPU, and change the train function so that it prints some information at each stage to help locate where it fails. Please let me know if you have any more questions!
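A minimal sketch of that kind of stage-by-stage logging, assuming a plain Python list as the dataset. `debug_train` and its parameters are hypothetical stand-ins, not functions from this repository:

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s - %(levelname)s: %(message)s")
logger = logging.getLogger("train_debug")


def debug_train(dataset, batch_size, total_iters):
    """Hypothetical stand-in for a train loop that logs every stage."""
    logger.info("Dataset length: %d", len(dataset))
    if len(dataset) == 0:
        # An empty dataset would otherwise loop forever without progress.
        logger.info("Dataset is empty, nothing to train on.")
        return 0

    current_step = 0
    epoch = 0
    while current_step < total_iters:
        logger.info("Entering epoch %d at step %d", epoch, current_step)
        for start in range(0, len(dataset), batch_size):
            batch = dataset[start:start + batch_size]
            current_step += 1
            logger.info("Step %d: batch of %d samples", current_step, len(batch))
            if current_step >= total_iters:
                break
        epoch += 1

    logger.info("Finished after %d steps and %d epochs", current_step, epoch)
    return current_step
```

If the log shows a dataset length or a step count that does not match expectations, that narrows down where the real loop exits early.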

zhanglibo852 commented 3 years ago

Hello, my problem is the same as the problem you mentioned above. Have you solved it now?

xuejiancai commented 3 years ago

I ran into the same problem when training the code, and eventually found the cause in this line of the log above:

Number of train images: 3, iters: 1

The dataset is reported as having only 3 images and 1 iteration. The error comes from here: https://github.com/Mukosame/Zooming-Slow-Mo-CVPR-2020/blob/master/codes/data_scripts/create_lmdb_mp.py#L115

Change key_set to a list there, because it is used when generating the dataset in https://github.com/Mukosame/Zooming-Slow-Mo-CVPR-2020/blob/master/codes/data/Vimeo7_dataset.py#L121 and at line 224 of the same file.

xuejiancai commented 3 years ago

I changed the code at lines 121 and 224 to get the data and the length of the dataset:

121: key = self.paths_GT['keys'][index]
224: return len(self.paths_GT['keys'])

After the modification, I can train the code normally, but I am not sure whether this is correct. @Mukosame
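A small sketch of why indexing into 'keys' changes the reported length. It assumes paths_GT is a dict whose 'keys' entry holds the sample keys. The other field names and all values below are made up for illustration, and it is only a guess that the "3" in the log came from counting such top-level entries:

```python
# Hypothetical stand-in for the cached key metadata. Only the 'keys' field
# name comes from the thread; everything else here is an assumption.
paths_GT = {
    "name": "example_dataset",
    "resolution": "3_256_448",
    "keys": ["00001_0001", "00001_0002", "00001_0003",
             "00001_0004", "00001_0005"],
}


def dataset_length_before(paths):
    # len() of a dict counts its top-level entries, not the samples.
    return len(paths)


def dataset_length_after(paths):
    # Counting the 'keys' list gives the number of samples.
    return len(paths["keys"])


def key_at(paths, index):
    # Works only when 'keys' is an indexable sequence such as a list.
    return paths["keys"][index]


print(dataset_length_before(paths_GT))
print(dataset_length_after(paths_GT))
print(key_at(paths_GT, 1))
```

With this stand-in, the length before the change is the number of dict entries, while the length after the change is the number of sample keys.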

Mukosame commented 3 years ago

Thanks @xuejiancai, your solution is correct! Feel free to create a pull request. I'd be happy to merge it ASAP!

xuejiancai commented 3 years ago

Well, I have submitted a pull request that modifies only a few lines of code. @Mukosame

Mukosame commented 3 years ago

@xuejiancai Thanks for your contribution! It should be merged now!

wuwuxuezhi commented 2 years ago

@xuejiancai Thank you for your solution, but I still get an error after making this change:

key = self.paths_GT['keys'][index]
TypeError: 'set' object is not subscriptable

How can I solve it? @Mukosame @xuejiancai
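One possible cause, assuming the cached keys file was pickled while 'keys' was still a set: a set cannot be indexed, so the lookup fails until the keys are turned into a list. A minimal sketch of the error and a workaround, where `normalise_keys` is a hypothetical helper and not part of the repository:

```python
import pickle


def normalise_keys(meta):
    """Return a copy of meta whose 'keys' entry is a sorted list."""
    fixed = dict(meta)
    fixed["keys"] = sorted(meta["keys"])
    return fixed


# Simulate a cache that was written while 'keys' was still a set.
old_meta = pickle.loads(pickle.dumps({"keys": {"00001_0002", "00001_0001"}}))

try:
    old_meta["keys"][0]
    subscriptable = True
except TypeError:
    # Raised because a set has no positional indexing.
    subscriptable = False

new_meta = normalise_keys(old_meta)
first_key = new_meta["keys"][0]
```

Under that assumption, regenerating the keys file with key_set converted to a list, or converting the loaded keys as above, should make the index lookup work.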