verlab / StraightToThePoint_CVPR_2020

Original PyTorch implementation of the code for the paper "Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data" at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
GNU General Public License v3.0

running time #2

Closed tingchihc closed 2 years ago

tingchihc commented 2 years ago

Hi,

I am now running the "Prepare the data to train the Agent" step. Is this script expected to take a very long time? I have spent a whole day so far running python download_youcookii_videos.py.

thanks,

washingtonsk8 commented 2 years ago

Hi, @ting-chih

Thanks for sharing this issue with us.

No, it should not take that long, since the script downloads only a subset (110 videos) of the whole YouCook2 dataset (~2000 videos). It will depend on your connection speed, though. However, I just tried running the script myself and noticed the videos are taking far too long to download (more than 20 minutes per video, ~60 KiB/s). I'm not sure whether something is wrong with the YouTube API or the server. We'll investigate and get back to you shortly.

Thank you.

tingchihc commented 2 years ago

Hi, @washingtonsk8

Yes, that matches my problem. I have been downloading since 11 a.m., and so far I only have 25 files in the training folder; some of the YouTube links also cannot be used.

washingtonsk8 commented 2 years ago

Hi, @ting-chih

I solved this issue temporarily by using another downloader. It is a fork of the original youtube-dl, which worked fine for me. To use it, I took the following steps:
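The comment above does not name the fork, so as a sketch I assume yt-dlp, a widely used youtube-dl fork. The video id and output directory below are made-up examples; the helper only builds the command line so you can inspect it before running:

```python
# Sketch only: assumes the youtube-dl fork is yt-dlp (pip install yt-dlp).
# The video id and output directory are hypothetical placeholders.
import subprocess

def build_download_cmd(video_id: str, out_dir: str) -> list[str]:
    """Command line to fetch one YouTube video, naming the file by its id."""
    url = f"https://www.youtube.com/watch?v={video_id}"
    # -o sets yt-dlp's output template: <out_dir>/<video id>.<extension>
    return ["yt-dlp", "-o", f"{out_dir}/%(id)s.%(ext)s", url]

cmd = build_download_cmd("EXAMPLE_ID", "raw_videos/training")
# subprocess.run(cmd, check=True)  # uncomment to actually download
```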

Please tell us if that works for you.

PS: Keep in mind that the downloaded videos' extensions may differ (WebM instead of MP4). You may need a converter such as FFmpeg.
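As a sketch of that conversion step (filenames are placeholders, and FFmpeg must be on your PATH to actually run it), the command can be assembled like this:

```python
# Sketch: build the FFmpeg command to convert a WebM download to MP4.
# Filenames are hypothetical; re-encoding to H.264/AAC keeps the MP4 playable.
import subprocess

def build_convert_cmd(src: str, dst: str) -> list[str]:
    # -y overwrites an existing output file without prompting
    return ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-c:a", "aac", dst]

cmd = build_convert_cmd("video.webm", "video.mp4")
# subprocess.run(cmd, check=True)  # uncomment if FFmpeg is installed
```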

tingchihc commented 2 years ago

Hi, @washingtonsk8

Thanks! Now I have a new question about line 10 of this Python file: in /rl_fast_forward/resources/YouCook2/splits/, I do not see a test_list.txt. Is that expected?

washingtonsk8 commented 2 years ago

Hi, @ting-chih

Great!

Thanks for the question. The reason we do not use a test_list.txt is that, at the time of the paper submission, there was no public test set with the recipe texts available. To get around that, we used the original validation set as a test set to report our results. We found the best set of hyperparameters (learning rate, epsilon, etc.) on a small set of training videos during preliminary experiments and kept these hyperparameters fixed for the final experiments.

I hope it is clear now :-)

tingchihc commented 2 years ago

Hi, @washingtonsk8

Thanks, I got it.

Where should I add these two lines of code to train the model with main.py?

import nltk
nltk.download('punkt')

washingtonsk8 commented 2 years ago

Hi, @ting-chih

Open a Python interpreter and run those two commands first; they download the necessary tokenizer files to your NLTK data folder. Then you can run the "python main.py" command to train the model.
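Concretely, this one-time setup step looks like the following (it requires the nltk package to be installed, and by default the data is stored under ~/nltk_data rather than in the project folder):

```python
# One-time setup: fetch NLTK's 'punkt' tokenizer models before training.
# Run this in a Python shell, then launch "python main.py" as usual.
import nltk

nltk.download("punkt")  # downloads to ~/nltk_data by default
```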

Since the "running time" issue is solved, I'll close this issue. If you have more questions or issues to report, please open another issue so that more people can be aware of it, ok?

Thanks!