Open LengSicong opened 11 months ago
If you are using the official video2dataset script to download raw videos, YouTube may throttle your request rate, which results in "Too Many Requests" (HTTP 429) errors. To work around this, you can consider techniques such as setting up IP proxies. However, when constructing YT-SB-1B, we only made requests to the interface that serves storyboard images. Fortunately, this interface did not restrict the number of requests (at least not during our crawling process).
Hi, thanks for your prompt reply. May I know how to make requests only to the interface that serves storyboard images? I ask because the official instructions given here use video2dataset to download storyboard images.
We use thumbframes_dl.
May I know whether the storyboard images downloaded through thumbframes_dl contain timestamp information, which could be used to construct the interleaved video-text data in the next step?
You can refer to this code. The time intervals between storyboard images are continuous and fixed, so the timestamps can be inferred.
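As a rough illustration of that inference (not the code referred to above), here is a minimal sketch. It assumes each storyboard sheet is a grid of `rows` x `cols` frames laid out in row-major order, that the first frame sits at t = 0, and that the fixed interval in seconds is already known. The function name and all parameters are hypothetical.

```python
def frame_timestamp(sheet_index, row, col, rows, cols, interval_s):
    """Infer the timestamp (in seconds) of one frame on a storyboard sheet.

    Assumptions (hypothetical, not taken from thumbframes_dl):
    - every sheet holds rows * cols frames in row-major order,
    - frames are spaced by a fixed interval_s,
    - the very first frame corresponds to t = 0.
    """
    frames_per_sheet = rows * cols
    # Global index of the frame across all sheets.
    frame_index = sheet_index * frames_per_sheet + row * cols + col
    return frame_index * interval_s


def sheet_timestamps(sheet_index, rows, cols, interval_s):
    """Timestamps for every frame on one sheet, in row-major order."""
    return [
        frame_timestamp(sheet_index, r, c, rows, cols, interval_s)
        for r in range(rows)
        for c in range(cols)
    ]
```

For example, with 3 x 3 sheets and a 2-second interval, the frame at row 1, column 2 of the second sheet (`sheet_index=1`) has global index 14, so its inferred timestamp is 28 seconds. If the last sheet is only partially filled, the trailing timestamps would need to be discarded.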
Hello, I ran into the same problem ("HTTPError: 429 Client Error: Too Many Requests for url: xxx") when downloading subtitles. Do you have any advice?
Certainly. The most widely used and effective solution is to set up IP proxies, although this requires purchasing an IP proxy service. Another approach is to lengthen the interval between requests; lowering the request frequency may help alleviate the issue.
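One way to lengthen the interval between requests is to retry with exponential backoff whenever a 429 comes back. The sketch below is only an illustration under stated assumptions: `fetch` is a hypothetical callable you supply that performs one request and returns a `(status_code, body)` tuple, and the delay values are arbitrary starting points, not settings known to satisfy YouTube.

```python
import random
import time


def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry with exponential backoff on HTTP 429.

    fetch is a hypothetical user-supplied callable returning
    (status_code, body). sleep is injectable so the waiting
    behaviour can be tested without real delays.
    """
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status != 429:
            # Anything other than "Too Many Requests" is returned as-is.
            return status, body
        if attempt == max_retries:
            break
        # Double the wait on each attempt and add jitter so that
        # parallel workers do not all retry at the same moment.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        sleep(delay)
    raise RuntimeError(f"still rate-limited after {max_retries} retries")
```

With `base_delay=1.0`, the waits are roughly 1 to 2 s, then 2 to 3 s, then 4 to 5 s, and so on. Backoff only reduces pressure from a single IP address; if the limit is still hit, proxies remain the more reliable option.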
Thanks for your work! I tried to use video2dataset to download YT-Temporal-1B, but it reports "too many requests" during the download. Could you give me some advice on how to fix this problem?