Open linhaojia13 opened 5 months ago
I ran this command to download WebVid-2M:
video2dataset --url_list="results_2M_train.csv" \
    --input_format="csv" \
    --output-format="webdataset" \
    --output_folder="results_2M_train" \
    --url_col="contentUrl" \
    --caption_col="name" \
    --save_additional_columns='[videoid,page_idx,page_dir,duration]' \
    --enable_wandb=True \
    --config=default
However, some of the videos failed to download, as shown in xxxxx_stats.json:
xxxxx_stats.json
{
  "count": 1000,
  "successes": 984,
  "failed_to_download": 16,
  "failed_to_subsample": 0,
  "duration": 402.9706723690033,
  "bytes_downloaded": 2114718582,
  "start_time": 1713343040.4027474,
  "end_time": 1713343443.3734198,
  "status_dict": {
    "success": 984,
    "HTTPSConnectionPool(host='ak.picdn.net', port=443): Read timed out.": 16
  }
}
How can I use video2dataset to re-download only the shards whose stats report a non-zero failed_to_download count?
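To make the question concrete, the affected shards can at least be located by scanning the stats files. The following is a minimal sketch, assuming every shard writes a `<shard>_stats.json` file directly into the output folder with the fields shown in the stats file above. `find_failed_shards` is a hypothetical helper written for illustration and is not part of video2dataset.

```python
import json
from pathlib import Path


def find_failed_shards(output_folder):
    """List shards whose stats file reports at least one failed download.

    Hypothetical helper (not a video2dataset API). Assumes stats files are
    named '<shard>_stats.json' and sit directly inside output_folder.
    Returns a list of (stats_file_name, failed_count, error_counts) tuples.
    """
    failed = []
    for stats_path in sorted(Path(output_folder).glob("*_stats.json")):
        with open(stats_path, encoding="utf-8") as f:
            stats = json.load(f)
        n_failed = stats.get("failed_to_download", 0)
        if n_failed > 0:
            # Keep only the error messages, dropping the "success" counter.
            errors = {
                message: count
                for message, count in stats.get("status_dict", {}).items()
                if message != "success"
            }
            failed.append((stats_path.name, n_failed, errors))
    return failed
```

Calling `find_failed_shards("results_2M_train")` would return one entry per shard with failures, which narrows down which shards need another attempt. How to make video2dataset retry just those shards is the open part of the question.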