The dataset download is slow and often gets stuck

pritamqu / AVCAffe

[AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work

https://www.pritamsarkar.com/

The dataset download is slow and often gets stuck #3

Closed youcaiSUN closed 1 year ago

youcaiSUN commented 1 year ago

Hi,

When I use the official code to download all 1918 files, `requests.get()` often gets stuck, probably because too many requests come from the same IP address in a short time.

Could you package the whole dataset into a few zip files, so that we can download them manually?

Thanks very much!
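A call that hangs indefinitely is often a request made without a timeout. The following is a minimal sketch of a timeout plus retry with backoff, not the official download script. It uses the standard-library `urllib` instead of `requests` (with `requests`, the analogous step would be passing `timeout=` to `requests.get()`). The function names, the retry count, and the backoff values are all assumptions chosen for illustration.

```python
import time
import urllib.request
from pathlib import Path


def fetch_bytes(url, timeout=30.0):
    """Fetch a URL. The timeout makes a stalled socket raise instead of hanging."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_with_retry(url, dest, fetch=fetch_bytes, retries=5,
                        backoff=2.0, sleep=time.sleep):
    """Download url to dest, retrying on network errors (hypothetical helper)."""
    dest = Path(dest)
    tmp = dest.with_name(dest.name + ".part")
    for attempt in range(1, retries + 1):
        try:
            tmp.write_bytes(fetch(url))
            # Rename only after a complete write, so a killed run
            # never leaves a truncated file under the final name.
            tmp.replace(dest)
            return dest
        except OSError:
            # URLError and socket timeouts are OSError subclasses.
            if attempt == retries:
                raise
            # Wait longer after each failure to ease pressure on the server.
            sleep(backoff * attempt)
```

Spacing out retries is also a reasonable response if the server really is throttling frequent requests from one address, although the thread never confirms that cause.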

pritamqu commented 1 year ago

Hi, sorry to hear about the issue. However, we are not planning to make any changes to the directory structure right away. Based on our initial testing, the download should not be too slow, but if the problem persists, please let us know and we will do something to resolve it.

youcaiSUN commented 1 year ago

Hi,

I have finally finished downloading the dataset. However, when I checked the integrity of the downloaded files, I found that the task 3 videos of 4 subjects do not exist (nor do the corresponding short video segments).

In total I downloaded 950 videos (vs. 954 in the label file) and 58112 segments (vs. 58118 in the paper).

Could you check what's wrong here? Thanks very much!

pritamqu commented 1 year ago

The pairs aiim101, aiim102 and aiim89, aiim90 accidentally skipped task 3, so the labels for those segments should be discarded. If you use the dataloader we released, it will take care of this issue.
The number of files you downloaded is correct. We cleaned out a few files that had issues but missed updating the paper.

Would you mind sharing how long it took you to download the entire dataset? Please let me know if you have any more questions.
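For anyone running their own integrity check instead of the released dataloader, here is a rough sketch of how the four known task 3 gaps could be excluded. It assumes each video can be reduced to a (participant, task) pair; that representation, the integer task numbering, and the function name are hypothetical and not taken from the AVCAffe code or label file.

```python
# Participants who, per the maintainer, accidentally skipped task 3.
SKIPPED_TASK3 = ("aiim101", "aiim102", "aiim89", "aiim90")
KNOWN_MISSING = {(participant, 3) for participant in SKIPPED_TASK3}


def unexpected_missing(expected, downloaded):
    """Return label entries with no downloaded video, ignoring the known gaps.

    Both arguments are iterables of (participant_id, task_number) pairs.
    """
    return sorted(set(expected) - set(downloaded) - KNOWN_MISSING)


# The video counts in this thread are consistent with exactly these four gaps.
assert 954 - len(KNOWN_MISSING) == 950
```

An empty result from `unexpected_missing` would mean every label entry is either on disk or one of the four documented gaps.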

youcaiSUN commented 1 year ago

OK, I got it, thanks very much!

It took roughly 6 days, because I had to rerun the code every time it got stuck.

However, the actual download time should be much less than that. I found that it usually takes about 15 seconds to download one item, so the total time would be close to 8 hours if everything worked fine.
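The 8-hour figure follows from the numbers quoted in this thread: 1918 files at about 15 seconds each. A quick check (the function name is only illustrative):

```python
def estimated_hours(n_files, seconds_per_file):
    """Total sequential download time in hours."""
    return n_files * seconds_per_file / 3600


hours = estimated_hours(1918, 15)
print(f"{hours:.2f} hours")  # prints "7.99 hours"
```

Against roughly 6 days (144 hours) of wall-clock time, that suggests most of the time went to stalls and restarts, not to transferring data.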

pritamqu commented 1 year ago

Hi, thanks for your response. Yeah, 6 days is really bad. Do you know why it was getting stuck? Is it a local internet connection issue, or does it happen specifically when communicating with the dataset repository?

Thanks.

youcaiSUN commented 1 year ago

Sorry, I do not know the reason. I do not think it is caused by my local internet connection. My workaround is to kill the program and rerun it. Each run manages to download a few samples (about 5-50) before it gets stuck again.
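The kill-and-rerun workaround is cheaper if each run skips what is already on disk. Below is a minimal sketch of that filter. It assumes every file is saved under the last path component of its URL, which may not match how the official script names files, and `pending_downloads` is a hypothetical helper.

```python
from pathlib import Path


def pending_downloads(urls, out_dir):
    """Return the URLs whose target file is missing or empty in out_dir."""
    out_dir = Path(out_dir)
    todo = []
    for url in urls:
        # Assumed naming scheme: last path component of the URL.
        target = out_dir / url.rsplit("/", 1)[-1]
        # An empty file is treated as a failed download and fetched again.
        if not (target.exists() and target.stat().st_size > 0):
            todo.append(url)
    return todo
```

A size check alone cannot detect a file that was cut off partway, so writing to a temporary name and renaming on completion is a sensible companion to this filter.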

youcaiSUN commented 1 year ago

Hi, I would also like to ask when the face crops will be available. Thanks!

pritamqu commented 1 year ago

We will try to upload them as soon as possible. Thanks, @youcaiSUN

youcaiSUN commented 1 year ago

ok!

pritamqu commented 1 year ago

I checked with others and the download speed seems normal. Since you were also able to download the dataset, I am closing the issue.