Closed by njustesen 1 month ago
I also tried copying it to my own OneDrive account, but it stops after 1 hour of copying (just 6 GB), even after paying for 1 TB of space. I don't think OneDrive is well suited for sharing this much data. Could you upload it somewhere else @TeaPearce?
Hmm, thanks for letting me know and sorry about this issue. Let me have a think about the best way forward...
Hello
Any news on this issue? I have the same problem. Is it possible to make zips of 100 GB?
Hi, thank you for sharing this great dataset. I have the same problem here. Is there any new sharing method available?
I actually struggled to find somewhere to host datasets of this size, and resorted to using a personal OneDrive. I suppose currently they have to be downloaded by manually selecting chunks of files of small enough size. I realize this is not ideal. If anyone has ideas about another hosting platform I'd be happy to hear.
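For anyone selecting files by hand, here is a minimal sketch of how the grouping could be planned ahead of time. It assumes you already have a listing of file names and sizes. The helper name `batch_by_size` and the limit you pass in are hypothetical, and nothing here talks to OneDrive itself.

```python
def batch_by_size(files, limit_bytes):
    """Greedily group (name, size_in_bytes) pairs into batches.

    Each batch's total size stays at or below limit_bytes.
    The limit is an assumption supplied by the caller.
    """
    batches, current, total = [], [], 0
    for name, size in files:
        if size > limit_bytes:
            raise ValueError(f"{name} alone exceeds the limit")
        # Start a new batch when adding this file would exceed the limit.
        if current and total + size > limit_bytes:
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch is a list of file names that could be selected and downloaded together. The grouping is greedy and keeps the original file order, so it is not guaranteed to produce the fewest possible batches.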
Maybe Hugging Face?
OK, enough people complained about this that I finally got around to reuploading the dataset in a structure that should make downloading less painful. dataset_dm_scraped_dust2_tars contains chunks of 200 files together in .tar format.
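Once some of the .tar chunks are downloaded, they can be unpacked in one pass. The following is a minimal sketch using Python's standard `tarfile` module. The function name `extract_tar_chunks` and the directory arguments are hypothetical, and the internal layout of the archives is not assumed beyond them being plain .tar files.

```python
import tarfile
from pathlib import Path


def extract_tar_chunks(src_dir, dest_dir):
    """Extract every .tar file found in src_dir into dest_dir.

    Returns the names of the archives that were processed, in sorted order.
    Only use this on archives from a source you trust.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    done = []
    for archive in sorted(src.glob("*.tar")):
        with tarfile.open(archive, "r") as tar:
            tar.extractall(dest)
        done.append(archive.name)
    return done
```

For example, `extract_tar_chunks("downloads", "dataset")` would unpack every .tar file in a local `downloads` folder into `dataset`. Both folder names are placeholders.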
Hi @TeaPearce, I still have trouble downloading individual chunks of dataset_dm_scraped_dust2_tars. Is there a better solution? I vote for hosting on Hugging Face as well. Thank you!
I can't download files larger than 20 GB, according to the OneDrive setting below.
What's the best way to download the scraped dataset? I see this when following the link.