HighwayWu / LASTED

Synthetic Image Detection
MIT License

Request for an Alternative Download Method for the Training Dataset #4

Closed by Lemma1727 8 months ago

Lemma1727 commented 1 year ago

Hi!

I would like to express my interest in your project and my desire to download the training dataset. However, I have encountered a significant issue with the current download method via Baidu accounts.

The download speed I am experiencing is quite slow, averaging around 100 KB/s, which makes it impractical to obtain the dataset in a reasonable timeframe. It would take me more than a day to download a single dataset.

Considering the challenges I am facing with the current method, I wanted to kindly inquire if there might be an alternative means of making the training dataset available. Your assistance or guidance on this matter would be greatly appreciated.

Thanks!

HighwayWu commented 1 year ago

Thanks for your interest, and sorry for the inconvenience with the download method. We'll try to upload the "small" subsets of the training dataset (RealPhoto, SyntheticPhoto, SyntheticPainting, each ~10GB) to Google Drive in the next few days. For the largest subset, RealPainting (Danbooru2021, ~90GB), we suggest downloading directly from the source (https://www.gwern.net/Danbooru2021#download).

HighwayWu commented 1 year ago

The "RealPhoto", "SyntheticPhoto", and "SyntheticPainting" subsets of the training dataset (each ~10GB) can be downloaded from: https://drive.google.com/drive/folders/1lPuJjUpi5QwhkBlUaphRxC2HrBNnU8Pc?usp=sharing

Lemma1727 commented 1 year ago

I appreciate your prompt feedback. Downloading the datasets from Google Drive works fine, but I need a password to unzip them. Where can I find the password? Thank you.

HighwayWu commented 1 year ago

Please fill out this license form, and the password will be sent to you by email: https://docs.google.com/forms/d/1CZAIZEEugoGiTw8auyiU2LM8qjW0LelAWq2fuuzYAq8/

Woodyet commented 11 months ago

Hi, sorry, can you confirm which parts of the Danbooru2021 dataset you used? The full dataset is huge (1.9TB), but your subset is only 90GB.

HighwayWu commented 10 months ago

> Hi, sorry, can you confirm which parts of the Danbooru2021 dataset you used? The full dataset is huge (1.9TB), but your subset is only 90GB.

Sorry for the late reply. I randomly sampled a subset of images from Danbooru2021 rather than using the whole dataset.
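
For anyone who wants to build a comparable subset from a local copy of Danbooru2021, here is a minimal sketch of random sampling under a size budget. The thread does not say how the original sampling was done, so this is only one possible approach; the function name `sample_images`, the seed, the file extensions, and the byte budget are all assumptions made for illustration.

```python
import random
from pathlib import Path


def sample_images(root, max_bytes, seed=0, exts=(".jpg", ".jpeg", ".png")):
    """Pick a random subset of image files under `root` that fits in `max_bytes`.

    Hypothetical helper: not the procedure used by the repository authors.
    Returns the chosen paths and their combined size in bytes.
    """
    # Sort before shuffling so the result is reproducible for a given seed.
    paths = sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in exts
    )
    random.Random(seed).shuffle(paths)

    chosen, total = [], 0
    for path in paths:
        size = path.stat().st_size
        if total + size > max_bytes:
            continue  # this file would exceed the budget, try the next one
        chosen.append(path)
        total += size
    return chosen, total
```

Calling `sample_images("/path/to/danbooru2021", max_bytes=90 * 10**9)` would return roughly 90GB of randomly chosen images; the path here is a placeholder. Because the draw is random, the result will not reproduce the exact subset used in the project.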