Open unrealMJ opened 10 months ago
Also, the .parquet has 2.86M images, while mapping.json has 1M images, so the mapping seems to be a subset of the .parquet. Could you share some details about the .parquet? I think it is a subset of laion-5b; how did you obtain it?
Hi, @unrealMJ! Thank you for your interest. You may run python utils/download_data.py to download all images. The .parquet lists the images from Laion-Aesthetic; we provide it because our ordering differs from that of the original Laion-Aesthetic dataset, as mentioned in issue 4.
Hi, thanks for your reply. The Laion2b-en-aesthetic dataset on Hugging Face has 52.1M rows, but the .parquet you provided has only 2.86M rows. Could you explain the difference?
The .parquet we provide is a subset of Laion2b-en-aesthetic, obtained by keeping only the part with a higher aesthetic score.
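As a rough illustration of what selecting a subset by aesthetic score can look like, here is a minimal sketch. It is not the script used to build the .parquet: the column names (`url`, `aesthetic`), the sample rows, and the threshold value are all hypothetical, and plain Python dictionaries stand in for the real metadata table.

```python
# Hypothetical rows standing in for the metadata table.
# The real column names and values in the .parquet may differ.
rows = [
    {"url": "https://example.com/a.jpg", "aesthetic": 6.7},
    {"url": "https://example.com/b.jpg", "aesthetic": 4.9},
    {"url": "https://example.com/c.jpg", "aesthetic": 6.1},
]


def filter_by_aesthetic(rows, threshold):
    """Keep only the rows whose aesthetic score is at least `threshold`."""
    return [row for row in rows if row["aesthetic"] >= threshold]


# The threshold is an assumption chosen for this example only.
subset = filter_by_aesthetic(rows, threshold=6.0)
print(len(subset), "of", len(rows), "rows kept")
```

The same idea applies to a table loaded from a parquet file: keep the rows above a score cutoff and write the result back out, which would explain a row count far smaller than the 52.1M rows of the full dataset.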
Hi,
I have already downloaded the full laion-5b dataset. How can I use your .parquet and mapping file to get the corresponding images?