Closed: qiaogh97 closed this 3 years ago
Hi, where did you download the parquet from? http://the-eye.eu/public/AI/cah/laion400m-met-release/laion400m-meta/ has laion400m
If you downloaded from 3080.rom1504.fr you probably got a more recent version of the dataset that is indeed much bigger (and not really released yet)
Ah yes, I see I left that 3080 link in the readme, I need to fix it :)
Ok, I see
Hi @rom1504, I downloaded the 32 parquet files and counted the URLs. I find about 26,760,000 URLs in each parquet file, and 32 * 26,760,000 is about 856 million (roughly 800 million). But you said this dataset has 400M entries? So what is the difference?
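For anyone wanting to reproduce the count described above, here is a minimal sketch. It assumes `pyarrow` is installed, that the parquet files sit in one local directory, and that each row holds one URL. The function name `count_parquet_rows` and the directory argument are hypothetical, not part of any released tooling. It reads only the parquet footer metadata, so it does not load the data itself.

```python
from pathlib import Path


def count_parquet_rows(directory):
    """Sum the row counts of every *.parquet file in `directory`.

    Hypothetical helper: assumes one URL per row, so the row count
    stands in for the URL count. Only footer metadata is read.
    """
    # pyarrow is a third-party dependency; imported here so the
    # arithmetic check below still runs without it.
    import pyarrow.parquet as pq

    total = 0
    for path in sorted(Path(directory).glob("*.parquet")):
        total += pq.ParquetFile(str(path)).metadata.num_rows
    return total


# Arithmetic check using the figures quoted in the post.
files = 32
urls_per_file = 26_760_000  # approximate per-file count from the post
estimated_total = files * urls_per_file
print(f"{estimated_total:,}")  # 856,320,000
```

Calling `count_parquet_rows("path/to/parquet/dir")` on the downloaded files should give the exact total, which can then be compared against the expected 400M to tell which version of the dataset was downloaded.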