EPFL-VILAB / omnidata

A Scalable Pipeline for Making Steerable Multi-Task Mid-Level Vision Datasets from 3D Scans [ICCV 2021]

Possible Corrupted Taskonomy Image #7

Closed CJMenart closed 2 years ago

CJMenart commented 2 years ago

libpng is complaining that the checksum is wrong for this image from the taskonomy (tiny) subset:

rgb/taskonomy/newfields/point_1070_view_8_domain_rgb.png

Process: I downloaded the rgb images for the subset described above using omnitools.download and tried to open them with libpng, which fails with "libpng error: IDAT: CRC error" on the file above. I assumed that I had corrupted the file myself, so I deleted and re-downloaded the subset, but I got the same error on the same image the next time around.
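For anyone who wants to reproduce the check without libpng, here is a minimal sketch that walks the chunks of a PNG and compares each stored CRC against one recomputed with `zlib.crc32`. It relies only on the PNG chunk layout (4-byte length, 4-byte type, data, 4-byte CRC over type plus data); the function name `find_bad_png_chunks` is my own and is not part of omnitools.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_bad_png_chunks(data: bytes) -> list[str]:
    """Return the types of all chunks whose stored CRC does not match.

    Hypothetical helper (not part of omnitools or libpng). An empty list
    means every chunk that could be read passed its CRC check.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")

    bad = []
    pos = 8  # skip the 8-byte signature
    while pos + 8 <= len(data):
        # Each chunk starts with a big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body_end = pos + 8 + length
        name = chunk_type.decode("latin-1")
        if body_end + 4 > len(data):
            # The file ends before the chunk data and CRC do.
            bad.append(name + " (truncated)")
            break
        stored = struct.unpack(">I", data[body_end:body_end + 4])[0]
        # The CRC covers the chunk type and the chunk data, not the length.
        computed = zlib.crc32(data[pos + 4:body_end]) & 0xFFFFFFFF
        if stored != computed:
            bad.append(name)
        pos = body_end + 4
        if chunk_type == b"IEND":
            break
    return bad
```

Calling it as `find_bad_png_chunks(open(path, "rb").read())` on the file above should list `IDAT` if the error libpng reports comes from the file contents rather than from the local setup.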

alexsax commented 2 years ago

Hi! Yes, there are probably some corrupted images from the original Taskonomy paper.

TLDR: I think you can safely ignore/remove them. The provided dataloaders should automatically filter out (building,point,view) triplets that don't have labels in whatever modalities you're using.
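As a rough illustration of that filtering idea (this is not the actual omnidata dataloader code), the sketch below keeps only the (building, point, view) triplets that have a usable file in every requested modality. It assumes the path layout seen in the file name above, `<modality>/taskonomy/<building>/point_<p>_view_<v>_domain_<modality>.png`; the function name, the `corrupted` argument, and the `normal` modality used in the usage example are assumptions for illustration.

```python
import re
from pathlib import PurePosixPath

# File name pattern inferred from the single path quoted in this issue.
_NAME = re.compile(r"point_(\d+)_view_(\d+)_domain_(\w+)\.png$")


def triplets_with_all_modalities(paths, modalities, corrupted=()):
    """Return the (building, point, view) triplets present in every modality.

    Hypothetical helper: `paths` are relative file paths, `modalities` are
    the domain names in use, and `corrupted` lists paths to treat as missing.
    """
    corrupted = set(corrupted)
    seen = {m: set() for m in modalities}
    for p in paths:
        if p in corrupted:
            continue  # a corrupted file counts as a missing label
        match = _NAME.search(p)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        point, view, domain = match.groups()
        if domain in seen:
            building = PurePosixPath(p).parent.name
            seen[domain].add((building, int(point), int(view)))
    # A triplet survives only if every requested modality has it.
    return set.intersection(*seen.values()) if seen else set()
```

For example, passing the corrupted rgb path in `corrupted` drops its triplet even when the other modalities for that view are intact:

```python
paths = [
    "rgb/taskonomy/newfields/point_1070_view_8_domain_rgb.png",
    "normal/taskonomy/newfields/point_1070_view_8_domain_normal.png",
    "rgb/taskonomy/newfields/point_1_view_0_domain_rgb.png",
    "normal/taskonomy/newfields/point_1_view_0_domain_normal.png",
]
keep = triplets_with_all_modalities(paths, ["rgb", "normal"], corrupted=[paths[0]])
```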

Long answer: Yes, this means that people will sometimes be using slightly different splits of the starter data depending on the modalities they choose, but the difference is usually something like 0.05% of images, so pretty small. We haven't quantified the impact of the slightly different splits, but since the corruptions hit random images/modalities rather than occurring systematically, the impact should be pretty unbiased, and you could (more than) compensate by just using a larger split.

CJMenart commented 2 years ago

Good enough!