Closed CJMenart closed 2 years ago
Hi! Yes, there are probably some corrupted images from the original Taskonomy paper.
TLDR: I think you can safely ignore/remove them. The provided dataloaders should automatically filter out (building,point,view) triplets that don't have labels in whatever modalities you're using.
Long answer: Yes, this means that people will sometimes be using slightly different splits of the starter data depending on the modalities they choose, but the difference is usually something like 0.05% of images, so pretty small. We haven't quantified the impact of the slightly different splits, but since the corruptions hit random images/modalities rather than occurring systematically, the impact should be pretty unbiased, and you could (more than) compensate by just using a larger split.
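For anyone who wants to do the filtering by hand, here is a minimal sketch of the idea: keep only the (building, point, view) triplets that are present in every modality you use. This is not the code from the provided dataloaders. The helper names `parse_triplet` and `usable_triplets` are hypothetical, the `normal` modality in the usage note is just an illustrative name, and the path layout is assumed from the single path quoted in this thread (`rgb/taskonomy/newfields/point_1070_view_8_domain_rgb.png`).

```python
import re
from pathlib import PurePosixPath

# Assumed filename pattern, inferred from the one path quoted in this thread.
_NAME = re.compile(r"point_(\d+)_view_(\d+)_domain_(\w+)\.png$")


def parse_triplet(path):
    """Return (building, point, view) for a path, or None if it does not match."""
    p = PurePosixPath(path)
    m = _NAME.match(p.name)
    if m is None:
        return None
    # Assumption: the building name is the directory that contains the file.
    return (p.parent.name, int(m.group(1)), int(m.group(2)))


def usable_triplets(paths_by_modality, corrupted=()):
    """Intersect triplets across modalities, skipping paths listed as corrupted.

    paths_by_modality: dict mapping a modality name to an iterable of paths.
    corrupted: iterable of paths known to be unreadable.
    """
    bad = set(corrupted)
    per_modality = []
    for paths in paths_by_modality.values():
        keys = {parse_triplet(p) for p in paths if p not in bad}
        keys.discard(None)  # drop files that did not match the pattern
        per_modality.append(keys)
    return set.intersection(*per_modality) if per_modality else set()
```

Passing the corrupted RGB path in `corrupted` drops that triplet for every modality, which is why the effective split can differ slightly depending on which modalities you load.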
Good enough!
libpng is complaining that the checksum is wrong for this image from the taskonomy (tiny) subset:
rgb/taskonomy/newfields/point_1070_view_8_domain_rgb.png
Process: I downloaded the rgb of the section described above using omnitools.download and tried to open the images using libpng, resulting in the "libpng error: IDAT: CRC error" for the above file. I assumed that I had corrupted the file myself, so I deleted and re-downloaded the segment, but I got the same error on the same image the next time around.