Open FabHan opened 8 years ago
Wow, that's painful! I didn't realize or encounter this, because preprocessing is done in Python. I'm using torch load for some files (e.g. mscoco test set files), but there it seemed to work fine, probably no corruptions.
Ouch, that's painful indeed. I can fix up the Torch image loader in the next 7 days (ironically, the COCO deadline is tonight). We already have the Torch loader do this for in-memory images (i.e. look at the header rather than the file extension); we just haven't done it for disk images.
@FabHan Can you use ImageMagick to convert all the PNG images to JPEGs, and then do the work?
I don't know which images in the dataset are PNGs. I could certainly write a script to find and convert them, but I just haven't done it yet.
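A minimal sketch of the "find" half of such a script, assuming the only problem is PNG data stored under a .jpg or .jpeg name. The function name `find_mislabeled_pngs` is hypothetical, and only the Python standard library is used:

```python
import os

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_mislabeled_pngs(root):
    """Yield paths under `root` that are named like JPEGs but hold PNG data.

    Hypothetical helper: it only reads the first 8 bytes of each file,
    so it does not detect truncated or otherwise corrupt images.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith((".jpg", ".jpeg")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                if f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE:
                    yield path
```

The paths it yields can then be passed to whichever converter you prefer (ImageMagick was suggested above) to re-encode them as real JPEGs.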
Please try my image checker at https://github.com/CDLuminate/cocofetch.
When downloading the COCO dataset, I found that Flickr just returns a PNG (saying "picture not available anymore") instead of the original JPG when the picture is no longer valid. Download the images from mscoco.org directly instead. My scripts may help you do that.
See the file check_jpeg.py.
I tried to run your pretrained model on COCO validation set. It didn't work and I figured out that some images in COCO are png, although they have .jpg extension. This confuses Torch's image reader: "not a JPEG file". OpenCV doesn't have this problem because it detects the image format using the header, not the filename. Did you encounter this when working on COCO data? If so, what processing did you do?