idealo / image-super-resolution

🔎 Super-scale your images and run experiments with Residual Dense and Adversarial Networks.
https://idealo.github.io/image-super-resolution/
Apache License 2.0
4.6k stars, 758 forks

training fails with non-RGB images #66

Open MCOfficer opened 5 years ago

MCOfficer commented 5 years ago

Note: according to CONTRIBUTING, there should be issue templates, but I don't see any. Just FYI.

This seems to happen to me at random while reading different images. The error is as follows:

Traceback (most recent call last):
  File "/home/florian/Git/Andromeda/train.py", line 56, in <module>
    monitored_metrics={"val_PSNR_Y": "max"}
  File "/home/florian/Git/Andromeda/venv/lib/python3.7/site-packages/ISR/train/trainer.py", line 322, in train
    batch = self.train_dh.get_batch(batch_size, flatness=flatness)
  File "/home/florian/Git/Andromeda/venv/lib/python3.7/site-packages/ISR/utils/datahandler.py", line 181, in get_batch
    batch = self._crop_imgs(img, batch_size, flatness)
  File "/home/florian/Git/Andromeda/venv/lib/python3.7/site-packages/ISR/utils/datahandler.py", line 107, in _crop_imgs
    candidate_crop = imgs['lr'][s['x'][0] : s['x'][1], s['y'][0] : s['y'][1], slice(None)]
IndexError: too many indices for array

I traced this back to here, where imread usually returns 3D arrays but sometimes returns 2D ones, and those are the times it blows up: https://github.com/idealo/image-super-resolution/blob/98a4875c74818479d2577f9a3b2c403458dfc941/ISR/utils/datahandler.py#L172-L174
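For anyone trying to reproduce this without the dataset, here is a minimal sketch of the failure mode. It assumes only what the traceback shows: the crop indexes the array with three indices. The arrays below are synthetic stand-ins for what an image reader returns, and `crop` is a hypothetical helper, not ISR code.

```python
import numpy as np

# Stand-ins for loaded images: colour images come back as
# (height, width, channels), grayscale images as (height, width).
rgb = np.zeros((8, 8, 3), dtype=np.uint8)
gray = np.zeros((8, 8), dtype=np.uint8)


def crop(img, x, y):
    # Same three-index pattern as the failing line in the traceback.
    return img[x[0]:x[1], y[0]:y[1], slice(None)]


# A 3D array can be cropped without trouble.
rgb_crop_shape = crop(rgb, (0, 4), (0, 4)).shape

# A 2D array has no third axis, so the same indexing raises IndexError.
try:
    crop(gray, (0, 4), (0, 4))
    gray_error = None
except IndexError as err:
    gray_error = err

print(rgb_crop_shape)
print(type(gray_error).__name__)
```

Checking `img.ndim` right after loading should show whether a given file is one of the 2D cases.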

When it happens, both img['lr'] and img['hr'] are affected. I checked about a dozen different images before giving up on finding any sort of correlation: they are all perfectly fine, square images. To me, it appears to be happening at random.

ISR=2.1 imageio=2.5.0 numpy=1.16.4 Pillow=6.1.0 (in case this is important)

If you need any other info, let me know.

cfrancesco commented 4 years ago

Are the images you're loading all RGB (3 channels)? I have not seen this before.

And thanks for the suggestion, there should definitely be an issue template.

MCOfficer commented 4 years ago

I'm not on my dev machine at the moment, so I can't say that they're all exactly RGB, but there definitely were some coloured ones, so they're not all B/W, if that's what you're thinking. I can upload some samples later, but as I said, I couldn't find any correlation, to the point where no two errors are from the same image.

The only correlation is that every time (at least when I debugged it for a closer look), both img["lr"] and img["hr"] were two-dimensional, even though they're different files.

MCOfficer commented 4 years ago

I had another look at this today, and it turns out you were right all along. The images in question were all black and white!

I'm not sure why I didn't notice this last time. My images are numbered, so my best guess is that I messed up and examined 0989.jpg instead of 0898.jpg or something like that. Sorry for the mess.

I'm now a couple of minutes into training after removing all black and white images. Side note: it took me 20 minutes to figure out this command, so in case anyone has the same issue, do not go down the path of find -exec, use xargs:

find *.jpg | xargs identify | awk '{print $1, $6}' | grep Gray | awk '{print $1}' | xargs rm

It's not particularly elegant, and it will also delete any image with "Gray" in its file name, but other than that, it gets the job done.

Anyway, this issue seems to be resolved, but I still feel like ISR should handle this case somehow. Perhaps a simple warning like "Encountered non-RGB image, skipping" would suffice?
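To make the suggestion concrete, here is a rough sketch of what such a skip-with-warning check could look like. It is not ISR's implementation. `is_rgb` and `keep_rgb` are hypothetical names, and the check works on the `.shape` tuples of already loaded arrays, so it needs nothing beyond the standard library.

```python
import warnings


def is_rgb(shape):
    """Return True for a (height, width, 3) array shape."""
    return len(shape) == 3 and shape[2] == 3


def keep_rgb(shapes_by_name):
    """Return the names whose shape is RGB, warning about the rest.

    `shapes_by_name` maps a file name to the `.shape` of the loaded array.
    """
    kept = []
    for name, shape in shapes_by_name.items():
        if is_rgb(shape):
            kept.append(name)
        else:
            warnings.warn(
                f"Encountered non-RGB image, skipping: {name} (shape {shape})"
            )
    return kept


# Hypothetical file names and shapes: grayscale, RGB, and RGBA.
shapes = {
    "0898.jpg": (64, 64),
    "0899.jpg": (64, 64, 3),
    "0900.png": (64, 64, 4),
}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = keep_rgb(shapes)

print(kept)
print(len(caught))
```

Running a check like this once over the training folder before training would surface the problem files up front instead of at a random batch.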

cfrancesco commented 4 years ago

@MCOfficer you are correct, there should be a channel consistency check for the training set. At the moment the only checks are at prediction time. Thank you for your suggestion!

MCOfficer commented 4 years ago

One more suggestion: I recently happened to train with .png images, which can hold four channels. Perhaps you should also check for those and strip away the alpha channel as needed.
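As an illustration of the alpha case, a minimal sketch assuming the image is already loaded as a NumPy array of shape (height, width, 4). `strip_alpha` is a hypothetical helper, not part of ISR, and it simply discards transparency rather than compositing onto a background.

```python
import numpy as np


def strip_alpha(img):
    """Drop the fourth (alpha) channel of an RGBA array.

    Arrays that are not (height, width, 4) are returned unchanged.
    """
    if img.ndim == 3 and img.shape[2] == 4:
        return img[:, :, :3]
    return img


# Synthetic RGBA image: red channel set to 200, fully opaque alpha.
rgba = np.zeros((8, 8, 4), dtype=np.uint8)
rgba[..., 0] = 200
rgba[..., 3] = 255

rgb = strip_alpha(rgba)
print(rgb.shape)
```

Whether dropping alpha is acceptable depends on the data. Images with meaningful transparency may need to be blended onto a background colour first.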