idealo / imagededup

😎 Finding duplicate images made easy!
https://idealo.github.io/imagededup/
Apache License 2.0
5.18k stars, 459 forks

WARNING: Invalid Image Filename in a directory with writing access. #143

Closed: tanaymeh closed this issue 3 years ago

tanaymeh commented 3 years ago

I installed this package on Kaggle (in a GPU-enabled session) using pip install imagededup. After the install completed, I ran the following code:

from imagededup.methods import CNN

cnn = CNN()
encodings = cnn.encode_images(image_dir="/kaggle/working/")
duplicates = cnn.find_duplicates(encoding_map=encodings)

Note: I read in a closed issue that the above snippet throws an error if we don't have write access to the provided image_dir, so I moved all of the dataset's images to the /kaggle/working/ folder, which has write access.

Running this gave the following output:

[Image: err (screenshot of the output; not preserved)]

Furthermore, I then opened a random image from the same directory using both OpenCV and PIL, and it loaded fine in both.

I have tried moving the images to a different directory, reading them as-is (i.e. from /kaggle/input/dataset), etc., but none of these seems to work. I also downgraded imagededup to version 0.2.2, but the error remains.

tanujjain commented 3 years ago

@heytanay Apologies for the delayed response. It's strange that you are able to read images using PIL but not through imagededup, since imagededup uses PIL underneath: https://github.com/idealo/imagededup/blob/3465540cc5c8fdf9254aff76069e28641dfc515f/imagededup/utils/image_utils.py#L125

Could you please check again?

tanaymeh commented 3 years ago

@tanujjain Apologies, I forgot to update this issue too. The code now works (without me needing to move the images to the /kaggle/working directory), and the only thing that I believe made the difference was a trailing forward slash ("/").

Previously, the data path I was using was this: ../input/shopee-product-matching/train_images/

I then changed it to this: ../input/shopee-product-matching/train_images

The code now works fine; here's a link to the working version.

I suppose something causes the package to throw this error when the image path string ends with a trailing /.
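As a workaround for the behaviour described above, the trailing separator can be removed before the path is handed to the library. The following is only a minimal sketch: strip_trailing_slash is a hypothetical helper, not part of imagededup, and it assumes POSIX-style paths such as the ones used on Kaggle. It does not explain why the trailing slash triggers the warning; it only avoids passing one.

```python
def strip_trailing_slash(image_dir: str) -> str:
    """Return image_dir without trailing "/" characters.

    Hypothetical helper for illustration; not part of imagededup.
    """
    stripped = image_dir.rstrip("/")
    # Keep the filesystem root intact: "/" must not become "".
    return stripped or "/"


if __name__ == "__main__":
    image_dir = strip_trailing_slash(
        "../input/shopee-product-matching/train_images/"
    )
    print(image_dir)
    # The cleaned path would then be used as in the original snippet, e.g.:
    # encodings = cnn.encode_images(image_dir=image_dir)
```

Stripping the character by hand, rather than calling os.path.normpath, leaves the rest of the path (including the leading ..) exactly as written.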