rhsimplex / image-match

🎇 Quickly search over billions of images

Strange problem searching #64

Open DanielFerreiraJorge opened 7 years ago

DanielFerreiraJorge commented 7 years ago

Hello, I have a weird problem. I have two images, A and B (B is a larger, higher-resolution version of A). If I use ImageSignature to calculate the normalized distance between the two, I get 0.314299892917, which is pretty good and shows that they are a match.

Now, here is the problem: if I add image A to Elasticsearch using ses.add_image('A.jpg') and then call ses.search_image('B.jpg'), I get no results. I tried raising distance_cutoff to 0.99 and got a bunch of results, BUT they did not include A.jpg, and every result in this scenario had a distance of at least 0.60. I KNOW image A is there, because if I call ses.search_image('A.jpg') I get a perfect match.

I attached the images (b and a).
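For reference, the normalized distance mentioned above can be sketched in plain Python. This is a minimal illustration, assuming the distance is the norm of the difference divided by the sum of the two norms; the function name and the list-based signatures are illustrative, not the library's own code.

```python
import math


def normalized_distance(a, b):
    """Return ||a - b|| / (||a|| + ||b||) for two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    norm_diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    denom = norm_a + norm_b
    # Two all-zero signatures are identical; avoid dividing by zero.
    return norm_diff / denom if denom else 0.0
```

Under this definition, identical signatures give 0.0 and exactly opposite signatures give 1.0, so a value such as 0.31 sits much closer to "same image" than a value above 0.60.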

rhsimplex commented 7 years ago

That is strange. I'm on vacation until Apr 5, but please share any other information or if you figure out what the problem is.

Otherwise, I will look into it when I'm back.

fabito commented 7 years ago

Could you create a failing test case reproducing the issue and open a PR?

ecdeveloper commented 7 years ago

I'm noticing a similar issue. I have two almost identical images (attached to this message). After indexing one of them and searching with the second, I get a match with a distance of 0.43 and an Elasticsearch score of 4.0. I usually get similar numbers for completely different images. Thoughts?

Indexed image: original

Image I search by:

screen1

SthPhoenix commented 7 years ago

Hi! I just got stuck on the same issue, but it looks like I found a solution. In my case I'm indexing photos unmodified, as taken by the camera, and searching with photoshopped versions (tweaked histogram and slightly changed white balance), and I get a distance of 0.69. BUT if I do the following:

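import cv2
from image_match.goldberg import ImageSignature
gis = ImageSignature()
img1 = cv2.imread(path1)  # path1 and path2 are the paths to the two images
img2 = cv2.imread(path2)
sig1 = gis.generate_signature(img1)
sig2 = gis.generate_signature(img2)
print(gis.normalized_distance(sig1, sig2))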

I get a distance of 0.09. It looks like the cause of this bug might be different color spaces in the source images, or something like that.

BTW: I got exactly the same issue using the imagehash library: hashes of images opened by PIL are completely different, but if I open the image with cv2 and then convert it to PIL, it gives exactly the same hashes.

SthPhoenix commented 7 years ago

Addition to my previous post: it looks like, in my case, the unmodified pictures from the camera were always stored horizontally, and OpenCV handled the rotation by itself. Details here

rhsimplex commented 7 years ago

Hey @SthPhoenix, does that mean you don't think it's a color space issue (i.e. it is completely explained by the rotation)?

SthPhoenix commented 7 years ago

Yes, it seems so ) I tried saving the image in Adobe RGB without rotation, and it gives a distance of zero against a slightly edited picture in sRGB. It looks like, in my case, PIL for some reason doesn't handle EXIF rotation by itself.
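To illustrate why an unapplied EXIF orientation changes the result so much, here is a small dependency-free sketch. It treats an image as a nested list of pixel values and applies the rotation implied by EXIF orientation values 1, 3, 6 and 8 (the same values handled in the snippet later in this thread). The function names are hypothetical, and mirrored orientations are deliberately left out.

```python
def rotate_cw(grid):
    """Rotate a row-major grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def apply_exif_orientation(grid, orientation):
    """Return the grid as it should be displayed for EXIF orientation 1, 3, 6 or 8."""
    # Number of clockwise quarter turns needed to display the image upright.
    turns = {1: 0, 3: 2, 6: 1, 8: 3}.get(orientation)
    if turns is None:
        raise ValueError("mirrored or unknown orientation not handled in this sketch")
    for _ in range(turns):
        grid = rotate_cw(grid)
    return [list(row) for row in grid]


stored = [[1, 2, 3],
          [4, 5, 6]]
upright = apply_exif_orientation(stored, 6)
```

A loader that ignores the tag computes its signature from `stored`, while one that honors it uses `upright`; the two grids even have different shapes here, so a large distance between them is not surprising. With Pillow 6.0 or later, `ImageOps.exif_transpose` should perform the equivalent correction on a real image.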

rhsimplex commented 7 years ago

Any suggestions, then? Should we switch to cv2?

fabito commented 7 years ago

My 2 cents,

Since OpenCV 3.1, the EXIF orientation is used to rotate images properly. imread should output an image rotated according to the EXIF orientation.

In the case of color images, the decoded images (imread) will have the channels stored in B G R order.

Hope this info is useful
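The channel-order point matters if cv2-loaded images are passed to code that expects R G B. A minimal sketch, using nested lists of pixel tuples to stay dependency-free (the function name is hypothetical):

```python
def bgr_to_rgb(pixels):
    """Reverse the channel order of every pixel in a rows-of-pixels image."""
    return [[tuple(reversed(px)) for px in row] for row in pixels]


# One pure-blue pixel as OpenCV would store it (B, G, R) ...
bgr_image = [[(255, 0, 0)]]
# ... becomes (R, G, B) after the swap.
rgb_image = bgr_to_rgb(bgr_image)
```

On a NumPy array as returned by imread, the same swap is commonly written as `img[..., ::-1]` or `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.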

SthPhoenix commented 7 years ago

OpenCV might be a bit of overkill, especially since it is harder to install than PIL ) I think code like this (SO post):

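    import os
    import traceback
    from PIL import Image, ExifTags

    # path, fileName, THUMB_WIDTH and THUMB_HIGHT are assumed to be defined by the caller
    try:
        image = Image.open(os.path.join(path, fileName))
        # Look up the numeric EXIF tag id for 'Orientation'
        for orientation in ExifTags.TAGS.keys():
            if ExifTags.TAGS[orientation] == 'Orientation':
                break
        exif = dict(image._getexif().items())

        # Rotate the pixels so that the image is stored upright
        if exif[orientation] == 3:
            image = image.rotate(180, expand=True)
        elif exif[orientation] == 6:
            image = image.rotate(270, expand=True)
        elif exif[orientation] == 8:
            image = image.rotate(90, expand=True)

        # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
        image.thumbnail((THUMB_WIDTH, THUMB_HIGHT), Image.LANCZOS)
        image.save(os.path.join(path, fileName))

    except Exception:
        # e.g. images without EXIF data or without an Orientation tag
        traceback.print_exc()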

might help without injecting new dependencies

rhsimplex commented 7 years ago

Well I think you know what I'm going to ask for next, @SthPhoenix =)

SthPhoenix commented 7 years ago

I think I do ) Sure, I'll try to figure out how to implement this )

SthPhoenix commented 7 years ago

@rhsimplex, could you try generating a signature for the image I posted in the imagehash issue? For some reason, if I use gis.generate_signature('path/to/image.jpg') I get a weird error:

Traceback (most recent call last):
  File "/codes/ImSearch/imageMeta.py", line 153, in <module>
    s1 = gis.generate_signature(im1)
  File "/usr/local/lib/python2.7/dist-packages/goldberg.py", line 166, in generate_signature
    fix_ratio=self.fix_ratio)
  File "/usr/local/lib/python2.7/dist-packages/image_match/goldberg.py", line 287, in crop_image
    rw = np.cumsum(np.sum(np.abs(np.diff(image, axis=1)), axis=1))
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1761, in diff
    slice1[axis] = slice(1, None)
IndexError: list assignment index out of range

Sorry for the off-topic question; I'm not sure whether it's an image-match issue or something on my side related to my environment.

rhsimplex commented 7 years ago

I get the same error:

In [3]: im.generate_signature('https://cloud.githubusercontent.com/assets/17834919/25771654/4aa9732c-3260-11e7-8370-8fcc62418
   ...: 677.JPG')
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-01518c70844d> in <module>()
----> 1 im.generate_signature('https://cloud.githubusercontent.com/assets/17834919/25771654/4aa9732c-3260-11e7-8370-8fcc62418
677.JPG')

/home/ryan/image-match/image_match/goldberg.py in generate_signature(self, path_or_image, bytestream)
    164                                            lower_percentile=self.lower_percentile,
    165                                            upper_percentile=self.upper_percentile,
--> 166                                            fix_ratio=self.fix_ratio)
    167         else:
    168             image_limits = None

/home/ryan/image-match/image_match/goldberg.py in crop_image(image, lower_percentile, upper_percentile, fix_ratio)
    282         """
    283         # row-wise differences
--> 284         rw = np.cumsum(np.sum(np.abs(np.diff(image, axis=1)), axis=1))
    285         # column-wise differences
    286         cw = np.cumsum(np.sum(np.abs(np.diff(image, axis=0)), axis=0))

/home/ryan/image-match/env/lib/python3.5/site-packages/numpy/lib/function_base.py in diff(a, n, axis)
   1759     slice1 = [slice(None)]*nd
   1760     slice2 = [slice(None)]*nd
-> 1761     slice1[axis] = slice(1, None)
   1762     slice2[axis] = slice(None, -1)
   1763     slice1 = tuple(slice1)

IndexError: list assignment index out of range
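One possible reading of this traceback, not verified against the image itself: the statements shown in the numpy frame fail exactly this way when the array has fewer dimensions than the requested axis, which would mean crop_image received something that is not a 2-D grayscale array. The sketch below reproduces those statements in plain Python; `make_diff_slices` is a hypothetical helper name, and `nd` stands for the number of array dimensions.

```python
def make_diff_slices(nd, axis):
    """Mimic the slice setup shown in the numpy frame of the traceback."""
    slice1 = [slice(None)] * nd
    slice2 = [slice(None)] * nd
    slice1[axis] = slice(1, None)   # raises IndexError when axis >= nd
    slice2[axis] = slice(None, -1)
    return tuple(slice1), tuple(slice2)


# A 2-D (grayscale) array works for axis=1 ...
ok_slices = make_diff_slices(nd=2, axis=1)

# ... but an array with fewer than two dimensions does not.
try:
    make_diff_slices(nd=1, axis=1)
    error_message = None
except IndexError as exc:
    error_message = str(exc)
```

If that reading is right, printing the `ndim` and `shape` of the array just before the crop step would confirm it.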