rhsimplex / image-match

🎇 Quickly search over billions of images

Signature generator fails on seemingly normal image #76

Open rhsimplex opened 7 years ago

rhsimplex commented 7 years ago

Causes an error (see #64).

rhsimplex commented 7 years ago

@SthPhoenix maybe this is not a real JPEG? Usually the shape property is rows x columns x color depth, but here it is not:

In [1]: from skimage.io import imread

In [2]: r = imread('https://cloud.githubusercontent.com/assets/17834919/25771654/4aa9732c-3260-11e7-8370-8fcc62418677.JPG')

In [3]: r.shape
Out[3]: (3,)

In [4]: r[0].shape
Out[4]: (4000, 6000, 3)

In [5]: r[1].shape
Out[5]: ()

In [6]: r[2].shape
Out[6]: ()
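
A shape of (3,) suggests that imread returned a one-dimensional object array whose first element holds the actual pixel data. A minimal sketch of a guard for that case follows, assuming only NumPy; `first_image_like` is a hypothetical helper, not part of image-match or skimage.

```python
import numpy as np


def first_image_like(arr):
    """Return `arr` if it is already image-shaped, else its first image-shaped element.

    Hypothetical helper for illustration; "image-shaped" here just means
    at least two dimensions (rows x columns, optionally x channels).
    """
    arr = np.asarray(arr)
    if arr.ndim >= 2:
        return arr
    # 1-D (or 0-D) container: look inside for something with pixel dimensions.
    for item in arr.ravel():
        candidate = np.asarray(item)
        if candidate.ndim >= 2:
            return candidate
    raise ValueError("no image-shaped element found")
```

For the image above, the element this returns would still be RGB with values 0-255, so it would need greyscaling before any further processing.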

SthPhoenix commented 7 years ago

This image was taken with a Nikon D5300. I have tested other images taken with the same camera, with the same result. Interestingly, though, the NEF (RAW) version of this image works fine.

SthPhoenix commented 7 years ago

Trying a dirty fix like:

if len(image.shape) < 2:
    image = image[0]

in preprocess_image() in goldberg.py gives no results. The program runs without errors, but emits these warnings:

/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2889: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/usr/local/lib/python2.7/dist-packages/numpy/core/_methods.py:80: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:565: RuntimeWarning: invalid value encountered in less
  mask = np.abs(difference_array) < identical_tolerance
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:573: RuntimeWarning: invalid value encountered in greater
  positive_cutoffs = np.percentile(difference_array[difference_array > 0.],
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:575: RuntimeWarning: invalid value encountered in less
  negative_cutoffs = np.percentile(difference_array[difference_array < 0.],
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:581: RuntimeWarning: invalid value encountered in greater_equal
  (difference_array <= interval[1])] = level + 1
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:581: RuntimeWarning: invalid value encountered in less_equal
  (difference_array <= interval[1])] = level + 1
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:586: RuntimeWarning: invalid value encountered in less_equal
  (difference_array >= interval[1])] = -(level + 1)
/home/SthPhoenix/codes/ImSearch/image_match/goldberg.py:586: RuntimeWarning: invalid value encountered in greater_equal
  (difference_array >= interval[1])] = -(level + 1)

The resulting distance is in the range 0.7-0.9, depending on the rotation of the second test image.
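
The first warning is the one NumPy emits when asked for the mean of a zero-length array. The result is NaN, which is consistent with the "invalid value" warnings that follow. The sketch below reproduces only that NumPy behaviour in isolation and does not trace where the empty slice arises inside goldberg.py; `mean_with_warnings` is a hypothetical name.

```python
import warnings

import numpy as np


def mean_with_warnings(values):
    """Return (mean, warning messages) for `values`, capturing NumPy's warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        result = np.mean(np.asarray(values, dtype=float))
    return result, [str(w.message) for w in caught]


mean, messages = mean_with_warnings([])
print(mean)      # nan
print(messages)  # includes 'Mean of empty slice.'
```

Because no exception is raised, a check such as `np.isnan(...)` on the intermediate values is one way to turn this silent failure into an explicit one.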

SthPhoenix commented 7 years ago

Also, interestingly enough, the array image[0] from the original image and the array from a copy of it saved in Photoshop differ drastically. In the first case:

[[[ 42  43  45]
  [ 40  41  43]
  [ 39  40  42]
  ..., 
  [ 82 107 164]
  [ 82 107 164]
  [ 81 106 163]]

 [[ 41  42  44]
  [ 39  40  42]
  [ 38  39  41]
  ..., 
  [ 82 107 164]
  [ 82 107 164]
  [ 81 106 163]]

 [[ 40  41  43]
  [ 39  40  42]
  [ 38  39  41]
  ..., 
  [ 82 107 164]
  [ 82 107 164]
  [ 81 106 163]]

 ..., 
 [[ 87  89  42]
  [ 78  80  33]
  [ 77  78  34]
  ..., 
  [ 85 111 172]
  [ 85 111 172]
  [ 86 112 173]]

 [[ 91  93  43]
  [ 80  82  33]
  [ 76  78  29]
  ..., 
  [ 85 111 172]
  [ 86 112 173]
  [ 86 112 173]]

 [[ 93  96  43]
  [ 83  85  35]
  [ 78  80  30]
  ..., 
  [ 85 111 172]
  [ 86 112 173]
  [ 86 112 173]]]

In second case:


[[ 0.16835961  0.16051647  0.1565949  ...,  0.41489098  0.41489098
   0.41096941]
 [ 0.16443804  0.1565949   0.15267333 ...,  0.41489098  0.41489098
   0.41096941]
 [ 0.16051647  0.1565949   0.15267333 ...,  0.41489098  0.41489098
   0.41096941]
 ..., 
 [ 0.33406392  0.2987698   0.2965298  ...,  0.4308749   0.43479647
   0.43479647]
 [ 0.34498039  0.30604745  0.29036118 ...,  0.4308749   0.43479647
   0.43871804]
 [ 0.3589851   0.31360784  0.29820431 ...,  0.4308749   0.43479647
   0.43871804]]

SthPhoenix commented 7 years ago

BTW, are there any specific reasons to use skimage.imread in

elif type(image_or_path) is str:
    image = imread(image_or_path, as_grey=True)

beside it accepting urls as parameter?

rhsimplex commented 7 years ago

Looks like RGB (dimensions l x w x 3, values 0-255) vs. greyscale (dimensions l x w, values 0.0-1.0). image-match converts every image to greyscale anyway.
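
The two dumps may in fact describe the same picture. As a rough check, assuming the luminance weights 0.2125, 0.7154 and 0.0721 (the ones skimage documents for rgb2gray), the first RGB pixels printed above can be converted and compared with the first greyscale values printed above. `to_grey` is a hypothetical helper, not image-match code.

```python
import numpy as np

# Luminance weights; assumed here to match skimage's rgb2gray.
WEIGHTS = np.array([0.2125, 0.7154, 0.0721])


def to_grey(rgb):
    """Convert a (rows, cols, 3) uint8 RGB array to (rows, cols) floats in [0, 1]."""
    rgb = np.asarray(rgb, dtype=np.float64) / 255.0
    return rgb @ WEIGHTS  # weighted sum over the colour axis


# First three pixels of the first row of image[0] from the dump above.
row = np.array([[[42, 43, 45], [40, 41, 43], [39, 40, 42]]], dtype=np.uint8)
print(to_grey(row))
```

The printed values agree with 0.16835961, 0.16051647 and 0.1565949 from the second dump to the precision shown there.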

> BTW, are there any specific reasons to use skimage.imread in...beside it accepting urls as parameter?

Nope, you got it. The image loader is kind of a mess; we hastily improved it when doing our image crawl, which is why it supports obscure formats like MPO. Anyway, it could definitely use some cleanup, like using PIL or cv2 and getting rid of the skimage dependency completely.
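
One possible shape for that cleanup, sketched with Pillow and NumPy only. `load_grey` is a hypothetical function, not the project's loader. It reads just the first frame and does not fetch URLs, and Pillow's "L" conversion uses different luminance weights from skimage, so signatures computed this way may not match ones already stored.

```python
import numpy as np
from PIL import Image


def load_grey(path_or_file):
    """Load the first frame of an image as a 2-D float array in [0, 1].

    Assumption: only the first frame matters, even for multi-frame files.
    """
    with Image.open(path_or_file) as img:
        img.seek(0)              # first frame only
        grey = img.convert("L")  # 8-bit greyscale
        return np.asarray(grey, dtype=np.float64) / 255.0
```

Returning a plain rows x columns array in every case would also remove the need for shape checks further down the pipeline.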