pixelogik / NearPy

Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive hashes.
MIT License

scipy.sparse.issparse(v) object has no attribute 'sparse' #18

Closed: willard-yuan closed this issue 10 years ago

willard-yuan commented 10 years ago

Hi, pixelogik. I'm very interested in Locality-Sensitive Hashing and plan to use it for image retrieval. I installed the NearPy and redis packages, then tested the following example:

import numpy
from nearpy import Engine
from nearpy.hashes import RandomBinaryProjections

# Dimension of our vector space
dimension = 500

# Create a random binary hash with 10 bits
rbp = RandomBinaryProjections('rbp', 10)

# Create engine with pipeline configuration
engine = Engine(dimension, lshashes=[rbp])

# Index 100000 random vectors (set their data to a unique string)
for index in range(100000):
    v = numpy.random.randn(dimension)
    engine.store_vector(v, 'data_%d' % index)

# Create random query vector
query = numpy.random.randn(dimension)

# Get nearest neighbours
N = engine.neighbours(query)

It throws the following error:

C:\Python27\python.exe D:/python/projects/lsh/lsh-image-search.py
Traceback (most recent call last):
  File "D:/python/projects/lsh/lsh-image-search.py", line 17, in <module>
    engine.store_vector(v, 'data_%d' % index)
  File "C:\Python27\lib\site-packages\nearpy\engine.py", line 79, in store_vector
    for bucket_key in lshash.hash_vector(v):
  File "C:\Python27\lib\site-packages\nearpy\hashes\randombinaryprojections.py", line 65, in hash_vector
    if scipy.sparse.issparse(v):
AttributeError: 'module' object has no attribute 'sparse'

This happens because the scipy module doesn't have any attribute named sparse. That attribute only gets defined when you import scipy.sparse.
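The mechanism can be reproduced without scipy. The sketch below builds a throwaway package whose `__init__.py` is empty, then checks for the submodule attribute before and after importing it explicitly. The names `demo_pkg` and `sub` are made up for illustration and have nothing to do with scipy or NearPy. Whether a real package behaves this way depends on what its own `__init__.py` imports, so treat this as a model of the failure in the traceback rather than a statement about every scipy version.

```python
import importlib
import os
import sys
import tempfile

# Build a hypothetical package "demo_pkg" with an empty __init__.py
# and a single submodule "sub". Both names are illustrative only.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
    f.write("VALUE = 42\n")

# Make the temporary directory importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

# Importing the package alone does not load its submodules.
import demo_pkg
before = hasattr(demo_pkg, "sub")

# Importing the submodule binds it as an attribute of the package.
import demo_pkg.sub
after = hasattr(demo_pkg, "sub")

print(before, after)
```

Accessing `demo_pkg.sub` before the second import would raise the same kind of AttributeError as in the traceback.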

So randombinaryprojections.py should import scipy.sparse explicitly. After I added that import, the example ran successfully.
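A minimal sketch of the suggested fix, assuming numpy and scipy are installed. The helper `describe_vector` is hypothetical and is not part of NearPy. It only mirrors the `scipy.sparse.issparse(v)` check from the traceback to show that the call resolves once the submodule is imported by name.

```python
import numpy
import scipy.sparse  # explicit submodule import, as suggested above


def describe_vector(v):
    """Hypothetical helper (not NearPy code): report whether v is sparse."""
    if scipy.sparse.issparse(v):
        return "sparse"
    return "dense"


# A dense vector like the ones stored in the example above.
dense_vector = numpy.random.randn(500)

# A small sparse matrix, built only to exercise the sparse branch.
sparse_matrix = scipy.sparse.csr_matrix(numpy.eye(3))

dense_kind = describe_vector(dense_vector)
sparse_kind = describe_vector(sparse_matrix)
print(dense_kind, sparse_kind)
```

Importing `scipy.sparse` at the top of each module that calls it keeps the module from depending on some other import having loaded the submodule first.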

pixelogik commented 10 years ago

Hi there :)

Thanks for the hint. I added the imports to all hashes in the latest commit.

Funny, I started with the library because of image retrieval, but never did the actual application. Good luck and fun with that!

Cheers