mmp2 / megaman

megaman: Manifold Learning for Millions of Points
http://mmp2.github.io/megaman/
BSD 2-Clause "Simplified" License

Cyflann - ValueError: Buffer dtype mismatch, expected 'double' but got 'float' #77

Closed SnowRipple closed 7 years ago

SnowRipple commented 7 years ago

Hi there! I am trying to use the SpectralEmbedding "predict" function. I have some training data that I want to use to create an embedding, and some test data that I would like to project onto this new embedding.

    radius = 0.5
    adjacency_method = 'cyflann'
    adjacency_kwds = {'radius':radius}
    affinity_method = 'gaussian'
    affinity_kwds = {'radius':radius}
    laplacian_method = 'symmetricnormalized'
    laplacian_kwds = {'scaling_epps':radius}
    color = 'b'

    geom = Geometry(adjacency_kwds=adjacency_kwds, affinity_kwds=affinity_kwds)
    dr_technique = SpectralEmbedding(n_components=target_dim, eigen_solver='auto',geom=geom, drop_first=False) # use 3 for spectral
    dr_technique.fit(data)
    final_array = dr_technique.predict(data)

Produces:

    File "/usr/local/lib/python2.7/dist-packages/megaman/embedding/spectral_embedding.py", line 428, in predict
      X_test,adjacency_kwds)
    File "/usr/local/lib/python2.7/dist-packages/megaman/geometry/complete_adjacency_matrix.py", line 12, in complete_adjacency_matrix
      train_index = Cyflann.build_index(Xtrain)
    File "/usr/local/lib/python2.7/dist-packages/megaman/geometry/adjacency.py", line 113, in build_index
      return self._get_built_index(X)
    File "/usr/local/lib/python2.7/dist-packages/megaman/geometry/adjacency.py", line 105, in _get_built_index
      **(self.cyflann_kwds or {}))
    File "index.pyx", line 18, in megaman.geometry.cyflann.index.Index.__cinit__
    ValueError: Buffer dtype mismatch, expected 'double' but got 'float'

What am I doing wrong?

Can test data be projected onto an already-created embedding using other techniques in the megaman package, such as LLE, LTSA, or Isomap?

Many Thanks!

SnowRipple commented 7 years ago

OK, casting the data to numpy float64 seems to fix the issue.
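
For anyone hitting the same error: in Cython buffer terms 'double' is 64-bit and 'float' is 32-bit, so the message suggests the array passed in was float32. A minimal sketch of the cast, assuming the input is a 2-D NumPy array (the random `data` here is a hypothetical stand-in for the real training set):

```python
import numpy as np

# Hypothetical stand-in for the real training data, stored as float32.
data = np.random.rand(100, 3).astype(np.float32)

# Cast to C-contiguous float64 ('double') before handing it to fit()/predict().
data64 = np.ascontiguousarray(data, dtype=np.float64)

print(data.dtype, data64.dtype)
```

The same cast would need to be applied to both the training and the test arrays.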

I will close this issue, but could someone please answer my question about projecting test data using the other techniques?

jmcq89 commented 7 years ago

Hi Piotr,

Thanks for catching this. We will make sure the numpy float64 bug is addressed in the next release.

Currently, due to the way that the Nystrom Extension works, we only have Spectral Embedding set up for .predict() as the other methods would require significantly more work. It is future work for megaman to add scalable .predict() methods for the other embeddings. If there is one in particular that interests you most we might be able to put that specific method higher on the pipeline.
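
To illustrate why the Nystrom extension makes out-of-sample projection cheap, here is a simplified sketch in plain NumPy. It is not megaman's implementation: it works on an unnormalized Gaussian kernel matrix rather than a graph Laplacian, and the function names (`gaussian_kernel`, `nystrom_fit`, `nystrom_predict`) are hypothetical. The idea is that an eigenvector u of the training kernel K, with eigenvalue lam, is extended to a new point x as sum_j k(x, x_j) u[j] / lam, so only kernel values between test and training points are needed.

```python
import numpy as np

def gaussian_kernel(A, B, radius):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * np.dot(A, B.T))
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / radius ** 2)

def nystrom_fit(X_train, n_components, radius):
    # Eigendecompose the training kernel and keep the largest eigenpairs.
    K = gaussian_kernel(X_train, X_train, radius)
    eigvals, eigvecs = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]

def nystrom_predict(X_test, X_train, eigvals, eigvecs, radius):
    # Extend each eigenvector to the test points: K_test . u / lam.
    K_test = gaussian_kernel(X_test, X_train, radius)
    return np.dot(K_test, eigvecs) / eigvals

rng = np.random.RandomState(0)
X_train = rng.rand(30, 3)
X_test = rng.rand(5, 3)

eigvals, eigvecs = nystrom_fit(X_train, n_components=3, radius=1.0)
embedded_test = nystrom_predict(X_test, X_train, eigvals, eigvecs, radius=1.0)
print(embedded_test.shape)
```

Applied to the training points themselves, this formula reproduces the training eigenvectors, which is a quick sanity check. Methods such as LLE or LTSA do not reduce to a single kernel eigenproblem in this way, which is one reason a comparable predict step takes more work.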

Cheers, James

SnowRipple commented 7 years ago

Thanks James! It's great news that such improvements are on the agenda!

Personally, I was wondering whether it is even possible to provide a "predict" method for all the other methods. E.g., in LLE, don't you have to recalculate the whole embedding for "out-of-sample" data?

Anyway, if you could give LLE higher priority, that would be awesome!

Big fan of your work, guys! :)