lmcinnes / pynndescent

A Python nearest neighbor descent for approximate nearest neighbors
BSD 2-Clause "Simplified" License
901 stars 105 forks

Overhead with Wrapping pynndescent in another class #132

Closed: kevinlu1248 closed this issue 3 years ago

kevinlu1248 commented 3 years ago

Hello, there seems to be a lot of overhead when I wrap pynndescent in another class, as seen at https://colab.research.google.com/drive/1ekwjaNFHgtBMzC3rs17I9D7OagWhL35f?usp=sharing. Training and preparing take about the same amount of time, but querying is over 300 times slower (305 ms vs 0.812 ms).

gclen commented 3 years ago

I think this is because in the wrapped class you were querying the full test set:

wmodel.query(fmnist_test)

while when using pynndescent directly you were querying only the first 10 rows of it:

neighbors = index.query(fmnist_test[:10])

If you query the full test set in both cases, you should see similar times.
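
A minimal sketch of a fair timing comparison, for illustration only. To stay self-contained it does not use pynndescent: `BruteForceIndex`, `Wrapped`, and `time_query` are hypothetical stand-ins, and the data is synthetic rather than Fashion-MNIST. The point is only that both objects must be timed on the identical batch of queries.

```python
import time


def time_query(model, queries, repeats=3):
    """Return the best wall-clock time in seconds of model.query(queries)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model.query(queries)
        best = min(best, time.perf_counter() - start)
    return best


class BruteForceIndex:
    """Hypothetical stand-in for a nearest-neighbor index (not pynndescent)."""

    def __init__(self, data):
        self.data = data

    def query(self, queries, k=1):
        results = []
        for q in queries:
            # Squared Euclidean distance to every stored row, smallest first.
            dists = sorted(
                (sum((a - b) ** 2 for a, b in zip(q, row)), i)
                for i, row in enumerate(self.data)
            )
            results.append([i for _, i in dists[:k]])
        return results


class Wrapped:
    """Hypothetical thin wrapper that only forwards calls to the index."""

    def __init__(self, index):
        self.index = index

    def query(self, queries, k=1):
        return self.index.query(queries, k=k)


# Synthetic data standing in for the train and test sets.
train = [[float(i), float(i % 7)] for i in range(200)]
test = [[float(i) + 0.25, float(i % 5)] for i in range(100)]

index = BruteForceIndex(train)
wmodel = Wrapped(index)

# The wrapper returns exactly what the index returns for the same input.
same_results = wmodel.query(test) == index.query(test)

# Fair comparison: both calls receive the identical batch.
t_direct = time_query(index, test)
t_wrapped = time_query(wmodel, test)

# Mismatched comparison, as in the issue: 10 queries instead of the full set.
t_subset = time_query(index, test[:10])

print(f"direct, full set:  {t_direct:.6f} s")
print(f"wrapped, full set: {t_wrapped:.6f} s")
print(f"direct, 10 rows:   {t_subset:.6f} s")
```

With a real index, the same `time_query` helper could be called on `index` and `wmodel` with `fmnist_test` for both, or with `fmnist_test[:10]` for both.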

kevinlu1248 commented 3 years ago

Oh wow, I can't believe I didn't catch that. Sorry for the issue; the problem is solved now.