spotify / voyager

🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
https://spotify.github.io/voyager/
Apache License 2.0
1.26k stars 51 forks source link

Incorrect python typehint for `add_items` #23

Open markkohdev opened 11 months ago

markkohdev commented 11 months ago

The python typehint for add_items (at least in IntelliJ) incorrectly names the second parameter numpy_float32 instead of ids image

I'm guessing this is due to something wonky in either our bindings or in pybind11. I also noticed that there's a tiny typo in the docstring for the method -- id -> ids https://github.com/spotify/voyager/blob/main/python/bindings.cpp#L433C7-L433C7

markkohdev commented 11 months ago

Looks like this happens with query() as well image

eXamadeus commented 7 months ago

UPDATE: My comment below is definitely not related, but some others may come along and see it, seeing the same issue I did, so I'll leave it here.

I don't know if this is related, but when you use a static method (like load), it also complains since it thinks that the first argument should be self even though the load method and certain other functions are, in fact, statically available on the class.

Ex:

from voyager import Index

loaded = Index.load(f"indexes/{path}.voy")
image

It seems to only have type hints for instances. Then again, I haven't written production code for Python in years (before type hints were ubiquitous), so I might just have something configured poorly.

image

EDIT: I dug through the source code and found where the load method is being bound statically. Definitely seems like a totally unrelated issue, so sorry for posting it here. Anyway, for those curious, here is the definition for the load method: https://github.com/spotify/voyager/blob/f7807579e9ce96e56ce55d37baa1b24e0951f0ea/python/bindings.cpp#L970-L978

I am not sure why def_static produces the following for the Python stub:

    def load(self, *args, **kwargs): # real signature unknown; restored from __doc__

I would assume it would look like this, but I'm completely unfamiliar with pybind11.

    @staticmethod
    def load(*args, **kwargs): # real signature unknown; restored from __doc__
markkohdev commented 7 months ago

@eXamadeus thanks for highlighting this! Your issue isn't exactly the same thing, but it's close enough that I can try to fix both of these at once when I'm done with my current improvement.

For context, since we use python bindings written in c++ to call out to the core C++ code, we actually have to generate these typehints based on the c++ binding code. We do that with this script and so somewhere in the process the staticmethod indicator likely isn't getting acknowledged. The binding code itself does contain an indicator that the method is static so we should be able to populate that accordingly.

eXamadeus commented 7 months ago

Yeah what I noticed definitely isn't directly related, at best it is ancillary since it relates to the bindings. Either way, thank you for the context.

I'll poke around in my freetime to see if I can help. I can't promise much, since it's been even longer since I've written any C++, haha.