Open jaklinger opened 4 years ago
Thanks for the PR. There is a problem though. Consider a generator:
from collections.abc import Iterable
def test():
    for i in range(100):
        yield i
isinstance(test(), Iterable)
# True
len(test())
# TypeError: object of type 'generator' has no len()
How about simply adding all the supported iterable types to the type check?
I don't think it's possible to even support iterables with the current code, because it merely keeps a reference to the input list of sets and requires it to be indexable. So if your original data is in an iterator, a copy is needed anyway.
Arguably it's kind of dirty to require the user of the library to provide the data as a list (and take care to not modify it while the index is in use, which is not documented anywhere!) but it does save the memory overhead of creating an internal copy.
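To illustrate the trade-off described above, here is a minimal sketch of what copying on the caller's side could look like. The helper name `as_indexable` is hypothetical and not part of the library. It assumes that anything that is a `Sequence` can be used as-is, and that any other iterable has to be copied into a list first.

```python
from collections.abc import Iterable, Sequence


def as_indexable(data):
    """Hypothetical helper: return `data` unchanged if it is already a
    sequence (no copy, no extra memory), otherwise copy it into a list."""
    if isinstance(data, Sequence):
        return data
    if isinstance(data, Iterable):
        # Generators and other one-shot iterables have no len() and no
        # indexing, so a copy is unavoidable here.
        return list(data)
    raise TypeError(f"expected an iterable, got {type(data).__name__}")


def gen():
    for i in range(3):
        yield {i, i + 1}


sets_from_gen = as_indexable(gen())
existing = [{1, 2}, {2, 3}]
same_object = as_indexable(existing) is existing
```

When the data is already a list, the same object comes back, which keeps the memory saving mentioned above but also means the caller still must not modify it while the index is in use.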
Yes. That's why checking whether the input is an iterable is not enough to prevent the error I mentioned.
This library is intended to provide an in-memory solution for similarity search. So I think it's fair to ask users to load all data into memory as a list, tuple, or ordered dict and keep it there. True, it should be documented.
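A stricter type check along these lines might look like the sketch below. The function name `check_indexable` is hypothetical, and this is not the check used in the PR. It assumes the requirement is "sized and indexable by position", so it only covers list-like and tuple-like inputs; an ordered dict is a mapping, not a `Sequence`, and would need separate handling.

```python
from collections.abc import Sequence


def check_indexable(sets):
    """Hypothetical validation: accept only in-memory sequences such as
    list or tuple, and reject generators and other one-shot iterables."""
    # str and bytes are sequences too, but never a valid collection of sets.
    if isinstance(sets, (str, bytes)) or not isinstance(sets, Sequence):
        raise TypeError(
            "input must be an in-memory sequence such as a list or tuple, "
            f"got {type(sets).__name__}"
        )
    return sets


def test():
    for i in range(100):
        yield i


try:
    check_indexable(test())
    generator_rejected = False
except TypeError:
    generator_rejected = True

accepted = check_indexable([{1, 2}, {2, 3}])
```

Unlike an `Iterable` check, this rejects the generator from the example before `len()` is ever called on it.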
Closes #7
Quoting from the issue: