spotify / voyager

🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
https://spotify.github.io/voyager/
Apache License 2.0

Issue with index.add_items() when building large indexes? #48


loisaidasam commented 7 months ago

When building indexes of varying sizes, I ran into issues with some of the larger ones.

Here's what my index creation code looks like:

from voyager import Index, Space

# imagine `vectors` is an ndarray with multiple vectors of dimension 1728 ...
num_dimensions = vectors[0].shape[0]
index = Index(Space.Cosine, num_dimensions=num_dimensions)
index.add_items(vectors)
index.save(filename)

And my test code looks like this:

from voyager import Index

# imagine `vector` is a sample query vector of (matching) dimension 1728
index = Index.load(filename)
index.query(vector, k=200)

This works fine when vectors contains 10k, 50k, 100k, 500k, or 1M vectors ...

but when vectors contains 5M or 10M vectors, index creation still runs fine, yet querying the loaded index raises:

     index.query(vector, k=200)
RuntimeError: Potential candidate (with label '4963853') had negative distance -17059.800781. This may indicate a corrupted index file. 

I tried creating the index by adding the same vectors array in slices of 1M:

batch_size = 1_000_000
# same Index construction as above, but add_items() called in 1M-row batches
index = Index(Space.Cosine, num_dimensions=num_dimensions)
start = 0
while start < vectors.shape[0]:
    end = start + batch_size
    index.add_items(vectors[start:end])
    start = end
index.save(filename)

and it seems I can query this index just fine. Maybe there's some sort of limitation in the add_items() function?
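
For reference, here's a minimal spot-check of the batched index (just a sketch; it assumes query returns neighbor IDs and distances as NumPy arrays, and reuses the filename and vector from above):

from voyager import Index

# Sketch: reload the batched index and confirm all returned distances are
# non-negative, since the RuntimeError above points at a negative cosine distance.
index = Index.load(filename)
neighbors, distances = index.query(vector, k=200)
assert (distances >= 0).all(), "negative distances returned; index may be corrupted"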

markkohdev commented 5 months ago

Hey @loisaidasam, thanks for reporting; that result definitely shouldn't be happening! We've seen similar behavior when vectors containing NaN values are accidentally added to an index, but I don't think that's the cause here. It's more likely a race condition somewhere, so we'll investigate and get back to you!
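
In the meantime, one quick way to rule out the NaN case on your side is something like this (a sketch; it just scans the vectors array with NumPy before add_items() is called):

import numpy as np

# Flag any rows containing NaN or +/-inf before indexing;
# np.isfinite is False for both.
bad_rows = ~np.isfinite(vectors).all(axis=1)
if bad_rows.any():
    print(f"{bad_rows.sum()} vectors contain NaN/inf, e.g. rows {np.flatnonzero(bad_rows)[:10]}")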