beringresearch / ivis

Dimensionality reduction in very large datasets using Siamese Networks
https://beringresearch.github.io/ivis/
Apache License 2.0

`chunk_size` in knn set to 0 #91

Closed: yrahul3910 closed this issue 3 years ago

yrahul3910 commented 3 years ago

Describe the bug: It seems `chunk_size` in `ivis.data.neighbour_retrieval.knn` is set to 0 for my dataset, which has shape (6, 784).

Stack trace

Building KNN index
100%|█████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 1318.07it/s]
Extracting KNN neighbours
Traceback (most recent call last):
  File "main.py", line 17, in <module>
    viz.visualize(*embeddings, dpi=150)
  File "/Users/ryedida/Desktop/CSC522/userdata_mining/visualization/embeddings.py", line 76, in visualize
    x = self._reduce_dims(arg)
  File "/Users/ryedida/Desktop/CSC522/userdata_mining/visualization/embeddings.py", line 50, in _reduce_dims
    return ivis.fit_transform(arg)
  File "/usr/local/lib/python3.8/site-packages/ivis/ivis.py", line 336, in fit_transform
    self.fit(X, Y, shuffle_mode)
  File "/usr/local/lib/python3.8/site-packages/ivis/ivis.py", line 314, in fit
    self._fit(X, Y, shuffle_mode)
  File "/usr/local/lib/python3.8/site-packages/ivis/ivis.py", line 179, in _fit
    self.neighbour_matrix = AnnoyKnnMatrix.build(X, path=self.annoy_index_path,
  File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 60, in build
    return cls(index, X.shape, path, k, search_k, precompute, include_distances, verbose)
  File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 47, in __init__
    self.precomputed_neighbours = self.get_neighbour_indices()
  File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 93, in get_neighbour_indices
    return extract_knn(
  File "/usr/local/lib/python3.8/site-packages/ivis/data/neighbour_retrieval/knn.py", line 189, in extract_knn
    for i in range(0, data_shape[0], chunk_size):
ValueError: range() arg 3 must not be zero


Additional context: Python 3.8. `embedding_dims` was set to 2 and `k` was set to 3.
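
For illustration, the error in the trace can be reproduced without ivis. The sketch below assumes that `chunk_size` comes from integer division of the row count by a worker count. The trace does not show how ivis actually computes it, so `chunk_starts`, `n_workers`, and the `guard` flag are hypothetical names used only here. With 6 rows and more than 6 workers, the division yields 0, and `range()` rejects a zero step. Clamping the step to at least 1 is one possible guard, not necessarily the one used in the ivis fix.

```python
def chunk_starts(n_rows, n_workers, guard=False):
    """Return the start offsets of the chunks a row range is split into.

    Hypothetical helper: assumes chunk_size = n_rows // n_workers,
    which is NOT confirmed to be how ivis derives it.
    """
    chunk_size = n_rows // n_workers  # integer division: 6 // 8 == 0
    if guard:
        # Clamp so that datasets smaller than the worker count still
        # get a valid, non-zero step.
        chunk_size = max(1, chunk_size)
    return list(range(0, n_rows, chunk_size))


n_rows = 6      # matches the (6, 784) dataset from the report
n_workers = 8   # assumed worker count, larger than the number of rows

try:
    chunk_starts(n_rows, n_workers)
except ValueError as err:
    print(err)  # range() arg 3 must not be zero

print(chunk_starts(n_rows, n_workers, guard=True))  # [0, 1, 2, 3, 4, 5]
```

Under this assumption, any dataset with fewer rows than workers would hit the same error, which is consistent with the problem appearing only on very small inputs.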

Szubie commented 3 years ago

Hi, thanks for the bug report.

This sounds similar to a previously reported issue that affects small datasets. We recently merged a fix into the master branch that should address it (https://github.com/beringresearch/ivis/commit/82d890114b6ff3c2b950678b743152c241e49865).

We haven't yet had the chance to publish the updated package on PyPI. In the meantime, could you try installing ivis directly from GitHub and see if that solves the issue?

yrahul3910 commented 3 years ago

Yes, thank you! This seems to have solved the issue.