unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0

Bug: segfault on clustering #529

Open vibl opened 2 weeks ago

vibl commented 2 weeks ago

Describe the bug

I get a segmentation fault on index.cluster(), regardless of the min_count and max_count values I pass (and also when I pass no parameters at all).

The dataset is small (80,000 vectors), and index.search works great on the same index.

Steps to reproduce


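    from usearch.index import Index

    index = Index(
        ndim=768,
        metric='cos',
        dtype='f32'
    )
    # (the 80,000 vectors are presumably added here; the report omits this step)

    # usearch_index_path: path to the index file, not defined in the report
    index.save(usearch_index_path)
    index = Index.restore(usearch_index_path)

    # The segmentation fault occurs on this call
    clustering = index.cluster(min_count=10, max_count=15, log=True)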

Expected behavior

It should return a Clustering instance.

USearch version

2.16.2

Operating System

Ubuntu 24.04

Hardware architecture

x86

Which interface are you using?

Python bindings

ashvardanian commented 2 days ago

Hi @vibl! Sorry for the delayed response! Can you try the global clustering functionality instead of the one built into the Index Python class? It should work much better.
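
For anyone reproducing the crash: a segfault inside a native extension ends the Python process without a normal traceback. The following is a minimal sketch, not part of the original thread, of running a reproduction script in a child interpreter with the standard-library faulthandler enabled, so the parent process survives and can report which signal killed the child. The helper names `run_isolated` and `describe` are hypothetical, the crashing script is a stand-in that sends itself SIGSEGV rather than calling USearch, and POSIX signal semantics (e.g. Linux) are assumed.

```python
import signal
import subprocess
import sys


def run_isolated(script: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run `script` in a fresh interpreter with faulthandler enabled.

    On a fatal signal such as SIGSEGV, faulthandler writes the Python
    traceback of the child to stderr before the process dies.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe(result: subprocess.CompletedProcess) -> str:
    """Summarize how the child ended; on POSIX a negative code means a signal."""
    if result.returncode < 0:
        return f"killed by {signal.Signals(-result.returncode).name}"
    return f"exited with code {result.returncode}"


# Stand-in for the real reproduction (assumption): the child sends itself
# SIGSEGV. Replace this string with the actual index.cluster() script.
CRASHING_SCRIPT = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

if __name__ == "__main__":
    result = run_isolated(CRASHING_SCRIPT)
    print(describe(result))
    print(result.stderr)  # faulthandler's dump of the child's Python stack
```

The traceback that faulthandler prints only identifies the Python line that was executing when the signal arrived; it does not show the native stack, so it narrows down the failing call rather than explaining it.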