zilliztech / knowhere

Knowhere is an open-source vector search engine, integrating FAISS, HNSW, etc.
Apache License 2.0
175 stars 73 forks source link

Use DataSetPtr in all knowhere interfaces #666

Closed cydrain closed 2 months ago

cydrain commented 3 months ago

To support multi data type, PR #210 introduced IndexNodeDataMockWrapper::xxx, in which DataSet reference is converted into shared_ptr. This operation has panic risk, because if the DataSet is not created by make_shared, but created in stack, calling shared_from_this will cause bad_weakptr. To avoid this kind of risk, need change all DataSet& to DataSetPtr in all knowhere APIs.

For details please check https://github.com/zilliztech/knowhere/pull/210 and https://github.com/zilliztech/knowhere/pull/659

alexanderguzhva commented 3 months ago

I would personally stick with std::shared_ptr<DataSet> instead of creating an alias 'DataSetPtr'. Just because it is confusing (DataSetPtr kinda implies DataSet*) Also, const std::shared_ptr<DataSet> should be passed by ref, because passing shared_ptr by value adds an overhead of atomic increments and decrements of shared_ptr ref counters.

So, not

    virtual expected<DataSetPtr>
    Search(const DataSetPtr dataset, const Config& cfg, const BitsetView& bitset) const = 0;

but

    virtual expected<std::shared_ptr<DataSet>>
    Search(const std::shared_ptr<DataSet>& dataset, const Config& cfg, const BitsetView& bitset) const = 0;
github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.