jina-ai / executors

internal-only
Apache License 2.0
31 stars 12 forks source link

HNSW: Rename to Indexer #277

Open cristianmtr opened 2 years ago

cristianmtr commented 2 years ago

Since we can call .index directly on it, this should be renamed to a HnswIndexer since it can handle both indexing (CRUD) and searching.

Would require new name in hub, new secrets, and removing the old one (to avoid confusions)

tadejsv commented 2 years ago

Disagree - you should be able to call .index on all searchers - how else do you get the data in? Having everything have the same name (suffix) is misleading and fails to immediately alert user to the fact that searchers can only index vectors, and not actual content

cristianmtr commented 2 years ago

No, check the docs

Searchers do not take indexing/update/delete by default. If some of them do, it's a bonus.

But the categories are:

tadejsv commented 2 years ago

Searchers do not take indexing/update/delete by default. If some of them do, it's a bonus.

Then we need to change this definition. There is no reason why they shouldn't be able to do this.

MOST can NOT index directly, but have to load from dump or sync

Again, why? Not the case for Faiss - it can index directly (for update/delete you have to implement some solution yourself - but same holds for any indexer using this searcher), not the case for HNSW

cristianmtr commented 2 years ago

Design decision from the past. We cannot guarantee it for every Searcher we might implement. If they can, it's a bonus and they become Indexers.

tadejsv commented 2 years ago

We cannot guarantee it for every Searcher we might implement

For example? If we can do it for faiss, we can do it for everything. We really should have a very good reason to be stating such assumptions

cristianmtr commented 2 years ago

It can be arbitrarily complex. We cannot guarantee that we will do it. Until we do it for a given Searcher it should be called Searcher. When we do it, we call them Indexers.