Open simonw opened 5 months ago
I asked about terminology on Twitter a couple of weeks ago: https://twitter.com/simonw/status/1774278907380019637
input_type - values query and document
input_type of search_document, search_query, classification, clustering
I'm thinking:
llm embed -c 'hello world' -m nomic-1.5 --mode clustering
The llm embed and llm embed-multi commands will default to the mode that is designed for stored documents. llm similar will default to the one that's intended for retrieval.
All three commands will accept a --mode option to switch to something other than the default for that command.
Modes will be validated against the list of known modes for the embedding model.
So maybe the code looks like this:
class NomicAIEmbeddingModel(EmbeddingModel):
    needs_key = "nomic"
    key_env_var = "NOMIC_API_KEY"
    batch_size = 100
    modes = ["search_document", "search_query", "clustering", "classification"]
    default_document_mode = "search_document"
    default_query_mode = "search_query"
The selected mode is then passed as an argument to the embed_batch() method - but only for models that define modes.
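Here's a minimal sketch of how the default selection and validation might fit together. The resolve_mode() helper and the stand-in base class are hypothetical, not part of LLM's actual API; only the modes, default_document_mode and default_query_mode attribute names come from the class above.

```python
from typing import List, Optional


class EmbeddingModel:
    # Stand-in base class for illustration only; models without modes
    # simply leave these attributes at their empty defaults
    modes: List[str] = []
    default_document_mode: Optional[str] = None
    default_query_mode: Optional[str] = None


class NomicAIEmbeddingModel(EmbeddingModel):
    modes = ["search_document", "search_query", "clustering", "classification"]
    default_document_mode = "search_document"
    default_query_mode = "search_query"


def resolve_mode(model, requested=None, for_query=False):
    """Hypothetical helper: return the mode to pass to embed_batch(), or None."""
    if not model.modes:
        # Models that define no modes never receive a mode argument
        if requested is not None:
            raise ValueError("This model does not support modes")
        return None
    if requested is None:
        # llm similar would pass for_query=True; embed / embed-multi would not
        if for_query:
            return model.default_query_mode
        return model.default_document_mode
    if requested not in model.modes:
        raise ValueError(
            "Unknown mode {!r}, expected one of: {}".format(
                requested, ", ".join(model.modes)
            )
        )
    return requested
```

With this shape the command only needs to know whether it is embedding a stored document or a query; an explicit --mode value overrides the default but still has to be in the model's list.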
I'm tempted to have modes defined as an enum of some sort, that way the Python API for embeddings could look something like this:
vector = nomic.embed("reasons to get a goat", mode=nomic.Modes.search_query)
And maybe the class then looks like this:
from enum import Enum

class NomicAIEmbeddingModel(EmbeddingModel):
    ...
    class Modes(Enum):
        search_document = "search_document"
        search_query = "search_query"
        clustering = "clustering"
        classification = "classification"
    default_mode_search = Modes.search_query
    default_mode_document = Modes.search_document
I considered using StrEnum but it was only added in Python 3.11.
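One possible workaround on older Python versions is to mix str into a regular Enum. This is only a sketch of that alternative, not something decided in this thread; the standalone Modes class below mirrors the nested one above.

```python
from enum import Enum


class Modes(str, Enum):
    # Mixing in str makes each member an actual str instance,
    # which works on Python versions that predate enum.StrEnum
    search_document = "search_document"
    search_query = "search_query"
    clustering = "clustering"
    classification = "classification"


# Members compare equal to plain strings, so a --mode value from the CLI
# can be checked against them directly
matches_plain_string = Modes.search_query == "search_query"

# Looking up by value turns a validated string back into a member
clustering = Modes("clustering")

# .value always gives the raw string to send to the embedding API
raw = Modes.classification.value
```

Using .value when handing the mode to the API avoids depending on how str() renders mixed-in enum members, which has differed between Python versions.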
Several embedding models supported by LLM plugins have a concept of "modes" - usually called something like "task types" or "input types".
Some examples:
passage: and query: prefixes.
We need a mechanism to support these in LLM core itself, mainly for the llm similar command - we need to calculate the original stored embeddings for RETRIEVAL_DOCUMENT (in Gemini's terminology) but the search query should be RETRIEVAL_QUERY.
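To illustrate the asymmetry, here's a rough sketch where documents are embedded in one mode at storage time and the search text is embedded in the other at query time. FakeModel, store_documents() and similar() are all hypothetical stand-ins, and the one-dimensional "embedding" is a toy so the example runs without any API.

```python
class FakeModel:
    # Hypothetical model using Gemini-style mode names from the text above
    default_document_mode = "RETRIEVAL_DOCUMENT"
    default_query_mode = "RETRIEVAL_QUERY"

    def __init__(self):
        self.calls = []  # records (texts, mode) so the flow can be inspected

    def embed_batch(self, texts, mode=None):
        texts = list(texts)
        self.calls.append((texts, mode))
        # Toy embedding: a single number per text (its length)
        return [[float(len(text))] for text in texts]


def store_documents(model, docs):
    """Embed documents with the model's document mode and keep the vectors."""
    vectors = model.embed_batch(docs, mode=model.default_document_mode)
    return dict(zip(docs, vectors))


def similar(model, stored, query):
    """Embed the query with the query mode, then rank stored documents."""
    (query_vector,) = model.embed_batch([query], mode=model.default_query_mode)
    return sorted(stored, key=lambda doc: abs(stored[doc][0] - query_vector[0]))


model = FakeModel()
stored = store_documents(model, ["goats", "reasons to get a goat"])
ranked = similar(model, stored, "why get a goat")
modes_used = [mode for _, mode in model.calls]
```

Here modes_used ends up with the document mode for the stored batch followed by the query mode for the search, which is the behaviour llm similar would need by default.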