run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai

[Question]: Further explanation of total embedding token usage #14090

Open ihgumilar opened 3 weeks ago

ihgumilar commented 3 weeks ago

Question

Hi,

I have used this code https://docs.llamaindex.ai/en/stable/examples/callbacks/TokenCountingHandler/ to count embedding token usage. However, I would like further clarification: is there a specific breakdown of the total embedding token usage in LlamaIndex?

Does it refer to the tokens used for the query and the answer? Further explanation of embedding token usage in LlamaIndex would be much appreciated.

Thanks

logan-markewich commented 3 weeks ago

I'm not sure what you're looking for.

Indexing uses embedding tokens to embed all of your documents.

Querying uses far fewer, since only the query text itself needs to be embedded.
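
To make that split concrete, here is a minimal sketch based on the TokenCountingHandler docs example linked above. It resets the counter between indexing and querying so you can read the two embedding token counts separately (the document text, query string, and tokenizer choice are just placeholders, and the query step will call your configured LLM and embedding model):

```python
import tiktoken

from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# Use a tokenizer that matches your embedding model family
# (cl100k_base is an assumption; swap in the one you actually use).
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.get_encoding("cl100k_base").encode
)
Settings.callback_manager = CallbackManager([token_counter])

# Index-time embeddings: every document chunk is sent to the embedding model.
index = VectorStoreIndex.from_documents(
    [Document(text="Some example document text. " * 50)]
)
print("Indexing embedding tokens:", token_counter.total_embedding_token_count)

# Reset, then query: only the query string itself is embedded.
token_counter.reset_counts()
response = index.as_query_engine().query("What is this document about?")
print("Query embedding tokens:", token_counter.total_embedding_token_count)
```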

ihgumilar commented 3 weeks ago

Thanks for the explanation. Basically, I am trying to understand this image:

What does the 8 (embedding tokens) represent?

logan-markewich commented 3 weeks ago

It means that to run the query, 8 tokens were sent to the embedding model (i.e., the query text itself was embedded).
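
In other words, the query-time embedding count is just the number of tokens in the query string under the handler's tokenizer. A quick way to sanity-check it (the query string and cl100k_base tokenizer below are just example assumptions):

```python
import tiktoken

# Hypothetical query; the count depends entirely on your actual query string
# and on the tokenizer passed to TokenCountingHandler.
query = "What did the author do growing up?"
encoding = tiktoken.get_encoding("cl100k_base")

# This should match token_counter.total_embedding_token_count after a single
# query (assuming no other embedding calls happened since the last reset).
print(len(encoding.encode(query)))
```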