GildeshAbhay opened 3 months ago
Hey @GildeshAbhay! Great to see you diving into another interesting challenge with LlamaIndex. How's everything going on your end?
Yes, your understanding is correct. By setting `Settings.llm` to a "gpt-3.5-turbo" LLM and `Settings.embed_model` to "text-embedding-3-small" at the top of your RAG pipeline code, you ensure that embeddings are created using "text-embedding-3-small" while the query engine uses "gpt-3.5-turbo".
LlamaIndex handles these settings through properties with lazy initialization and setter methods. This means that the LLM (Large Language Model) and the embedding model are only resolved when they are first accessed, which provides flexibility and efficiency in managing LlamaIndex's configuration.
When you set `Settings.llm` and `Settings.embed_model`, you are explicitly setting these properties, which triggers the setter methods. These methods resolve the provided LLM and embedding model using the `resolve_llm` and `resolve_embed_model` functions respectively, and assign the results to the private attributes `_llm` and `_embed_model`.
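A minimal sketch of the lazy-initialization-with-setter pattern described above. This is a hypothetical, simplified stand-in, not the actual `llama_index.core.settings` implementation: `_resolve_llm` here is a toy placeholder for LlamaIndex's real `resolve_llm`.

```python
# Hypothetical sketch of a lazily-initialized setting with a resolving setter;
# names mirror the description above, but the logic is deliberately simplified.

class _Settings:
    def __init__(self):
        self._llm = None  # nothing resolved until first access or assignment

    @property
    def llm(self):
        # lazy initialization: resolve a default only on first read
        if self._llm is None:
            self._llm = self._resolve_llm("default")
        return self._llm

    @llm.setter
    def llm(self, llm):
        # explicit assignment triggers resolution immediately
        self._llm = self._resolve_llm(llm)

    @staticmethod
    def _resolve_llm(llm):
        # toy stand-in for resolve_llm(): wrap strings, pass objects through
        return {"model": llm} if isinstance(llm, str) else llm


Settings = _Settings()
Settings.llm = "gpt-3.5-turbo"
print(Settings.llm)  # {'model': 'gpt-3.5-turbo'}
```

Because resolution happens in the setter, anything assigned to `Settings.llm` is already resolved by the time the rest of the pipeline reads it.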
As for additional considerations, you should ensure that the embedding model is compatible with the inputs and outputs expected by the rest of your pipeline. Also, any necessary preprocessing or postprocessing steps should be taken into account. However, without more specific details about the service context or the rest of the pipeline, it's difficult to provide a comprehensive list of additional considerations.
You can find more details about how LlamaIndex handles these configurations in the `llama-index-core/llama_index/core/settings.py` file.
Service context is deprecated. `Settings` controls the global defaults. You can also pass models into individual interfaces as needed instead of relying on the globals. For example, from your code:
```python
index = VectorStoreIndex(nodes=nodes, storage_context=storage_context, embed_model=embed_model)

# you didn't actually use this anywhere, but this is what uses the LLM in a query engine
response_synthesizer = get_response_synthesizer(response_mode=response_mode, llm=llm)
query_engine = RetrieverQueryEngine(retriever, node_postprocessors=node_postprocessors, response_synthesizer=response_synthesizer)
```
Can I also pass `llm` to the query engine?

```python
query_engine = RetrieverQueryEngine(retriever, node_postprocessors=node_postprocessors, response_synthesizer=response_synthesizer, llm=llm)
```

Would that be redundant, since the LLM is already passed to the response synthesizer?
@GildeshAbhay exactly, the response synthesizer already has the LLM, so in the snippet above the `llm` argument is unused.
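The precedence can be sketched as plain Python. This is a hypothetical illustration, not the actual LlamaIndex implementation: `GLOBAL_DEFAULT_LLM` and both functions below are toy stand-ins showing why an engine-level `llm` argument adds nothing once a synthesizer already holds one.

```python
# Hypothetical illustration of the fallback order: an explicitly passed LLM
# wins over the global default, and the synthesizer's LLM answers queries.

GLOBAL_DEFAULT_LLM = "global-default-llm"

def get_response_synthesizer(llm=None):
    # the synthesizer captures the LLM it is given, else the global default
    return {"llm": llm or GLOBAL_DEFAULT_LLM}

def make_query_engine(response_synthesizer, llm=None):
    # once a synthesizer is supplied, its LLM is what generates answers;
    # the engine-level `llm` argument is never consulted
    return {"llm": response_synthesizer["llm"]}

synth = get_response_synthesizer(llm="gpt-3.5-turbo")
engine = make_query_engine(synth, llm="ignored-llm")
print(engine)  # {'llm': 'gpt-3.5-turbo'}
```

In this sketch, `"ignored-llm"` never reaches the answer path, which mirrors the point above: pass the LLM to the response synthesizer (or rely on the global), not to the query engine.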
Question
```python
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
```
If I put these two lines at the top of the code, will it ensure that embeddings are created using text-embedding-3-small while the query engine uses the LLM? Is there anything else to take care of at this stage, like in the service context, where the embedding model was also configured separately?