run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai

Configure LLM to use while querying #1994

sid8491 closed this issue 1 year ago

sid8491 commented 1 year ago

I know that for building the index we can configure the LLM like this:

from langchain.llms import OpenAI
from llama_index import GPTVectorStoreIndex, LLMPredictor, PromptHelper, ServiceContext

# Example values; tune to the model's context window.
max_input_size, num_output, max_chunk_overlap = 4096, 256, 20

llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

# documents loaded earlier, e.g. with SimpleDirectoryReader
index = GPTVectorStoreIndex.from_documents(
    documents, service_context=service_context
)

But how can I configure the LLM for querying?

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")

Will the same LLM be used for both querying and indexing? I want to use one LLM to index the data, then load the index and use a different LLM to answer queries:

from llama_index import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

Is it possible?

Disiok commented 1 year ago

You can pass in a service context in index.as_query_engine(service_context=service_context)
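For concreteness, a minimal sketch of what that looks like with the ServiceContext-era API used above (the second model name is just an example stand-in for "a different LLM"):

from langchain.llms import OpenAI
from llama_index import LLMPredictor, ServiceContext

# A separate service context for query time, holding a different LLM
# than the one used while building the index.
query_llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-curie-001"))
query_service_context = ServiceContext.from_defaults(llm_predictor=query_llm_predictor)

query_engine = index.as_query_engine(service_context=query_service_context)
response = query_engine.query("What did the author do growing up?")

One caveat: the embedding model used at query time has to match the one the index was built with, so the query embeddings land in the same vector space as the stored vectors; it is the response-synthesis LLM you swap out here, not the embeddings.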

sid8491 commented 1 year ago

> You can pass in a service context in index.as_query_engine(service_context=service_context)

How do I do that? I want to use one model for the embeddings and a different LLM for answering queries. Could you please share some sample code?

Disiok commented 1 year ago

Take a look at this for customizing embeddings: https://gpt-index--1183.org.readthedocs.build/en/1183/how_to/customization/embeddings.html
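That page boils down to something like this sketch (same ServiceContext-era API as above; HuggingFaceEmbeddings is just one example of a pluggable embedding model):

from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import GPTVectorStoreIndex, LangchainEmbedding, ServiceContext

# Wrap a LangChain embedding model for use with llama_index.
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
service_context = ServiceContext.from_defaults(embed_model=embed_model)

# Use the same embed_model at both index and query time so the query
# embeddings are comparable to the stored vectors.
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(service_context=service_context)

Combined with a different llm_predictor in the query-time service context, this lets you configure the embedding model used for retrieval and the LLM used for synthesis independently.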