Hey there, @Aekansh-Ak! 👋 Great to see you back with us. I remember our last adventure through the code. I'm diving into your issue now and will get back to you with some insights shortly. 🕵️♂️
You need to either pass in the llm as a kwarg:

get_response_synthesizer(..., llm=llm)

or change the global defaults:

from llama_index.core import Settings
Settings.llm = llm
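Putting both options together, a minimal sketch, assuming a recent llama-index where these imports live under llama_index.core (MockLLM here is only a stand-in for whatever LLM you actually use):

# Minimal sketch; MockLLM is just a placeholder -- swap in your real LLM instance.
from llama_index.core import Settings, get_response_synthesizer
from llama_index.core.llms import MockLLM

llm = MockLLM()

# Option 1: pass the LLM explicitly to the factory function
synth = get_response_synthesizer(llm=llm)

# Option 2: set it once as the global default, then omit the kwarg
Settings.llm = llm
synth = get_response_synthesizer()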
Based on the issues you're facing, it seems like there are two main problems: the missing OpenAI API key and the permission denied error when downloading the model. Here's how you can address these:
Missing API Key Error: This error is due to the OpenAI API key not being set. You can set this key in two ways:

1. Set the OPENAI_API_KEY environment variable to your OpenAI API key. The validate_openai_api_key function looks for the key in the environment variables if it is not passed as a parameter.

2. Set openai.api_key directly in your code:

import openai
openai.api_key = 'your_api_key_here'

Replace 'your_api_key_here' with your actual OpenAI API key.
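If you prefer the environment-variable route but want to set it from Python, a quick sketch (the key string is a placeholder):

import os

# Placeholder key; use your real key here, or export OPENAI_API_KEY in the
# shell before running the script instead.
os.environ["OPENAI_API_KEY"] = "your_api_key_here"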
Permission Denied Error When Downloading the Model: This issue is likely due to file system permissions. Ensure that the user running the script has the necessary permissions to download and save files in the intended directory. This might involve changing directory permissions or running the script as a user with the appropriate permissions. If the script is trying to save files in a directory that requires elevated permissions, consider saving to a different location or adjusting the permissions of the target directory.
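One way to take the "different location" option is to download the .gguf file yourself to a directory you own, build the LlamaCPP LLM from that file, and set it as the default, so the fallback never tries to write to /tmp/llama_index. A rough sketch, assuming the llama-index-llms-llama-cpp integration is installed; the model path below is only an example location:

# Sketch: build LlamaCPP from a .gguf file in a writable directory and set it
# as the default LLM, avoiding the automatic download into /tmp/llama_index.
from llama_index.core import Settings
from llama_index.llms.llama_cpp import LlamaCPP

Settings.llm = LlamaCPP(
    model_path="/path/you/can/write/to/llama-2-13b-chat.Q4_0.gguf",
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
)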
Regarding the get_response_synthesizer function: it is a factory function that creates and returns an instance of a response synthesizer class based on the parameters passed to it. The response_mode parameter selects the synthesizer class; its possible values are defined by the ResponseMode enum, and each value corresponds to a different synthesizer class. If response_mode does not match any of the expected values, the function raises a ValueError indicating an unknown mode.
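For example (import paths assume a recent llama-index, 0.10 or later; MockLLM is a placeholder so the snippet does not depend on an OpenAI key):

# Sketch of selecting a synthesizer via response_mode, as a string or enum value.
from llama_index.core import get_response_synthesizer
from llama_index.core.llms import MockLLM
from llama_index.core.response_synthesizers import ResponseMode

llm = MockLLM()
compact_synth = get_response_synthesizer(response_mode="compact", llm=llm)
tree_synth = get_response_synthesizer(response_mode=ResponseMode.TREE_SUMMARIZE, llm=llm)

# An unrecognized mode raises a ValueError indicating an unknown mode:
# get_response_synthesizer(response_mode="not_a_mode", llm=llm)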
I hope this helps! If you have any other questions or need further clarification, feel free to ask.
My llm is defined like this-:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=4000, streamer=streamer, torch_dtype=torch.float16, device_map="auto", top_p=0.9, temperature=0.3, eos_token_id=tokenizer.eos_token_id)
llm = HuggingFacePipeline(pipeline=pipe)
response_synthesizer = get_response_synthesizer(llm=llm)

TypeError: get_response_synthesizer() got an unexpected keyword argument 'llm'
This is a langchain LLM class no? Need to wrap it with a langchain wrapper
https://docs.llamaindex.ai/en/stable/examples/llm/langchain.html
Also, I think you have an older version of llama-index? That's definitely a valid kwarg in newer versions
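Roughly, the wrapping looks like this (the import path assumes llama-index 0.10+ with the llama-index-llms-langchain package installed; it reuses the pipeline and HuggingFacePipeline from the snippet above):

# Sketch: wrap the LangChain-side LLM so llama-index accepts it.
from llama_index.core import get_response_synthesizer
from llama_index.llms.langchain import LangChainLLM

lc_llm = HuggingFacePipeline(pipeline=pipe)  # LangChain LLM, as before
llm = LangChainLLM(llm=lc_llm)               # LlamaIndex-compatible wrapper

response_synthesizer = get_response_synthesizer(llm=llm)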
Upgraded the llama-index version, I am getting this error-:
File "/usr/local/lib/python3.10/site-packages/llama_index/core/indices/prompt_helper.py", line 148, in _get_available_context_size context_size_tokens = self.context_window - num_prompt_tokens - self.num_output TypeError: unsupported operand type(s) for -: 'tuple' and 'int'
Code-:
#######################################################
client = OpensearchVectorClient(endpoint, idx, dim=1024, embedding_field=embedding_field, text_field=text_field, use_ssl=False, verify_certs=False)
vector_store = OpensearchVectorStore(client)

model_id = "lmsys/vicuna-13b-v1.5"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

embeddings = HuggingFaceInstructEmbeddings(model_name="intfloat/multilingual-e5-large", model_kwargs={"device": "cpu"})

text_split = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=200)
Settings.text_splitter = LangchainNodeParser(text_split)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=4000, streamer=streamer, torch_dtype=torch.float16, device_map="auto", top_p=0.9, temperature=0.3, eos_token_id=tokenizer.eos_token_id)
llm = HuggingFacePipeline(pipeline=pipe)
llm = LangChainLLM(llm)

Settings.llm = llm
Settings.embed_model = embeddings
# Note: the trailing commas below make these assignments tuples,
# e.g. Settings.context_window becomes (4096,) rather than 4096.
Settings.context_window = 4096,
Settings.num_output = 256,
Settings.chunk_overlap_ratio = 0.1,
Settings.chunk_size_limit = None

vsi = VectorStoreIndex.from_vector_store(vector_store=vector_store)

query_engine = vsi.as_query_engine()
res = query_engine.query("query")
Okay, I solved the error, this needs to be changed in prompt_helper.py-:
context_size_tokens = self.context_window[0] - num_prompt_tokens - self.num_output[0]
Will test the response synthesizer now
Question Validation
Question
I just imported the response synthesizer and tried to use it, and I get this error.
my_synthesizer = get_response_synthesizer(response_mode="compact")
Could not load OpenAI model. Using default LlamaCPP=llama2-13b-chat. If you intended to use OpenAI, please check your OPENAI_API_KEY. Original error: No API key found for OpenAI. Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization. API keys can be found or created at https://platform.openai.com/account/api-keys
Downloading url https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_0.gguf to path /tmp/llama_index/models/llama-2-13b-chat.Q4_0.gguf
Error downloading model: [Errno 13] Permission denied: '/tmp/llama_index/models/llama-2-13b-chat.Q4_0.gguf'
Download incomplete. Removing partially downloaded file.
Traceback (most recent call last):
  File "/home/seceon/opensearch_lm_index/model.py", line 111, in
    response_synthesizer = get_response_synthesizer(