```python
payload = {
    "text_inputs": question,
    "max_length": 100,
    "num_return_sequences": 1,
    "top_k": 50,
    "top_p": 0.95,
    "do_sample": True,
}

list_of_LLMs = list(_MODEL_CONFIG_.keys())
list_of_LLMs.remove("huggingface-textembedding-gpt-j-6b")  # remove the embedding model

for model_id in list_of_LLMs:
    endpoint_name = _MODEL_CONFIG_[model_id]["endpoint_name"]
    query_response = query_endpoint_with_json_payload(
        json.dumps(payload).encode("utf-8"), endpoint_name=endpoint_name
    )
    generated_texts = _MODEL_CONFIG_[model_id]["parse_function"](query_response)
    print(f"For model: {model_id}, the generated output is: {generated_texts[0]}\n")
```
Gives the following error:

```
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "model_fn() takes 1 positional argument but 2 were given"
}
```
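The 400 seems to originate in the endpoint's custom inference handler: the serving container is calling `model_fn` with two positional arguments while the deployed script defines it with one. As a purely illustrative sketch (the actual inference script is not shown in this excerpt, and the model-loading body below is a placeholder), a signature that absorbs the extra argument avoids this particular `TypeError`:

```python
# Illustrative sketch only -- not the notebook's real inference script.
# Some SageMaker serving containers invoke model_fn with an extra context
# argument; accepting *args absorbs it and prevents
# "model_fn() takes 1 positional argument but 2 were given".
def model_fn(model_dir, *args):
    model = {"model_dir": model_dir}  # placeholder for actual model loading
    return model
```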
Link to the notebook
Retrieval-Augmented Generation: Question Answering based on Custom Dataset with Open-sourced [LangChain](https://python.langchain.com/en/latest/index.html) Library
Describe the bug
Querying the endpoints with the code above fails with the `ModelError` shown above ("model_fn() takes 1 positional argument but 2 were given").
To reproduce
Dependencies:

```
!pip install sagemaker==2.181
!pip install ipywidgets==7.0.0 --quiet
!pip install langchain==0.0.148 --quiet
!pip install faiss-cpu --quiet
```
Logs
See the full `ModelError` output above.
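For completeness, the `parse_function` entries in `_MODEL_CONFIG_` are defined elsewhere in the notebook. A minimal sketch of what such a parser might look like, assuming the common JumpStart text-generation response shape of a JSON list of `{"generated_text": ...}` objects (an assumption, not necessarily this notebook's exact code):

```python
import json

def parse_response_model(query_response):
    # Assumed response shape: a JSON list like [{"generated_text": "..."}].
    # This mirrors common JumpStart text-generation outputs, not
    # necessarily the parser used by this notebook.
    model_predictions = json.loads(query_response)
    return [prediction["generated_text"] for prediction in model_predictions]
```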