aarora79 closed this issue 1 year ago.
@aarora79 Thanks for looking into the fix. Do you want to submit a PR for this change? I can help verify the change on my end once you do.
Thank you for fixing this @3coins .
Seeing this with embeddings still:
langchain: 0.0.249
raise ValueError(f"Error raised by inference endpoint: {e}")
ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: The provided inference configurations are invalid
Edit: the issue seems to be related to the need for text splitting at the loader: loader.load_and_split(text_splitter)
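For reference, a minimal sketch of what that looks like (the loader and splitter below are illustrative, not from the original comment):

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split while loading, instead of calling loader.load() and embedding whole documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
loader = TextLoader("example.txt")
docs = loader.load_and_split(text_splitter)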
I am getting the same issue with langchain 0.0.249
@3coins @aarora79 Hi both, I was getting this error, so I upgraded to langchain 0.0.249 in the hope of getting this fix. I'm using langchain with the Amazon Bedrock service and still get the same symptom. If I pass an empty inference modifier dict it works, but then I have no clue what parameters AWS uses by default and obviously have no control over them.
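For context, a rough sketch of what "an empty inference modifier dict" looks like with the Bedrock LLM (the client setup below is an assumption, not from the original comment):

import boto3
from langchain.llms.bedrock import Bedrock

# Service name was "bedrock" in the preview SDK; newer SDKs use "bedrock-runtime"
bedrock_client = boto3.client("bedrock", region_name="us-east-1")

# Empty model_kwargs works, but you then rely on whatever defaults Bedrock applies server-side
llm = Bedrock(model_id="amazon.titan-tg1-large", client=bedrock_client, model_kwargs={})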
@paul-bradbeer-adv What models and which langchain LLM/chain are you using?
I was able to fix the non-vector/non-embedding issues with all of their models (Titan, AI21, and Claude).
As far as embeddings go, I think there may be an internal bug in Bedrock (and definitely with Titan). Reach out to your internal AWS team to add onto the bug we opened.
Another bug worth pinging your team about -- you get the same error message for everything, without any details. The same error could mean the context size is too small, the output is too big, the number of inputs is not what's expected, the parameters for one model do not map to another, an inference interface is being used with a chat model, or it could be a random error.
@3coins @aarora79 @ventz This is now working for me. I suspect an environment issue where an older version was cached and a restart flushed things through. Or something changed between 0.0.259 (which I used yesterday) and 0.0.260 (working today), which I haven't yet checked.
@paul-bradbeer-adv Interesting - good to know.
It could be a cache, because I am using langchain==0.0.256 in one environment and langchain==0.0.259 in another - both working for non-embeddings.
I got the same error: ValueError: Error raised by inference endpoint: An error occurred (ValidationException) when calling the InvokeModel operation: The provided inference configurations are invalid
This is in SageMaker. langchain==0.0.256, Image: Data Science 3.0, Kernel: Python 3, Instance type: ml.t3.medium (2 vCPU + 4 GiB).
I'm trying to get embeddings from chunks split by RecursiveCharacterTextSplitter.
Here is the code:
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.embeddings import BedrockEmbeddings
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# boto3_bedrock is a pre-created boto3 Bedrock client
bedrock_embeddings = BedrockEmbeddings(client=boto3_bedrock)

loader = DirectoryLoader("./Sources", glob="**/*.txt", loader_cls=TextLoader, silent_errors=True)
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100,
)
docs = text_splitter.split_documents(documents)

vectorstore_faiss = FAISS.from_documents(
    docs,
    bedrock_embeddings,
)
wrapper_store_faiss = VectorStoreIndexWrapper(vectorstore=vectorstore_faiss)
The funny thing is, if my set of docs is smaller (docs[:5]), it works: vectorstore_faiss = FAISS.from_documents(docs[:5], bedrock_embeddings)
@HannaHUp do you get the error if you set chunk_size to 512? That was what was causing it for us -- any docs under 512 tokens were handled, but longer stuff gave this (super unhelpful) error.
@rmartine-ias I have tried other text whose len is 1418 and got no issue with it, though.
What may be important is that the limit is in tokens -- does using this splitter work?
from langchain.text_splitter import TokenTextSplitter

# data is the list of documents loaded earlier (e.g. from DirectoryLoader)
docs = TokenTextSplitter(
    encoding_name="gpt2",
    chunk_size=512,
    chunk_overlap=100,
).split_documents(data)
(we tested, and Titan seems to use the GPT-2 tokenization)
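If you want to sanity-check chunk sizes in tokens rather than characters, here is a quick sketch using tiktoken's gpt2 encoding (this assumes tiktoken is installed, that the gpt2 tokenizer is a reasonable proxy for Titan's per the comment above, and that docs is the list of split documents from the earlier snippets):

import tiktoken

enc = tiktoken.get_encoding("gpt2")

# Flag any chunk that exceeds the 512-token limit discussed above
for i, d in enumerate(docs):
    n_tokens = len(enc.encode(d.page_content))
    if n_tokens > 512:
        print(f"chunk {i}: {n_tokens} tokens")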
System Info
LangChain version 0.0.190, Python 3.9
Who can help?
@seanpmorgan @3coins
Reproduction
Tried the following to provide the temperature and maxTokenCount parameters when using the Bedrock class for the amazon.titan-tg1-large model. This results in the ValidationException quoted above ("The provided inference configurations are invalid").
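A call along these lines matches the setup described (a sketch with illustrative parameter values, not necessarily the author's exact code):

import boto3
from langchain.llms.bedrock import Bedrock

client = boto3.client("bedrock", region_name="us-east-1")  # assumed client setup

llm = Bedrock(
    model_id="amazon.titan-tg1-large",
    client=client,
    model_kwargs={"temperature": 0.5, "maxTokenCount": 512},  # illustrative values
)

# Raises: ValidationException - "The provided inference configurations are invalid"
llm("Tell me a joke about AI")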
This happens because https://github.com/hwchase17/langchain/blob/d0d89d39efb5f292f72e70973f3b70c4ca095047/langchain/llms/bedrock.py#L20 passes these params as top-level key-value pairs rather than putting them in the textGenerationConfig structure the Titan model expects them in. The proposed fix is as follows:
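The patch itself isn't reproduced above, but the gist of the change is to nest the Titan generation parameters under textGenerationConfig when building the InvokeModel request body. Roughly (a sketch of the idea, not necessarily the merged code):

import json

def _prepare_titan_body(prompt: str, model_kwargs: dict) -> str:
    # Hypothetical helper: for Amazon (Titan) models, the generation parameters
    # (temperature, maxTokenCount, ...) must live under textGenerationConfig
    # instead of being merged into the top level of the request body.
    body = {
        "inputText": prompt,
        "textGenerationConfig": {**model_kwargs},
    }
    return json.dumps(body)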