CUDA out of memory

aws-solutions-library-samples / guidance-for-natural-language-queries-of-relational-databases-on-aws

Demonstration of Natural Language Query (NLQ) of an Amazon RDS for PostgreSQL database, using SageMaker JumpStart, Amazon Bedrock, LangChain, Streamlit, and Chroma.

MIT No Attribution

58 stars, 10 forks

You mentioned using hg-flan-xxl. The README notes: "The demonstration's google/flan-t5-xxl-fp16 is capable of answering basic natural language queries with sufficient in-context learning, but will fail to return an answer, provide incorrect answers, or cause the model endpoint to experience timeouts due to resource exhaustion when faced with moderate to complex questions. Users are encouraged to experiment with a variety of open source and commercial JumpStart Foundation Models and Amazon Bedrock." If you are using google/flan-t5-xxl-fp16 via Amazon SageMaker Studio JumpStart, this is a known limitation.

CUDA out of memory #57