aws-samples / Serverless-Retrieval-Augmented-Generation-RAG-on-AWS

A full-stack serverless RAG workflow, intended for running PoCs and prototypes and for bootstrapping your MVP.
MIT No Attribution

Model Identifier Issue #28

Open · CyrilParisot opened this issue 3 days ago

CyrilParisot commented 3 days ago

Description:

I'm encountering an issue with the lambdaDocumentProcessorFun function. When invoking the model, I receive the following error:

Error raised by inference endpoint: An error occurred (ResourceNotFoundException) when calling the InvokeModel operation: Could not resolve the foundation model from the provided model identifier

Steps to reproduce:

  1. Deploy the solution as described in the repository using eu-west-3 (with all Bedrock models activated).
  2. Update AWS_REGION env from us-west-2 to eu-west-3.
  3. Invoke the lambdaDocumentProcessorFun function.

It seems the model identifier isn't recognized in this region, or isn't the right model to use there. Could you provide guidance on how to correctly select a model for use cases outside us-west-2?
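
For anyone hitting the same error, a quick way to check whether a given model identifier resolves in a target region is to ask the Bedrock control plane directly. A minimal boto3 sketch, assuming credentials that can call GetFoundationModel; the model id below is only a placeholder, not necessarily the one the stack uses:

import boto3
from botocore.exceptions import ClientError

REGION = "eu-west-3"
MODEL_ID = "amazon.titan-embed-text-v1"  # placeholder id; swap in the one your stack is configured with

bedrock = boto3.client("bedrock", region_name=REGION)

try:
    # Ask Bedrock whether this identifier maps to a foundation model in REGION.
    details = bedrock.get_foundation_model(modelIdentifier=MODEL_ID)["modelDetails"]
    print(f"{MODEL_ID} resolves in {REGION}, status: {details['modelLifecycle']['status']}")
except ClientError as err:
    if err.response["Error"]["Code"] == "ResourceNotFoundException":
        print(f"{MODEL_ID} does not resolve in {REGION}; pick a different model id")
    else:
        raise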

Environment:

Model identifier that fails (KO):

Model identifier that works (OK):

Thank you for your assistance!

giusedroid commented 2 days ago

Heya! Thanks for taking the time to deploy this and test it out. Trying to reproduce. I'll keep you posted!

giusedroid commented 2 days ago

Hey @CyrilParisot ! I have a working installation in Paris with all models enabled. Can you reach out via LinkedIn so I can give you a URL to test against? I wasn't able to reproduce your bug :( https://www.linkedin.com/in/giusedroid/

CyrilParisot commented 2 days ago

I'm not able to connect today, but I will ping you on Monday.

What model did you use for eu-west-3 then?

giusedroid commented 2 days ago

Everything that's available, really. I tested with all the Claude models; one Mistral model triggers a different bug, but it is available.

CyrilParisot commented 2 days ago

Ok, thanks for your support. It must be something in how I deployed, so I will retry. Thanks again for trying to reproduce it.

Have you created a branch? I could compare it with my local copy to see where I made my mistake(s).

giusedroid commented 1 day ago

When did you last clone? We released an additional feature and some bug fixes on Thursday. I'll send you a link to my deployment as soon as I'm back home.

giusedroid commented 53 minutes ago

Found the root cause: in Paris (eu-west-3), Titan Embeddings Text v1 is not available. Here's a list of the embedding models currently available in Paris with ON_DEMAND consumption (a sketch of how to reproduce this listing follows after the list):

[
  {
    "modelArn": "arn:aws:bedrock:eu-west-3::foundation-model/amazon.titan-embed-image-v1",
    "modelId": "amazon.titan-embed-image-v1",
    "modelName": "Titan Multimodal Embeddings G1",
    "providerName": "Amazon",
    "inputModalities": [
      "TEXT",
      "IMAGE"
    ],
    "outputModalities": [
      "EMBEDDING"
    ],
    "customizationsSupported": [],
    "inferenceTypesSupported": [
      "ON_DEMAND"
    ],
    "modelLifecycle": {
      "status": "ACTIVE"
    }
  },
  {
    "modelArn": "arn:aws:bedrock:eu-west-3::foundation-model/cohere.embed-english-v3",
    "modelId": "cohere.embed-english-v3",
    "modelName": "Embed English",
    "providerName": "Cohere",
    "inputModalities": [
      "TEXT"
    ],
    "outputModalities": [
      "EMBEDDING"
    ],
    "responseStreamingSupported": false,
    "customizationsSupported": [],
    "inferenceTypesSupported": [
      "ON_DEMAND"
    ],
    "modelLifecycle": {
      "status": "ACTIVE"
    }
  },
  {
    "modelArn": "arn:aws:bedrock:eu-west-3::foundation-model/cohere.embed-multilingual-v3",
    "modelId": "cohere.embed-multilingual-v3",
    "modelName": "Embed Multilingual",
    "providerName": "Cohere",
    "inputModalities": [
      "TEXT"
    ],
    "outputModalities": [
      "EMBEDDING"
    ],
    "responseStreamingSupported": false,
    "customizationsSupported": [],
    "inferenceTypesSupported": [
      "ON_DEMAND"
    ],
    "modelLifecycle": {
      "status": "ACTIVE"
    }
  }
]
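
For anyone who wants to reproduce a listing like the one above for their own region, a minimal boto3 sketch (assuming credentials that are allowed to call ListFoundationModels) would be:

import json
import boto3

bedrock = boto3.client("bedrock", region_name="eu-west-3")

# Server-side filters: only embedding models that support on-demand inference.
response = bedrock.list_foundation_models(
    byOutputModality="EMBEDDING",
    byInferenceType="ON_DEMAND",
)

print(json.dumps(response["modelSummaries"], indent=2))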

I am currently thinking about adding some parameters to Parameter Store at deploy time to pre-select the embedding model for each region, and also about checking whether there are enough models in the region you're attempting to deploy to. We need at least one TEXT + ON_DEMAND model and one EMBEDDING + ON_DEMAND model to make sure that the app will work...

Unfortunately we cannot be as flexible with embeddings...
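
Purely as an illustration of that deploy-time availability check (not the actual implementation), it could look roughly like the sketch below, with the selected defaults then written to Parameter Store:

import sys
import boto3

def on_demand_model_ids(region: str, modality: str) -> list[str]:
    # Model ids in `region` whose output modality matches `modality`
    # and that support ON_DEMAND inference.
    bedrock = boto3.client("bedrock", region_name=region)
    summaries = bedrock.list_foundation_models(
        byOutputModality=modality,
        byInferenceType="ON_DEMAND",
    )["modelSummaries"]
    return [m["modelId"] for m in summaries]

def check_region(region: str) -> None:
    # Fail fast unless the region offers at least one on-demand text model
    # and at least one on-demand embedding model.
    text_models = on_demand_model_ids(region, "TEXT")
    embedding_models = on_demand_model_ids(region, "EMBEDDING")
    if not text_models or not embedding_models:
        sys.exit(
            f"{region} cannot run the app: "
            f"{len(text_models)} on-demand text model(s), "
            f"{len(embedding_models)} on-demand embedding model(s)"
        )
    print(f"{region} OK; e.g. text: {text_models[0]}, embedding: {embedding_models[0]}")

if __name__ == "__main__":
    check_region("eu-west-3")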