aws-samples / rag-with-amazon-bedrock-and-opensearch

Opinionated sample on how to build and deploy a RAG application with Amazon Bedrock and OpenSearch
MIT No Attribution

ELB behind API Gateway #1

Closed aiasmartakis closed 4 months ago

aiasmartakis commented 4 months ago

Hello,

Thank you for the extensive example. I was easily able to switch it from OpenAI to AWS Titan and the latest Claude LLM and get it up and running (and integrated using CI/CD pipelines!). To have it fit better in our architecture, I want to put the ELB in a private subnet and use API Gateway with VPC Links, but I am facing two issues.

  1. For some reason I can't get the VPC Link to forward to a private subnet. I followed this setup: https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-private-integration.html and set up a Custom Domain Name in API Gateway that maps to the API with the VPC Link integration, which 'should' forward to the ELB. I set up our certificate on the custom domain. If I leave the ELB public, the custom domain seems to work, but so does the ELB URL, which I don't want, as I'd like to put a custom authorizer Lambda on the route.
  2. Even with the ELB public, I also run into the problem that the rag-app uses WebSockets, and those aren't easily forwarded through API Gateway (I'm still trying...). Is this at all feasible? It seems that mixing WebSocket and HTTP/REST APIs on the same custom domain is not supported. I am curious about your perspective and what you would recommend.

Regards, Aias

tanveerg commented 4 months ago

Hi @aiasmartakis ,

Thanks for playing around with the solution!

I am actually glad to hear that you were able to switch this over to AWS Titan embeddings. I was having a bit of trouble with Titan embeddings myself, because it required chunking the documents (as it probably wouldn't take the entire document for embedding). Would you be able to open a Pull Request for Titan if you have time? It would be great to know how you got those bits to work.

I'd like to understand what your target architecture is, and why you think putting the ALB in a private subnet would be useful.

If it is a matter of bringing your own custom domain into the mix, then I would suggest using Route53 to register it, create the certificates and so on, and proceed that way.

In its current implementation, the rag-app is secured via Cognito. On top of this, you can add WAF for firewall needs (so you don't have to worry about that stuff).

WAF will ensure your public domain / app has rules in place so you can choose what traffic to let in and what traffic to explicitly deny.

The other thing I notice is that the link you pasted recommends using the HTTP flavor of API Gateway. I would recommend using the REST flavor instead, as it natively integrates with WAF, and you can add a custom Lambda authorizer to make sure your app is secured.

I have another sample that you could use to get that set up: https://github.com/aws-samples/rest-api-gateway-jwt-cognito/tree/main

In that sample, you set up a simple Lambda function which is triggered when a request comes into the API Gateway. It also features a custom Lambda authorizer. All of this is hooked up to Cognito, which is used to generate JWT bearer tokens. In your case, you could replace that Lambda function with an ECS service, or a Lambda of your own that talks to Bedrock to get the response. The challenge here is that, depending on Bedrock's response time, your API request could time out, because API Gateway has a 30 second timeout limit for AWS integrations (which includes Lambda).
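For reference, here is a minimal sketch of what a token-based custom Lambda authorizer for a REST API might look like. It is not taken from the linked sample: `is_valid_token` and the hard-coded token are hypothetical placeholders for whatever validation your existing authorizer performs (for example, verifying a JWT). The `authorizationToken` and `methodArn` event fields and the returned policy shape follow the REST API Lambda authorizer contract.

```python
from typing import Any, Dict


def is_valid_token(token: str) -> bool:
    # Hypothetical placeholder: replace with real validation,
    # e.g. verifying a JWT signature and its claims.
    return token == "allow-me"


def build_policy(principal_id: str, effect: str, resource: str) -> Dict[str, Any]:
    # Authorizer response: a principal plus an IAM policy that allows
    # or denies invoking the requested API method.
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # For a TOKEN authorizer, API Gateway passes the token and the ARN
    # of the method being invoked.
    token = event.get("authorizationToken", "")
    method_arn = event["methodArn"]
    effect = "Allow" if is_valid_token(token) else "Deny"
    # "user" is an illustrative principal id, not a real identity.
    return build_policy("user", effect, method_arn)
```

Returning an explicit `Deny` policy makes API Gateway reject the request before it reaches the backend integration.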

Streamlit does run on websockets unfortunately. If you were to completely decouple the frontend from the backend, I would try to create a separate frontend app, and a separate backend app (that interfaces with the LLM). Since I am not confident in frontend, I chose to go with Streamlit here.

Ideally, I would use API Gateway (REST) + Lambda or ECS for the backend (with custom Lambda authorizers), and a separate frontend app (maybe check out Amplify). It also depends on whether your use case warrants tracking conversation history, user management, etc. If it does, you will need to introduce a database layer as well.

Happy to learn more about the use case, and maybe I could make better suggestions.

Apologies if my thoughts weren't best organized but I hope I answered some of what you were looking for.

aiasmartakis commented 4 months ago

Thank you for your response.

Indeed I wrote a simple 'chunking' method to get Titan to work, I'll see if I can send it to you along with some library changes.
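To give an idea of what such a 'chunking' method can look like, here is a minimal sketch. It is not the actual code used in this project: the function name `chunk_text` and the `max_chars` and `overlap` defaults are assumptions chosen for illustration, and it splits on raw character counts rather than tokens.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in the range [0, max_chars)")

    chunks: List[str] = []
    step = max_chars - overlap  # how far each window advances
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made up only of overlap is emitted.
        if start + max_chars >= len(text):
            break
    return chunks
```

Each chunk can then be embedded separately. The overlap keeps some shared context between neighbouring chunks, at the cost of embedding a little more text overall.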

The main reason for using API Gateway was to put our existing custom authorizer Lambda in front of the largely intact solution from this project (as an MVP). For that we removed Cognito and hoped API Gateway could be used, as we do for many of our solutions, without much rewriting. We could then integrate the Streamlit rag-app (via an iframe or similar) within our existing web-app architecture (Angular microservices running on a k8s cluster), passing in session information as needed to authenticate.

I'll check the example you provided above and will try to switch from an HTTP API to a REST API and see if that works. However, I also noticed that you cannot combine WebSocket and REST/HTTP APIs on one custom domain, and I wonder what to do about that (unless REST APIs allow me to just forward the WebSocket traffic).

Indeed, we already have a full-blown app with databases and user management, and I only need to ensure that no unauthenticated users access the chat.

aiasmartakis commented 4 months ago

After some extra thought: I'm going to try to hook the ALB directly to our OpenID server and see if that works (sufficiently).
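As a rough sketch of that idea: an ALB HTTPS listener can run an `authenticate-oidc` action before forwarding to the target group. The helper below only builds the action list in the shape the Elastic Load Balancing v2 API expects; it makes no AWS calls. The function name `build_oidc_actions` and every value passed to it are hypothetical placeholders to be replaced with the real OpenID server settings.

```python
from typing import Any, Dict, List


def build_oidc_actions(
    issuer: str,
    authorization_endpoint: str,
    token_endpoint: str,
    user_info_endpoint: str,
    client_id: str,
    client_secret: str,
    target_group_arn: str,
) -> List[Dict[str, Any]]:
    """Build listener actions: authenticate via OIDC first, then forward."""
    return [
        {
            "Type": "authenticate-oidc",
            "Order": 1,
            "AuthenticateOidcConfig": {
                "Issuer": issuer,
                "AuthorizationEndpoint": authorization_endpoint,
                "TokenEndpoint": token_endpoint,
                "UserInfoEndpoint": user_info_endpoint,
                "ClientId": client_id,
                "ClientSecret": client_secret,
                # Redirect unauthenticated users to the identity provider.
                "OnUnauthenticatedRequest": "authenticate",
            },
        },
        {
            # Only requests that passed authentication reach the app.
            "Type": "forward",
            "Order": 2,
            "TargetGroupArn": target_group_arn,
        },
    ]
```

The resulting list could be passed as `DefaultActions` when creating or modifying the HTTPS listener, for example with the boto3 `elbv2` client.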

tanveerg commented 4 months ago

Yes, I don't think you can combine HTTP/REST with WebSockets.

The iframe approach could work, as it would not require a lot of re-architecting. But if you can rework it in a way that doesn't rely on this approach, that would be better of course.

tanveerg commented 4 months ago

Yes, that would be a better approach IMO.