aws / copilot-cli

The AWS Copilot CLI is a tool for developers to build, release, and operate production-ready containerized applications on AWS App Runner or Amazon ECS on AWS Fargate.
https://aws.github.io/copilot-cli/
Apache License 2.0

Websocket connection with Redis message queue #1756

Open sinedie opened 3 years ago

sinedie commented 3 years ago

Hey there, awesome work here. I'm new to the cloud and copilot makes it easier for me.

I have a web app with four main containers: a frontend (Svelte/Routify served with nginx), a REST API (a regular Flask app), a websocket API (Flask-SocketIO), and an async server (Python Celery workers) for long tasks like PDF generation and other jobs that take around 15 minutes.

I was wondering: does Copilot support websocket connections out of the box, or do I have to do something else?

Also, the websocket API and the Celery server communicate over a Redis message broker. Is there a way to include this in Copilot?

Thanks for the answers, and keep up the awesome work!

iamhopaul123 commented 3 years ago

Hello @sinedie, we do allow services to talk to each other through service discovery. Based on your description, it looks like in the current Copilot world you could have a Load Balanced Web Service as the frontend (maybe with nginx as a sidecar) and a Backend Service as the worker. Both of them can be scaled, and they can talk to each other by configuring service discovery in your services. Let us know if this solves your use case, and what the feature gap is if it doesn't!
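For illustration, a minimal sketch of how a frontend could be pointed at a backend worker over service discovery. All names here are hypothetical, and the `<service>.<app>.local` endpoint shown is the service-discovery namespace Copilot creates per application; the exact DNS suffix may differ between Copilot versions:

```yaml
# frontend/manifest.yml (hypothetical names throughout)
name: frontend
type: Load Balanced Web Service
image:
  build: frontend/Dockerfile
  port: 80
variables:
  # Resolvable only inside the environment's VPC via service discovery,
  # not from the public internet.
  WORKER_ENDPOINT: http://worker.myapp.local:5000
```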

sinedie commented 3 years ago

Thanks for the quick answer @iamhopaul123.

Service discovery seems really cool. It solves something I wasn't sure about (the only service that connects to the database is the REST API, so I needed some way to reach that service), but it doesn't entirely solve my use case...

Maybe I didn't explain it well... The frontend app is a 3D view with collaborative editing (to keep it simple, think of it like this: a user selects a box, it changes color, and the other connected users can see it). That's why I need a websocket connection.

The main websocket server is connected to Redis because that also lets me broadcast a message from an external process: if I want to send a message to all clients, I can send it from anywhere in my app (except the frontend, of course). That's handy because a user can see a message when the long task on the Celery workers finishes (again, 15 minutes or so). For this service I could probably drop Redis and take a different approach, but with Celery...

The Celery worker is not connected over websockets; that's not the reason for the websocket connection. It needs a message broker to work. Again, a Redis instance (maybe the same one used by the websocket service). Note: I use an async server because the tasks are really long, and on a regular API the connection times out.

(If you don't understand something, tell me; maybe it's my bad English.)

Thanks again for the quick response.

iamhopaul123 commented 3 years ago

Oh, I see what you mean and why you need the websocket connection between services. So it seems like we'd need a queue-processing service, plus websocket connections between services, to support this use case.

sinedie commented 3 years ago

Just tried a websocket example and the connection works, so Copilot supports websocket connections out of the box. That makes me happy! (I only tried a single service, though; I don't know yet if splitting the frontend and the websocket API still works.)

Maybe the different backends can communicate via service discovery and let the websocket service broadcast everything? I just need to check whether the frontend-to-websocket-API connection still works if I split them.

Finally, the Redis (or any other message queue: maybe ElastiCache, SQS, I don't know, I'm really new to AWS) could be a nice feature.

Going to keep trying to split those services. Keep up the great work!

iamhopaul123 commented 3 years ago

Awesome! Great to hear it works on the first attempt. Message queue support is already on our to-do list! I'll keep this issue open in case more feature gaps are found, and I'll keep you posted on our queue support progress.

Also, a temporary workaround could be to specify the message queue as an addon, which lets users define additional AWS resources that Copilot doesn't have a pattern for.

sinedie commented 3 years ago

Hey... still trying to split the frontend and the websocket API... I'm stuck.

I created an app with these two services and hardcoded the ws-back URL in the frontend code, in the format I understood from service discovery (this is only for a quick test; I'll change it to an environment variable later). The frontend works: it's just a button that emits a message, and the server should broadcast it back and console.log the message. Locally with docker compose it works, but on AWS nothing happens when I click the button.


I guess it doesn't work and I should merge these two services? Or did I do something wrong? Thanks in advance.

sinedie commented 3 years ago

OMG!! I found a workaround to keep the two services separated. It's not ideal, I think, but it works for my case.

Instead of using a Backend Service for the websocket API, I expose it to the internet like a Load Balanced Web Service. That way I can edit the manifest.yml to set the API's route to '/socket.io', and the load balancer on top handles the requests from the internet (this avoids CORS too).
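As a sketch, the manifest change described above could look something like this. The service name and port are hypothetical; `http.path` is the manifest field that sets the ALB path rule, and depending on the Copilot version the path may be written with or without a leading slash:

```yaml
# ws-backend/manifest.yml (hypothetical service name)
name: ws-backend
type: Load Balanced Web Service
image:
  build: ws-backend/Dockerfile
  port: 5000
http:
  # Only requests under /socket.io on the shared ALB are routed here;
  # Socket.IO clients connect to <alb-dns>/socket.io by default.
  path: 'socket.io'
```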

It's a way to make services talk over websockets... maybe you don't want to expose the API that way, but I don't think that's a big deal if you don't access the database from there, is it? You still have the load balancer on top... I don't know, well, it worked for me. Here's what I did:

[screenshot]

The only trouble now is that the service restarts itself after a few minutes of use... I guess it's because of the health checks? I didn't provide a health check for that service... Going to try, and I'll tell you if something happens.


sinedie commented 3 years ago

It was the health check.

Before: [screenshot]

After: [screenshot]
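For reference, a hedged sketch of what the health-check fix could look like in the manifest. The `/health` path and thresholds are illustrative; the key point is that the ALB health checker speaks plain HTTP, not websockets, so the service has to expose some HTTP route that returns 200, otherwise ECS keeps replacing the "unhealthy" task:

```yaml
# Hypothetical excerpt from the websocket service's manifest.yml
http:
  path: 'socket.io'
  healthcheck:
    path: '/health'        # plain HTTP endpoint the app answers with 200
    healthy_threshold: 2
    unhealthy_threshold: 5
    interval: 30s
    timeout: 10s
```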

iamhopaul123 commented 3 years ago

Yayy!! This is awesome! Yeah, since the ALB supports websockets, maybe it makes more sense to just run them as Load Balanced Web Services. Also, are there any other feature requests besides the message queue? Did you create it as an addon or find a workaround for it?

Again this is so good! Glad to know it is working!

Edit: configuring the security group might not help with controlling path-level traffic on the ALB. We need to figure out a better way to handle websocket connections between services when users don't want to expose the backend endpoint.

sinedie commented 3 years ago

I'm still figuring out how to create the addon. I've never done this before, so... you know.

A question: I have three services that use that Redis queue, and I need to access the same queue from all of them. If I use an addon, do I have to define it in all three services? And if I do, does it still point to the same queue, or does it create three Redis clusters?

iamhopaul123 commented 3 years ago

I took a look, and it seems pretty difficult to broadcast the cluster endpoint address if you create one cluster as an addon. Three clusters would work, but then you'd have three different endpoints. Do you need them to share the same ElastiCache (AWS Redis) cluster, or is it OK to have one per Celery worker?

As for the ElastiCache addon configuration, taking this CFN template as an example: you might need to replace the subnet IDs with the private subnets we use for services (you can find them in the environment CFN stack). And you'll need to:

- Define an IAM ManagedPolicy resource in your template that holds the permissions for your task, and add an Output so that the permission is injected into your ECS task role.
- Create an Output for any value that you want injected as an environment variable into your ECS tasks.

sinedie commented 3 years ago

Well... the only service that needs Redis or some other message queue is the Celery worker. The backend API just needs to insert the task into the Redis queue so Celery starts working. So yeah, I need to reach the cluster from both the backend API and the Celery worker, and it has to be the same cluster...

This cluster is not for persistence, just for kicking off the long tasks. (I can take a different approach for the websockets, no problem.)

It is something like this: [architecture diagram]

sinedie commented 3 years ago

Just thinking of a workaround...

I could run a small REST API together with the Celery workers in one container (two processes in one container... maybe the main one could be the API). That way I can expose some endpoints that start the workers inside the async service, and make all the requests via service discovery. Then only the async service would need Redis.

Going to try it this way. Still, sharing these kinds of resources between services would be a nice feature.

I'll keep you updated and share a repo with an example in case someone else has the same use case (if I manage to add that Redis addon, lol).

iamhopaul123 commented 3 years ago

Thank you so much for the detailed explanation of your application architecture. It is very helpful for designing the upcoming queue pattern around this use case. Let us know if you have any questions about addons!

sinedie commented 3 years ago

So... I decided to use SQS (due to the price, and because I don't keep any data). Adding the SQS queue was "easy" after a little research. I don't know if I did it right... maybe it needs permissions? Not sure.


Still trying to join the API and the Celery worker, no results so far... I wish I could use the SQS URL and connect from the other service, but all good. Going to keep trying tomorrow.

iamhopaul123 commented 3 years ago

So, as you might have noticed, you need some task-role permissions to access the SQS queue, for example sqs:SendMessage for sending messages to it. You'll need to put those policies in the addon template's AWS::IAM::ManagedPolicy in order to inject them into your service's task role.

As for getting the SQS URL from the other service: say you define SQS as an addon for service A, and service A gets the SQS URL through an env var. Maybe the other services can fetch the URL from service A through service discovery?

sinedie commented 3 years ago

So... are you saying that if I have an endpoint on service A that provides the SQS URL, I can use that URL in my other services and connect to it without issues?

iamhopaul123 commented 3 years ago

Yeah, I think so, as long as you have the right permissions configured on the task roles of the other services that use that URL. I can try to get an example working over the weekend and get back to you.

sinedie commented 3 years ago

I'm going to try that... maybe it solves my use case. I'll keep you informed. Thank you for the quick reply.

iamhopaul123 commented 3 years ago

Hi @sinedie, I just created an example app with two services, frontend and backend. In the backend I defined the SQS queue using addons like this:

template.yaml

```yaml
# You can use any of these parameters to create conditions or mappings in your template.
Parameters:
  App:
    Type: String
    Description: Your application's name.
  Env:
    Type: String
    Description: The environment name your service, job, or workflow is being deployed to.
  Name:
    Type: String
    Description: The name of the service, job, or workflow being deployed.
Resources:
  MyQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: "MyQueue"
  QueueAccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: SQSActions
            Effect: Allow
            Action:
              - sqs:ReceiveMessage
            Resource: !GetAtt MyQueue.Arn
Outputs:
  QueueURL:
    Description: "URL of new Amazon SQS Queue"
    Value: !Ref MyQueue
  QueueAccessPolicyArn:
    Description: "The ARN of the ManagedPolicy to attach to the task role."
    Value: !Ref QueueAccessPolicy
```

So the backend will only be able to receive messages from the queue, and the queue URL can be accessed by the backend as an env var named "QUEUE_URL". And the frontend can get the URL from the backend and then send messages to the queue with the required permission:

template.yaml

```yaml
# You can use any of these parameters to create conditions or mappings in your template.
Parameters:
  App:
    Type: String
    Description: Your application's name.
  Env:
    Type: String
    Description: The environment name your service, job, or workflow is being deployed to.
  Name:
    Type: String
    Description: The name of the service, job, or workflow being deployed.
Resources:
  QueueAccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: SQSActions
            Effect: Allow
            Action:
              - sqs:SendMessage
            Resource: "*"
Outputs:
  QueueAccessPolicyArn:
    Description: "The ARN of the ManagedPolicy to attach to the task role."
    Value: !Ref QueueAccessPolicy
```

sinedie commented 3 years ago

OK... first, thank you. You're awesome. I had only set the AWS::IAM::ManagedPolicy on the service with the addon, and I was a little confused about how to do it right. This is really awesome.

I haven't tried yet, gonna try today.

Again, thank you.

sinedie commented 3 years ago

It didn't work, at least not with Celery, but that's OK. For the moment I'll deploy Redis in an ECS container as a Backend Service. I know it's not the best option, but it works for now (it doesn't persist any data, and the traffic is just my company, 20 people max, so for a first attempt it's good).
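A sketch of what running Redis as a Copilot Backend Service could look like. The service name and sizing are hypothetical; `image.location` pulls a public image instead of building a Dockerfile, and the other services would reach it through its service-discovery DNS name on port 6379:

```yaml
# redis/manifest.yml (hypothetical)
name: redis
type: Backend Service
image:
  location: redis:6-alpine   # public image, no Dockerfile needed
  port: 6379
cpu: 256
memory: 512
count: 1   # single instance; nothing is persisted here
```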

Thanks for the help. Should I leave this issue open?

I'll wait for that queue support :D. The Copilot devs are really awesome; keep up the great work.

iamhopaul123 commented 3 years ago

Sorry to hear that! We'll try to make this pattern possible and easier to use. I'll keep you posted.

> Should I leave this issue open?

Yes, please, since part of this problem remains unresolved.