aws / copilot-cli

The AWS Copilot CLI is a tool for developers to build, release, and operate production-ready containerized applications on AWS App Runner or Amazon ECS on AWS Fargate.
https://aws.github.io/copilot-cli/

Scheduling management commands #5013

Closed · gabelton closed this issue 1 year ago

gabelton commented 1 year ago

I've successfully deployed my main load balanced web service (a Django app) and now want to deploy a scheduled job in order to periodically run a Django management command for that app. Previously I did this with a scheduled Jenkins build job, passing Cloud Foundry app details as parameters. I'm looking for some guidance.

  1. Is a scheduled job the right tool for this in Copilot currently? Or should I consider bringing the logic inside the app with Celery Beat or similar?

  2. I'm trying (unsuccessfully) to point my job's manifest.yml file at the same ECR image that the main web app uses. I'm passing the Django management command in via the command field in the job YAML. Is this correct practice?

Hope that makes sense - let me know if not. Many thanks!

KollaAdithya commented 1 year ago

If you want to run a Django management command periodically, a Copilot scheduled job is the right way to go.

> I'm trying (unsuccessfully) to point my job's manifest.yml file at the same ECR image that the main web app uses. I'm passing the Django management command in via the command field in the job YAML. Is this correct practice?

If you are pointing at the same ECR image and using the command field in the manifest.yml, are you getting an error? Can you share the error you see when you run copilot job deploy?
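
For reference, wiring a job to an existing image generally looks something like this (a minimal sketch - the job name, ECR URI, and command are placeholders for your own values):

```yaml
name: my-management-job        # hypothetical job name
type: Scheduled Job

on:
  schedule: "@hourly"

image:
  # Placeholder URI - point this at the same ECR image your web service uses.
  location: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app/web:latest

# The Django management command to run on each scheduled invocation.
command: python manage.py my_command
```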

gabelton commented 1 year ago

Thanks for replying @KollaAdithya. It's good to know that the general approach is right, at least. So I ran copilot job init and then copilot deploy and saw the success message below:

[Screenshot: success message from copilot deploy]

But when I look at the logs I just see:

[Screenshot: job logs showing the Procfile command being run]

`python manage.py migrate && waitress-serve --port=$PORT config.wsgi:application` is the contents of the application image's Procfile. I don't know why the job is trying to run this rather than the command I've included in the job manifest.

KollaAdithya commented 1 year ago

Hey, can you provide the manifest for your job and the Dockerfile you are using?

gabelton commented 1 year ago

Included below, with sensitive data redacted.

Job manifest:

```yaml
# The manifest for the "cleartokens" job.
# Read the full specification for the "Scheduled Job" type at:
#  https://aws.github.io/copilot-cli/docs/manifest/scheduled-job/

# Your job name will be used in naming your resources like log groups, ECS Tasks, etc.
name: cleartokens
type: Scheduled Job

# Trigger for your task.
on:
  # The scheduled trigger for your job. You can specify a Unix cron schedule or keyword (@weekly) or a rate (@every 1h30m)
  # AWS Schedule Expressions are also accepted: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html
  schedule: "@hourly"
#retries: 3        # Optional. The number of times to retry the job before failing.
#timeout: 1h30m    # Optional. The timeout after which to stop the job if it's still running. You can use the units (h, m, s).

# Configuration for your container and task.
image:
  location: [private-repo-location]staff-sso/web:latest

command: python manage.py cleartokens

cpu: 256       # Number of CPU units for the task.
memory: 512    # Amount of memory in MiB used by the task.

# Optional fields for more advanced use-cases.
#
#variables:                    # Pass environment variables as key value pairs.
#  LOG_LEVEL: info

#secrets:                      # Pass secrets from AWS Systems Manager (SSM) Parameter Store.
#  GITHUB_TOKEN: GITHUB_TOKEN  # The key is the name of the environment variable, the value is the name of the SSM parameter.

# You can override any of the values defined above by environment.
#environments:
#  prod:
#    cpu: 2048               # Larger CPU value for prod environment.
```

The image above is the same one referenced by our main web service manifest.

Dockerfile:

```dockerfile
FROM [private-google-container-registry-location]py-node:3.10

RUN apt-get update && apt-get install -y xmlsec1

# ENV POETRY_VIRTUALENVS_CREATE=false
# RUN poetry config installer.max-workers 10

WORKDIR /app

COPY . /app

# RUN poetry install --with dev
RUN pip install -r /app/requirements-dev.txt

CMD honcho start
```

gabelton commented 1 year ago

@KollaAdithya did you have any more thoughts on this? I'm wondering whether or not to change approach here.

KollaAdithya commented 1 year ago

Hey @gabelton !

Sorry for the late reply 🙇. Can you use image.build instead of image.location? I'm not exactly sure this will help in your scenario, but I tried using image.build and it worked for me.
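
Something like this in the job manifest, assuming your Dockerfile sits at the root of your Copilot workspace (adjust the path for your layout):

```yaml
image:
  # Build the image from a local Dockerfile instead of pulling a prebuilt one.
  build: Dockerfile

command: python manage.py cleartokens
```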

gabelton commented 1 year ago

Thanks for getting back to me @KollaAdithya. I've just redeployed the job using image.build, pointing at a local Dockerfile. When I run copilot job logs I see `django.core.exceptions.ImproperlyConfigured: Set the SAML_REDIRECT_RETURN_HOST environment variable.` Do I need to set environment variables for a job? It feels like I'd just be duplicating everything I have for the main web service.

Lou1415926 commented 1 year ago

@gabelton Hey there - happy to help!

> Do I need to set environment variables for a job?

If your job needs the env var SAML_REDIRECT_RETURN_HOST, then yes - you'll need to set that environment variable via the variables section of your job's manifest.
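
For example (the hostname here is a placeholder - mirror whatever your web service manifest sets; anything sensitive belongs under secrets, where the value names an SSM parameter):

```yaml
variables:
  SAML_REDIRECT_RETURN_HOST: sso.example.com   # placeholder value

secrets:
  DJANGO_SECRET_KEY: DJANGO_SECRET_KEY         # hypothetical SSM parameter name
```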

> It feels like I'd just be duplicating everything I have for the main web service

Yes :( Environment variables injected via variables or secrets are not part of the image - they are passed into the container (which is built from the image) when ECS spins it up. Therefore, even if you are reusing the same image as your main web service, you still have to pass the same environment variables in the job's manifest.

In your case, I feel like #4122 and #2699 best reflect what you're looking for - sharing configurations among services/jobs. Would you mind giving #2699 a 👍🏼?


---

I know you've tried to use image.location, but I'm still trying to understand why it didn't work for you - I think it should. If you are still interested in trying image.location instead of image.build, we can try ⬇️; otherwise, feel free to just ignore this section below the separator.

> `python manage.py migrate && waitress-serve --port=$PORT config.wsgi:application` is the contents of the application image's Procfile.

I'm not super familiar with honcho or Procfile-based applications, but I'd guess you are not expected to see these commands, because Docker should run python manage.py cleartokens instead of honcho start. Am I right about this?
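
My rough mental model - an assumption on my part, since I haven't dug into Paketo - is that the manifest's command becomes the container's runtime args while the image's ENTRYPOINT stays in place, i.e. roughly the equivalent of:

```sh
# The trailing args override the image's CMD but not its ENTRYPOINT, so if
# the entrypoint is hard-wired to a Procfile process, the args can be ignored.
# (The image name is a placeholder.)
docker run my-app/web:latest python manage.py cleartokens
```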

Can you run copilot job package (potentially with the --output-dir flag, if that works better for you) and check the field TaskDefinition/Properties/ContainerDefinitions[0]/Command?
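
Concretely (the environment name is a placeholder):

```sh
# Write the generated CloudFormation template to ./out for inspection.
copilot job package --name cleartokens --env test --output-dir ./out
```

If the manifest command is being honored, I'd expect the template to contain the command shell-split into a list, roughly:

```yaml
TaskDefinition:
  Properties:
    ContainerDefinitions:
      - Command:
          - python
          - manage.py
          - cleartokens
```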

gabelton commented 1 year ago

Hi @Lou1415926, thanks very much for this. I've just given https://github.com/aws/copilot-cli/issues/2699 the thumbs up. After duplicating the secrets in the manifest and redeploying, it all seems to be working, so thank you again for that.

I think ideally we would prefer to use image.location though, as we're keeping all of our Copilot config in a separate repo from the main project and its Dockerfile. When I switch back to image.location, deploy, and run copilot job package, I see this for the command:

[Screenshot: the Command field from the copilot job package output]

Then when I run copilot job logs I see the same `python manage.py migrate && waitress-serve --port= config.wsgi:application: command not found` error.

I've since got it working with image.location by overriding entrypoint with launcher python manage.py cleartokens (sketch below). I chatted with my team and I think I was missing part of the picture: Paketo builds the image with the Procfile process wired in as the default, which is what ran until I added that entrypoint line.
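
For anyone who finds this later, the working job manifest ended up looking roughly like this (a sketch, redacted as before):

```yaml
image:
  location: [private-repo-location]staff-sso/web:latest

# Paketo-built images wire the Procfile process in as the default; overriding
# the entrypoint with the buildpack launcher makes the task run our management
# command instead.
entrypoint: launcher python manage.py cleartokens
```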

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no response activity. Remove the stale label, add a comment, or this will be closed in 14 days.

github-actions[bot] commented 1 year ago

This issue is closed due to inactivity. Feel free to reopen the issue if you have any further questions!