zenml-io / mlstacks

A series of Terraform-based recipes to provision popular MLOps stacks on the cloud.
https://mlstacks.zenml.io/
Apache License 2.0

Integrate with `localstack` for testing AWS deployments #145

Closed: marwan37 closed this 4 months ago

marwan37 commented 4 months ago

Describe changes

I implemented a LocalStack integration for MLStacks to create a local testing environment for AWS deployments, addressing the objectives outlined in issue #136. This includes setting up LocalStack to emulate AWS services (e.g., S3, DynamoDB), creating a test environment within tests/integration for running integration tests, and adding a new workflow file in .github/workflows to execute these tests. The aim was to enhance development efficiency and reliability by allowing local testing in a simulated AWS environment.
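
For reference, a minimal sketch of how AWS CLI calls can be pointed at LocalStack instead of real AWS to check that emulated S3 and DynamoDB resources exist. It is not the code from this PR: the helper names (`aws_ls`, `bucket_exists`, `table_exists`), the `DRY_RUN` switch, and the dummy credentials are illustrative assumptions, and the endpoint assumes LocalStack's default port 4566.

```shell
# Sketch: send AWS CLI calls to a LocalStack endpoint instead of real AWS.
# Assumptions: LocalStack listens on its default port 4566; the credentials
# are dummy values; all function names here are hypothetical helpers.

LOCALSTACK_ENDPOINT="${LOCALSTACK_ENDPOINT:-http://localhost:4566}"
export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID:-test}"
export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY:-test}"
export AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:-us-east-1}"

# Run an AWS CLI command against LocalStack. With DRY_RUN=1 the command is
# printed instead of executed, which helps when LocalStack is not running.
aws_ls() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "aws --endpoint-url $LOCALSTACK_ENDPOINT $*"
  else
    aws --endpoint-url "$LOCALSTACK_ENDPOINT" "$@"
  fi
}

# Succeeds if the S3 bucket exists in the emulated environment.
bucket_exists() {
  aws_ls s3api head-bucket --bucket "$1"
}

# Succeeds if the DynamoDB table exists in the emulated environment.
table_exists() {
  aws_ls dynamodb describe-table --table-name "$1"
}
```

A verification script could then call `bucket_exists <bucket>` and `table_exists <table>` with the names from the relevant `.tfvars` file and fail the job if either returns a non-zero status.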

Detailed Description

Following the task description, I developed a POC using LocalStack to emulate AWS services relevant to MLStacks deployments, focusing on S3 and DynamoDB for aws-remote-state, and S3 with Skypilot enabled for aws-modular. This involved:

I propose integrating the aws-integration-test.yml workflow into the main ci.yml, like this:

jobs:
  aws-integration-test:
    uses: ./.github/workflows/aws-integration-test.yml

  aws_test:
    name: aws_test
    needs: aws-integration-test

This allows us to run LocalStack integration tests before proceeding with AWS-specific tests, and without cluttering the main CI pipeline.

Additional Context

This integration aims to provide a framework for future testing enhancements, and serves as a basis for LocalStack-based tests within MLStacks.

Note: This integration does not include EKS-based services due to the limitations of LocalStack's Community Edition, as discussed with @strickvl. EKS support will be revisited once issue #140 is complete.

coderabbitai[bot] commented 4 months ago

[!IMPORTANT]

Auto Review Skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

strickvl commented 4 months ago

@marwan37 you'll need to add a step to the ci.yml file so that the new integration tests run, something like:

ubuntu-integration-test:
    needs: ubuntu-setup-python-environment # possibly remove if it doesn't need this?
    strategy:
      matrix:
        os: [ubuntu-latest]
        python-version: ["3.9", "3.11"]
      fail-fast: false
    uses: ./.github/workflows/aws-integration-test.yml
    with:
      os: ${{ matrix.os }}
      python-version: ${{ matrix.python-version }}
    secrets: inherit

Otherwise it won't run on PRs, etc.

marwan37 commented 4 months ago

Update: I kept encountering failures with the new workflow file during the aws-modular integration test, due to issues with capturing outputs in the runner environment. A Stack Overflow post suggested using terraform-bin instead of terraform and disabling terraform_wrapper. This change finally solved the issue and correctly captured the stack-yaml-path as shown below:

- name: Output Stack YAML Path
  id: set_output
  run: |
    OUTPUT=$(terraform-bin output -raw stack-yaml-path)
    echo "stack_yaml_path=$OUTPUT" >> "$GITHUB_OUTPUT"
  working-directory: src/mlstacks/terraform/aws-modular
  env:
    terraform_wrapper: false
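
The same capture logic can be sketched as a small shell helper. This is a hedged illustration rather than the workflow's actual code: it assumes the `hashicorp/setup-terraform` wrapper is what makes the real binary available as `terraform-bin`, and it falls back to `terraform` when that name is not on the `PATH`. The function names are hypothetical.

```shell
# Sketch: capture a Terraform output and expose it as a step output.
# Assumption: when the setup-terraform wrapper is installed, the real binary
# is reachable as `terraform-bin`; otherwise plain `terraform` is used.

# Print the name of the Terraform executable to call.
terraform_cmd() {
  if command -v terraform-bin >/dev/null 2>&1; then
    echo "terraform-bin"
  else
    echo "terraform"
  fi
}

# Append "name=value" to the file GitHub Actions reads step outputs from.
set_step_output() {
  printf '%s=%s\n' "$1" "$2" >> "$GITHUB_OUTPUT"
}

# Read one raw Terraform output and publish it under a step-output name.
publish_tf_output() {
  tf_output_name="$1"
  step_output_name="$2"
  value="$("$(terraform_cmd)" output -raw "$tf_output_name")" || return 1
  set_step_output "$step_output_name" "$value"
}
```

Inside the module directory, `publish_tf_output stack-yaml-path stack_yaml_path` would write the same `stack_yaml_path=...` line as the step above.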

An adjustment was also necessary to use an absolute path in the test script:

- name: Run Tests to Verify Resource Provisioning
  run: |
    STACK_YAML_PATH="${{ steps.set_output.outputs.stack_yaml_path }}"
    ABSOLUTE_PATH="${GITHUB_WORKSPACE}/src/mlstacks/terraform/aws-modular/${STACK_YAML_PATH}"
    ../../../../tests/integration/aws-modular/verify_stack.sh "$ABSOLUTE_PATH"
  working-directory: src/mlstacks/terraform/aws-modular
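
As a sketch of the path handling, the helper below builds the absolute path from a module-relative one and fails early if the file is missing. The function name `resolve_stack_yaml` and the existence check are additions for illustration, not part of the PR; `GITHUB_WORKSPACE` is assumed to be set, as it is on GitHub-hosted runners.

```shell
# Sketch: turn a module-relative path into an absolute one and verify it.
# Assumption: GITHUB_WORKSPACE points at the repository checkout root.

MODULE_DIR="src/mlstacks/terraform/aws-modular"

# Print the absolute path for a path reported relative to the module
# directory; return non-zero if no file exists at that location.
resolve_stack_yaml() {
  relative_path="$1"
  absolute_path="${GITHUB_WORKSPACE}/${MODULE_DIR}/${relative_path}"
  if [ ! -f "$absolute_path" ]; then
    echo "stack YAML not found: $absolute_path" >&2
    return 1
  fi
  echo "$absolute_path"
}
```

Failing before the verification script runs makes a wrong or empty step output easier to tell apart from a genuine provisioning failure.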

Finally, I simplified the integration test setup by replacing tflocal with a manual override. An _override.tf file in tests/integration overrides only the aws provider; it sets no resource configuration, which preserves the flexibility of our .tfvars files. The file is copied into the relevant mlstacks/terraform module for each test and removed afterwards. This simplifies the CI configuration in ci.yml to:

  localstack-aws-integration-test:
    uses: ./.github/workflows/aws-integration-test.yml
    secrets: inherit

Neither Python nor a specific OS is needed to run the test, and it can run concurrently with others.
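
The copy-and-remove override approach described above can be sketched as follows. The override content is an illustrative assumption (the PR's actual _override.tf is not reproduced here), as are the helper names `write_override` and `with_override` and the LocalStack endpoint on port 4566.

```shell
# Sketch: temporarily drop a provider override into a Terraform module.
# Assumptions: the provider settings below are illustrative, not the PR's
# actual file; LocalStack listens on localhost:4566.

# Write an override file that redirects the aws provider to LocalStack.
write_override() {
  cat > "$1" <<'EOF'
provider "aws" {
  access_key                  = "test"
  secret_key                  = "test"
  region                      = "us-east-1"
  s3_use_path_style           = true
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  endpoints {
    s3       = "http://localhost:4566"
    dynamodb = "http://localhost:4566"
  }
}
EOF
}

# Copy the override into a module, run a command there, then remove the
# copy whether or not the command succeeded.
with_override() {
  override_src="$1"
  module_dir="$2"
  shift 2
  cp "$override_src" "$module_dir/_override.tf" || return 1
  status=0
  (cd "$module_dir" && "$@") || status=$?
  rm -f "$module_dir/_override.tf"
  return "$status"
}
```

For example, `with_override tests/integration/_override.tf src/mlstacks/terraform/aws-remote-state terraform init` would run against LocalStack and leave the module directory unchanged afterwards.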

strickvl commented 4 months ago

@marwan37 seems like one of the scripts needs to be made executable? At least that seems to be the cause of the failure.

marwan37 commented 4 months ago

Yep! Fixed and repushed.
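
A quick way to catch this class of failure before pushing is to list scripts that lack the executable bit. This is a hypothetical helper, assuming the test scripts end in `.sh` as `verify_stack.sh` does.

```shell
# Sketch: list shell scripts under a directory that the owner cannot execute.
# Assumption: test scripts use the .sh extension.
find_non_executable() {
  find "$1" -type f -name '*.sh' ! -perm -100
}
```

Any file it reports can be fixed with `chmod +x <file>`, and `git update-index --chmod=+x <file>` records the mode change in the index so that it reaches the CI runner.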

strickvl commented 4 months ago

@marwan37 thanks as always for your contribution!