zenml-io / mlstacks

A series of Terraform based recipes to provision popular MLOps stacks on the cloud.
https://mlstacks.zenml.io/
Apache License 2.0
245 stars, 32 forks

`mlstacks` package and associated logic #67

Closed strickvl closed 10 months ago

strickvl commented 11 months ago

This PR builds on the work listed in the PRD to improve the mlops-stack UX.

TL;DR:

‼ Key Changes / Updates

⦾ Core concepts / mental models

The core flow is as follows:

The stack and component specs are defined through Pydantic models (in the src/mlstacks/models/ directory), which handle the high-level validation.

The Terraform files / modules themselves are (currently) located at src/mlstacks/terraform, though our current idea is to also move those out to the Terraform Registry as soon as is possible.

NOTE: there are a bunch of terraform modules / directories located at the root of the mlops-stacks repository. Those are there to handle backwards compatibility (i.e. people using old versions of ZenML, since we don't want to prevent them from using stack recipes the old way if they want).
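To make the spec-validation step of the core flow more concrete, here is a minimal sketch of what validating a stack and its components could look like. It is not the code from src/mlstacks/models/: it uses stdlib dataclasses as a stand-in for Pydantic so that it runs without dependencies. The class names `ComponentSpec` and `StackSpec`, the `ALLOWED_PROVIDERS` set, and the rule that a component must share its stack's provider are all assumptions made for illustration. Only the field names are taken from the example YAML files in this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: the set of valid providers; the real package may allow others.
ALLOWED_PROVIDERS = {"aws", "gcp"}


@dataclass
class ComponentSpec:
    """Hypothetical stand-in for a component spec model."""

    name: str
    provider: str
    component_type: str
    component_flavor: str
    spec_version: int = 1
    spec_type: str = "component"

    def __post_init__(self) -> None:
        # Reject specs whose declared type or provider is not recognised.
        if self.spec_type != "component":
            raise ValueError(f"expected spec_type 'component', got {self.spec_type!r}")
        if self.provider not in ALLOWED_PROVIDERS:
            raise ValueError(f"unknown provider {self.provider!r}")


@dataclass
class StackSpec:
    """Hypothetical stand-in for a stack spec model."""

    name: str
    provider: str
    default_region: str
    default_tags: Dict[str, str] = field(default_factory=dict)
    components: List[ComponentSpec] = field(default_factory=list)
    spec_version: int = 1
    spec_type: str = "stack"

    def __post_init__(self) -> None:
        if self.spec_type != "stack":
            raise ValueError(f"expected spec_type 'stack', got {self.spec_type!r}")
        if self.provider not in ALLOWED_PROVIDERS:
            raise ValueError(f"unknown provider {self.provider!r}")
        # Assumed rule: every component targets the same provider as its stack.
        for component in self.components:
            if component.provider != self.provider:
                raise ValueError(
                    f"component {component.name!r} uses provider "
                    f"{component.provider!r}, stack uses {self.provider!r}"
                )


# Values mirror the example YAML files used later in this PR description.
bucket = ComponentSpec(
    name="test_gcs_bucket",
    provider="gcp",
    component_type="artifact_store",
    component_flavor="gcp",
)
stack = StackSpec(
    name="test_stack",
    provider="gcp",
    default_region="europe-north1",
    components=[bucket],
)
```

With Pydantic, the same checks would normally live in field types and validators rather than in `__post_init__`.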

Video Walkthrough

Loom Video Link

👷 How to try me out

What follows are some simple CLI commands you can use to try things out to get a sense of how it all works. Within mlstacks we only allow for the deployment of stacks (i.e. deployment of a single component requires you to define it in relation to a stack).

First things first: you'll need to install the package into your local environment:

git clone -b feature/PLATFORM-176-stacks-goes-py-pi git@github.com:zenml-io/mlops-stacks.git
cd mlops-stacks
pip install -e ".[dev]"

🥞 Stack Deployment Example Use

First, create two files, starting with simple_stack.yaml:

spec_version: 1
spec_type: stack
name: "test_stack"
provider: gcp
default_region: "europe-north1" # for GCP
default_tags:
  z-env: "dev"
  z-owner: "YOUR_NAME"
  z-team: "platform"
  z-project: "stack-recipes"
components:
- simple_component_gcs.yaml

Then simple_component_gcs.yaml:

spec_version: 1
spec_type: component
component_type: "artifact_store"
component_flavor: "gcp"
name: "test_gcs_bucket"
provider: gcp
metadata:
  config:
    bucket_name: "zenml-test-stack-recipes-bucket"
    project_id: "zenml-core"
  tags:
    z-env: "dev"
    z-owner: "YOUR_NAME"
    z-team: "platform"
    z-project: "stack-recipes"
    z-description: "test that stack recipes v2 works"
  region: "europe-north1"

To deploy an artifact store to GCP:

mlstacks deploy -f simple_stack.yaml

You can then check in the GCP console that the bucket exists.

Then you can delete the artifact-store / bucket with the following command:

mlstacks destroy -f simple_stack.yaml

Deployment Outputs Example Use

(You'll need to have something deployed in order for this to work.)

To view the outputs for a particular deployment:

mlstacks output -f simple_stack.yaml

This will print out a list of the values. You can also get a specific k:v pair in the following way:

mlstacks output -f simple_stack.yaml -k stack-yaml-path

💰 Infracost Cost Estimation Example Use

To get a cost estimation for a particular stack:

mlstacks breakdown -f simple_stack.yaml
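If you want to script the commands above (for example in a smoke test), a small wrapper along these lines could work. This is only a sketch: the helper names `build_command` and `run` are hypothetical and not part of mlstacks. It uses only the subcommands and flags shown in this PR description (`deploy`, `destroy`, `output`, `breakdown`, `-f`, and `-k` for `output`), and it assumes the `mlstacks` executable is on your PATH.

```python
import subprocess
from typing import List, Optional

# Subcommands that appear in this PR description.
SUBCOMMANDS = ("deploy", "destroy", "output", "breakdown")


def build_command(
    subcommand: str, stack_file: str, key: Optional[str] = None
) -> List[str]:
    """Build the argv list for one mlstacks invocation (hypothetical helper)."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unsupported subcommand {subcommand!r}")
    cmd = ["mlstacks", subcommand, "-f", stack_file]
    if key is not None:
        # The -k flag is only shown for `output` in this PR description.
        if subcommand != "output":
            raise ValueError("-k is only used with the 'output' subcommand here")
        cmd += ["-k", key]
    return cmd


def run(subcommand: str, stack_file: str, key: Optional[str] = None) -> int:
    """Run the command and return its exit code; requires mlstacks on PATH."""
    return subprocess.run(build_command(subcommand, stack_file, key), check=False).returncode
```

For instance, `build_command("output", "simple_stack.yaml", key="stack-yaml-path")` yields the same invocation as the single-key lookup shown earlier, and `run("breakdown", "simple_stack.yaml")` would execute the cost estimation.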

🚧 Remaining / Missing Tasks

(For the full breakdown of remaining tasks, see our board on Notion.)

gitguardian[bot] commented 10 months ago

⚠️ GitGuardian has uncovered 5 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
| GitGuardian id | Secret | Commit | Filename | |
| -------------- | ------------------------- | ---------------- | --------------- | -------------------- |
| [-](https://dashboard.gitguardian.com/incidents/secrets) | Generic Terraform Variable Secret | 8043ce782258ff948dc82d5ab9cd80459cbff19e | aws-kubeflow-kserve/variables.tf | [View secret](https://github.com/zenml-io/mlstacks/commit/8043ce782258ff948dc82d5ab9cd80459cbff19e#diff-5fd2aaf452190ab83f3aad4d0c1ffd52L9) |
| [-](https://dashboard.gitguardian.com/incidents/secrets) | Generic Terraform Variable Secret | 8043ce782258ff948dc82d5ab9cd80459cbff19e | aws-minimal/variables.tf | [View secret](https://github.com/zenml-io/mlstacks/commit/8043ce782258ff948dc82d5ab9cd80459cbff19e#diff-d5bd4af5b1cf635f7f6a6a21c538084dL9) |
| [-](https://dashboard.gitguardian.com/incidents/secrets) | Generic Terraform Variable Secret | 8043ce782258ff948dc82d5ab9cd80459cbff19e | aws-modular/variables.tf | [View secret](https://github.com/zenml-io/mlstacks/commit/8043ce782258ff948dc82d5ab9cd80459cbff19e#diff-75abfaaefa4278163c4bd60c1bd851e0L72) |
| [-](https://dashboard.gitguardian.com/incidents/secrets) | Generic Terraform Variable Secret | ca7cc3dd890e5fdbe51c0d791a3f78ed7a254e2c | src/mlstacks/terraform/aws-kubeflow-kserve/variables.tf | [View secret](https://github.com/zenml-io/mlstacks/commit/ca7cc3dd890e5fdbe51c0d791a3f78ed7a254e2c#diff-6d24d8957f288b6d20d6324177e89193R9) |
| [-](https://dashboard.gitguardian.com/incidents/secrets) | Generic Terraform Variable Secret | ca7cc3dd890e5fdbe51c0d791a3f78ed7a254e2c | src/mlstacks/terraform/aws-minimal/variables.tf | [View secret](https://github.com/zenml-io/mlstacks/commit/ca7cc3dd890e5fdbe51c0d791a3f78ed7a254e2c#diff-8b2c73ecd388679cf57f416de76dccf4R9) |
🛠 Guidelines to remediate hardcoded secrets
1. Understand the implications of revoking this secret by investigating where it is used in your code.
2. Replace and store your secrets safely. [Learn here](https://blog.gitguardian.com/secrets-api-management?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment) the best practices.
3. Revoke and [rotate these secrets](https://docs.gitguardian.com/secrets-detection/detectors/generics/generic_terraform_variable#revoke-the-secret?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment).
4. If possible, [rewrite git history](https://blog.gitguardian.com/rewriting-git-history-cheatsheet?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment). Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider
- following these [best practices](https://blog.gitguardian.com/secrets-api-management/?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment) for managing and storing secrets including API keys and other credentials
- install [secret detection on pre-commit](https://docs.gitguardian.com/ggshield-docs/integrations/git-hooks/pre-commit?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment) to catch secret before it leaves your machine and ease remediation.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!