zenml-io / mlstacks

A series of Terraform-based recipes to provision popular MLOps stacks on the cloud.
https://mlstacks.zenml.io/
Apache License 2.0
250 stars 32 forks

Implement Label Studio Annotator Stack Component #134

Open strickvl opened 9 months ago

strickvl commented 9 months ago

MLStacks currently supports the concept of an annotator stack component but lacks an actual implementation of such a component. This task aims to integrate Label Studio as an annotator stack component within MLStacks, enhancing the framework's capabilities in data annotation.

Task Description

Make Label Studio deployable through the MLStacks framework. This involves several key steps, including updating enums and constants to recognize Label Studio as a component flavor, and creating a Terraform module for deploying Label Studio on cloud providers (AWS and/or GCP).

Expected Outcome

Steps to Implement

  1. Update src/mlstacks/enums.py to include label_studio as a component flavor under the annotator category.
  2. Modify src/mlstacks/constants.py to recognize annotator as a stack component and add label_studio to the list of permitted flavors.
  3. Design and implement a Terraform module for deploying Label Studio, considering both Kubernetes/Helm and VM instance deployment options. Reference Label Studio's installation guides for each option. (Remember to ensure that the Label Studio instance is backed by persistent storage, especially when using container-based deployments.)
  4. Ensure the Terraform module exports critical deployment information (e.g., Label Studio URL, access credentials) as outputs.
  5. Conduct thorough testing of the Label Studio deployment via MLStacks on the chosen cloud provider(s), verifying functionality and access.
  6. Document the implementation process and provide usage instructions, including how to configure the Label Studio annotator stack component in MLStacks.
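
As a rough illustration of steps 1 and 2, the enum and constants changes might look something like the sketch below. The names `ComponentTypeEnum`, `ComponentFlavorEnum`, `ALLOWED_FLAVORS`, and `is_flavor_allowed` are assumptions made for this example and may not match what is actually in src/mlstacks/enums.py and src/mlstacks/constants.py; only the `annotator` and `label_studio` values come from the steps above.

```python
from enum import Enum


class ComponentTypeEnum(str, Enum):
    """Hypothetical subset of component types (names are assumptions)."""

    ARTIFACT_STORE = "artifact_store"
    ANNOTATOR = "annotator"  # step 2: annotator recognized as a component


class ComponentFlavorEnum(str, Enum):
    """Hypothetical subset of component flavors (names are assumptions)."""

    S3 = "s3"
    LABEL_STUDIO = "label_studio"  # step 1: new flavor for annotators


# Hypothetical mapping from component type to its permitted flavors.
ALLOWED_FLAVORS = {
    ComponentTypeEnum.ARTIFACT_STORE.value: [ComponentFlavorEnum.S3.value],
    ComponentTypeEnum.ANNOTATOR.value: [ComponentFlavorEnum.LABEL_STUDIO.value],
}


def is_flavor_allowed(component_type: str, flavor: str) -> bool:
    """Return True if the flavor is permitted for the given component type."""
    return flavor in ALLOWED_FLAVORS.get(component_type, [])
```

With a mapping like this, a stack specification that declares an `annotator` component with the `label_studio` flavor would pass validation, while any other flavor for that component type would be rejected.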

Additional Context

Integrating Label Studio as an annotator stack component will significantly broaden MLStacks' utility in data annotation workflows, offering users a flexible and powerful tool for labeling data across various use cases.

strickvl commented 7 months ago

Note that this is similar to #150, but this issue covers a deployment to the cloud provider of choice.

samsmithspace commented 3 months ago

Hi @strickvl, I'd like to work on this task.

strickvl commented 3 months ago

@samsmithspace You're welcome to work on this! Let us know if you have any questions.