KTH / devops-course

Repository of the DevOps course at KTH Royal Institute of Technology DD2482

AIOps / MLOps / Infrastructure and software engineering for ML #1016

Open monperrus opened 3 years ago

monperrus commented 3 years ago

https://github.com/machine-learning-apps/actions-ml-cicd A Collection of GitHub Actions That Facilitate MLOps

monperrus commented 3 years ago

Machine learning operations with GitHub Actions and Kubernetes - GitHub Universe 2019 https://www.youtube.com/watch?v=Ll50l3fsoYs

monperrus commented 2 years ago

TinyMLOps: Operational Challenges for Widespread Edge AI Adoption https://arxiv.org/abs/2203.10923

mrbgco commented 2 years ago

Azure MLOps.

AWS MLOps.

monperrus commented 2 years ago

Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream processing https://beam.apache.org/

monperrus commented 2 years ago

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. https://www.kubeflow.org/

monperrus commented 2 years ago

TensorBoard: a suite of visualization tools to understand, debug, and optimize TensorFlow programs for ML experimentation https://www.tensorflow.org/tensorboard

monperrus commented 2 years ago

"In the coming decade, all software development will be assisted by AI. Either the code is going to be generated with the help of AI, or it is going to be reviewed by AI, tested by AI, or even deployed by AI." https://www.tabnine.com/blog/from-ci-to-ai-the-ai-layer-in-your-organization/ https://youtu.be/6YQX0LGaNy8

monperrus commented 2 years ago

Training and Serving Machine Learning Models at Scale. (arXiv:2211.05516v1 [cs.SE])

bbaudry commented 1 year ago

Quality Assurance in MLOps Setting: An Industrial Perspective. http://arxiv.org/abs/2211.12706

bbaudry commented 1 year ago

Edge Impulse: An MLOps Platform for Tiny Machine Learning http://arxiv.org/abs/2212.03332

monperrus commented 1 year ago

Edge Impulse: An MLOps Platform for Tiny Machine Learning. http://arxiv.org/pdf/2212.03332

bbaudry commented 1 year ago

A Data Source Dependency Analysis Framework for Large Scale Data Science Projects. http://arxiv.org/abs/2212.07951

monperrus commented 1 year ago

A fault injection platform for learning AIOps models.

monperrus commented 1 year ago

Studying the Characteristics of AIOps Projects on GitHub

monperrus commented 1 year ago

Building Machine Learning Models Like Open Source Software (CACM)

bbaudry commented 1 year ago

The Pipeline for the Continuous Development of Artificial Intelligence Models -- Current State of Research and Practice.

http://arxiv.org/abs/2301.09001

monperrus commented 1 year ago

Scalable End-to-End ML Platforms: from AutoML to Self-serve.

monperrus commented 1 year ago

Towards a change taxonomy for machine learning pipelines (EMSE 2023)

bbaudry commented 1 year ago

Scaling MLOps education https://github.com/readme/guides/mlops-education

bbaudry commented 1 year ago

Open Source Feature Store for Production ML https://feast.dev/

monperrus commented 1 year ago

seldon-core: An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models https://github.com/SeldonIO/seldon-core

monperrus commented 1 year ago

MLflow and Azure Machine Learning https://learn.microsoft.com/en-us/azure/machine-learning/concept-mlflow

monperrus commented 1 year ago

Semgrep rules for ML

monperrus commented 1 year ago

MLOps in Google Cloud with Vertex AI: orchestrate machine learning (ML) workflows using Vertex AI Pipelines.

https://cloud.google.com/vertex-ai/docs/pipelines

monperrus commented 1 year ago

Learning Representations on Logs for AIOps. (arXiv:2308.11526v1 [cs.CL])

monperrus commented 1 year ago

LLMOps: Research and technology for building AI products w/ foundation models. General technology for enabling AI capabilities w/ (M)LLMs: MiniLLM (LLM Distillation), LLM Accelerator, Structured Prompting, Extensible Prompts, and Promptist. Effective and efficient approaches to deploying large AI models in practice: MiniLM(-2), xTune, EdgeFormer, and Aggressive Decoding

https://thegenerality.com/agi/about.html

monperrus commented 1 year ago

KServe: Standardized Serverless ML Inference Platform on Kubernetes https://github.com/kserve/kserve

monperrus commented 1 year ago

Neptune: Track, compare, and share your models in one place https://neptune.ai/

monperrus commented 1 year ago

DVC: ML Experiments Management with Git

monperrus commented 1 year ago

Amazon SageMaker

Build, train, and deploy machine learning (ML) models with Amazon infrastructure, tools, and workflows.

https://aws.amazon.com/sagemaker/

monperrus commented 7 months ago

Runhouse: Iterate and deploy AI workloads on your own infra. Unobtrusive, debuggable, PyTorch-like APIs https://github.com/run-house/runhouse/

bbaudry commented 6 months ago

Master's thesis, Purdue University, 2024: A Quantitative Comparison of Pre-Trained Model Registries to Traditional Software Package Registries

monperrus commented 5 months ago

Ten Commandments to deploy fine-tuned models in prod https://docs.google.com/presentation/d/1IIRrTED0w716OsU_-PL5bONL0Pq_7E8alewvcJO1BCE/edit#slide=id.g2c28ff05645_0_0

monperrus commented 1 month ago

Langfuse - LLM engineering platform for model tracing, prompt management, and application evaluation. Langfuse helps teams collaboratively debug, analyze, and iterate on their LLM applications such as chatbots or AI agents.