[Discussion] ml-commons tenet and architecture

This issue discusses the ml-commons tenets and goals, its architecture, and the design rationale behind them.

Introduction

This repository aims to provide a collection of essential frameworks, tools and APIs to streamline the development, deployment, and management of ML/AI applications. Whether you're a data scientist, developer, or researcher, this repository offers a unified platform to build, deploy, and connect ML/AI models effortlessly.

Goals

Our primary goals for this repository are:

Standardization: Provide a standardized set of tools and APIs that can be reused across various ML/AI projects, promoting consistency and reducing development effort.
Simplicity: Abstract complex processes, such as model training, deployment, and service connection, into simple and intuitive APIs, enabling rapid development.
Flexibility: Support a wide range of ML/AI use cases and frameworks, allowing users to adapt the tools/APIs to their specific requirements.

Architecture

[Figure: ml-commons architecture diagram]

The repository is structured around the following key components:

1. General REST APIs

A comprehensive set of APIs that encompass common functionalities:

train: API to train machine learning models.
deploy: API to deploy trained models.
predict: API to make predictions using deployed models.
execute: API for executing specific tasks.

2. General Frameworks for ML/AI

General model/task management framework: An overarching framework that facilitates efficient management of ML models and tasks, including tracking, versioning, and organization.
General connector framework: A flexible framework for connecting to any remote ML services, enabling seamless integration with cloud-based ML platforms and APIs.
General agent framework (development in progress: https://github.com/opensearch-project/ml-commons/issues/1161): A general agent framework for complex problems that cannot be solved in a single step, or whose solution steps cannot be predefined. It will leverage tools and orchestrate steps to solve such problems, and it can use an LLM for reasoning and for coordinating actions.
We may build other general frameworks in the future.
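To make the connector idea more concrete, here is a minimal sketch of what a single abstraction over "any remote ML service" could look like. It is only an illustration: `Connector`, `FunctionConnector`, and the `predict` signature are hypothetical names chosen for this example, not the actual ml-commons interfaces, and the "remote call" is a stand-in lambda rather than a real HTTP request.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: these types are NOT the real ml-commons connector classes.
final class ConnectorSketch {

    /** Assumed abstraction over one remote ML service. */
    interface Connector {
        String serviceName();

        /** Sends parameters to the remote service and returns its response. */
        Map<String, Object> predict(Map<String, Object> parameters);
    }

    /** Wraps any function as a connector, so the transport (HTTP client, SDK, ...) stays pluggable. */
    static final class FunctionConnector implements Connector {
        private final String serviceName;
        private final Function<Map<String, Object>, Map<String, Object>> call;

        FunctionConnector(String serviceName,
                          Function<Map<String, Object>, Map<String, Object>> call) {
            this.serviceName = serviceName;
            this.call = call;
        }

        @Override
        public String serviceName() {
            return serviceName;
        }

        @Override
        public Map<String, Object> predict(Map<String, Object> parameters) {
            return call.apply(parameters);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a remote call: echoes the prompt instead of contacting a service.
        Connector fake = new FunctionConnector(
            "fake-llm",
            params -> Map.<String, Object>of("response", "echo: " + params.get("prompt")));

        System.out.println(fake.predict(Map.<String, Object>of("prompt", "hi")));
    }
}
```

The point of the sketch is that callers depend only on the interface, so swapping one remote platform for another would not change the code that calls `predict`.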

3. Client for Vertical ML Features

A client library that empowers developers to build vertical ML features and applications. For instance, plugins like neural-search utilize the ml-commons client to integrate semantic search capabilities.

Design questions

What should be in ml-commons?

General frameworks that are not built for a specific vertical area. For example, the Agent framework provides general agent/tool/memory interfaces and management APIs and is not tied to a dedicated area such as GenAI. Because it is general, any user can implement their own Agent, Tool, and Memory to build their own vertical features.
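As a rough illustration of the "general interface, user-supplied implementation" split described above, the sketch below defines minimal agent, tool, and memory types and plugs in a toy tool. All names here (`AgentSketch`, `Tool`, `Memory`, `ListMemory`, `Agent.execute`) are hypothetical and chosen for this example; they are not the interfaces proposed in the agent framework issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are the actual ml-commons interfaces.
final class AgentSketch {

    /** A tool performs one well-defined action on a text input. */
    interface Tool {
        String name();
        String run(String input);
    }

    /** Memory stores the messages exchanged during a session. */
    interface Memory {
        void save(String message);
        List<String> history();
    }

    /** Simple in-process memory, for illustration only. */
    static final class ListMemory implements Memory {
        private final List<String> messages = new ArrayList<>();

        @Override
        public void save(String message) {
            messages.add(message);
        }

        @Override
        public List<String> history() {
            return List.copyOf(messages);
        }
    }

    /** An agent looks up a registered tool by name, runs it, and records the exchange. */
    static final class Agent {
        private final Map<String, Tool> tools = new LinkedHashMap<>();
        private final Memory memory;

        Agent(Memory memory) {
            this.memory = memory;
        }

        void register(Tool tool) {
            tools.put(tool.name(), tool);
        }

        String execute(String toolName, String input) {
            Tool tool = tools.get(toolName);
            if (tool == null) {
                throw new IllegalArgumentException("Unknown tool: " + toolName);
            }
            String output = tool.run(input);
            memory.save(toolName + "(" + input + ") -> " + output);
            return output;
        }
    }

    public static void main(String[] args) {
        Memory memory = new ListMemory();
        Agent agent = new Agent(memory);

        // A vertical feature would plug in its own tool; this one just upper-cases text.
        agent.register(new Tool() {
            @Override
            public String name() {
                return "upper";
            }

            @Override
            public String run(String input) {
                return input.toUpperCase();
            }
        });

        System.out.println(agent.execute("upper", "hello"));
        System.out.println(memory.history());
    }
}
```

In this picture, the interfaces and the registration logic would live in the general layer, while concrete tools and memory backends would be supplied by whoever builds a vertical feature on top.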

What should not be in ml-commons?

Vertical ML applications/features. For example, the neural search feature, which focuses only on the neural search area, or the PPL ML command, which is a special PPL command that supports running ML models.

Thanks for putting this together! I think this is a good start, and I agree that ml-commons should be a standard set of tools that support various, more specific AI/ML use cases. Model training, deployment, and service connection are common across many ML use cases, and make sense at a "commons" layer. I think we need to more crisply describe what a general framework vs. a vertical framework is, because I think you can make a case that some frameworks that support a more narrow use case are general (and belong in "commons"), or vice versa.

Although we are starting with conversational search, memory, and agents in ml-commons (though, as pointed out in other conversations, this is not a one-way door, and all of this may end up in an AI-commons someday), I'm not sure the way we are addressing it here follows. For instance, I'm not sure what agent use cases are not tied to generative AI or LLMs. Instead, I might approach it from the angle that agents are a way to interact with AI models, similar to how inference requires a way to interact with a model. If we're supplying the lowest-level components to build agents, that could fit the story more clearly.
