opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related features within OpenSearch.
Apache License 2.0

[RFC] OpenSearch ML model artifacts releasing SOP #2174

Open model-collapse opened 7 months ago

model-collapse commented 7 months ago

Background:

As the impact of OpenSearch ml-commons grows, we observe that people are proposing to contribute model artifacts to the community. Meanwhile, some features, such as neural search and neural sparse, should ship with a default model artifact. With the release of the agent framework and more features around search relevance, log analytics, and beyond, it is now necessary to discuss an SOP (Standard Operating Procedure) for releasing model artifacts.

Parameter storage

Today, most deep learning models have a huge number of parameters, so it is impractical to store model parameter files inside the GitHub repos. We encourage contributors to upload their models to Hugging Face or another cloud-hosted object storage service and include the file URL in the repo.
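As a minimal sketch of that pattern (the pointer file name and URLs below are hypothetical, not published artifacts): the repo would commit only a small JSON file mapping model names to hosted URLs, and a few lines of Python fetch the actual weights when needed.

```python
# Sketch: the repo commits only parameters.json ({"model name": "URL"}), while the
# large parameter files stay on Hugging Face / cloud object storage.
# The file name and URLs here are placeholders for illustration.
import json
import urllib.request

with open("parameters.json") as f:
    pointers = json.load(f)  # e.g. {"my-sparse-encoder": "https://huggingface.co/.../model.zip"}

for name, url in pointers.items():
    destination = f"{name}.zip"
    print(f"Downloading {name} from {url} ...")
    urllib.request.urlretrieve(url, destination)  # fetch the hosted weights locally
```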

Discussion 1: Where shall we release the models?

Currently, the OpenSearch team releases its model artifacts on a single page of the documentation site, and those artifacts cover only neural search and neural sparse. We need a better place to publish these model artifacts as well as other materials about the models.

Option 1:

Add one folder in the ml-commons repo, one in opensearch-skills, and one in the neural search repo. Each model will have a dedicated sub-directory. For example:

```
ml-commons.git
 ┣ client
 ┣ ml-algorithms
 ┣ ....
 ┗ models
    ┣ tool-selection
    ┃  ┣ inference.py
    ┃  ┣ tool-selection.ipynb
    ┃  ┗ ....
    ┗ index-selection
       ┣ inference.py
       ┗ index-selection.ipynb
```

Pros:

Cons:

Option 2:

Create a new repository to hold those model artifacts. All OpenSearch-related ML models would be gathered in this repo, with each model hosted in a sub-directory under the repo's root. Each sub-directory should have a README.md describing the model's usage, especially which OpenSearch module it works with.

Pros

Cons

The SOP:

1. The contributor opens an issue in the related repo, describing what the model artifact will look like and which feature it will support.
2. If there is no major concern from the community, PRs can be pushed to the main branch. The PR description should provide:
   a. A list of files, with a description of each file. The reviewers should carefully review the model execution scripts, especially for security.
   b. Benchmark results showing a comparison to other baseline models. The maintainer should invite a science reviewer to check the benchmark results as an additional review of the PR.
3. After the PR is merged, the OpenSearch team will start the process of updating the model release page of the documentation website, if needed.

Suggested & Minimum Requirements:

For each model, we set a minimum bar on the release artifacts and also list the artifacts we suggest.

Online Deployed Models

For online deployed models, whether deployed inside an OpenSearch cluster or behind a 3rd-party inference endpoint:

| Artifact | Mandatory? | Notes |
| --- | --- | --- |
| README.md |  | Should include: description, the OpenSearch module it supports, source repo URL (if there is one), benchmark results, and example API calls for ml-commons deployment or remote connector creation |
| parameters.json |  | A file containing the URL to the model parameter file. Format: `{"model name": "URL"}` |
| demo.ipynb |  | A code-with-tutorial Jupyter notebook on how to play with the model |
| deploy-local.py |  | A script to deploy the model on a single node and serve it as an endpoint |
| `deploy-<platform>.py` |  | Scripts to deploy the model on 3rd-party platforms |
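As an illustration of what deploy-local.py could look like, it can be a thin wrapper over the existing ml-commons register/deploy REST APIs. The sketch below is only that, a sketch: it assumes an unsecured single-node cluster on localhost:9200, that registering models via URL is allowed by the cluster settings, and placeholder model name and URL; custom models typically also need a model_config and content hash, omitted here.

```python
# deploy-local.py sketch: register a hosted model artifact and deploy it on a
# single node through the ml-commons REST API. All names/URLs are placeholders.
import time
import requests

BASE = "http://localhost:9200/_plugins/_ml"
MODEL_URL = "https://example.com/path/to/model.zip"  # hypothetical hosted parameter file

register_body = {
    "name": "my-model",            # placeholder model name
    "version": "1.0.0",
    "model_format": "TORCH_SCRIPT",
    "url": MODEL_URL,              # custom models may also require model_config
                                   # and model_content_hash_value fields
}

# Registration is asynchronous: poll the returned task until the model id is ready.
task_id = requests.post(f"{BASE}/models/_register", json=register_body).json()["task_id"]
while True:
    task = requests.get(f"{BASE}/tasks/{task_id}").json()
    if task.get("state") == "COMPLETED":
        model_id = task["model_id"]
        break
    time.sleep(2)

# Deploy the registered model so it can serve predictions as an endpoint on this node.
requests.post(f"{BASE}/models/{model_id}/_deploy")
print(f"Deployed model {model_id}")
```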

Offline Deployed Models

For models that are only executed offline, no endpoint deployment is required:

| Artifact | Mandatory? | Notes |
| --- | --- | --- |
| README.md |  | Should include: description, the OpenSearch module it supports, source repo URL (if there is one), and benchmark results |
| parameters.json |  | A file containing the URL to the model parameter file. Format: `{"model name": "URL"}` |
| demo.ipynb |  | A code-with-tutorial Jupyter notebook on how to play with the model |
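For offline models, demo.ipynb can be as simple as loading the downloaded weights and running a forward pass locally. The sketch below assumes a Hugging Face-hosted transformer checkpoint whose repo id is a placeholder, not a published OpenSearch artifact.

```python
# Sketch of a demo.ipynb cell for an offline model: no endpoint is deployed, the
# notebook just loads the hosted weights and runs inference locally.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "my-org/my-offline-model"  # hypothetical repo id taken from parameters.json

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

with torch.no_grad():
    inputs = tokenizer("What is OpenSearch?", return_tensors="pt")
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # raw token embeddings before any pooling
```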

Discussion 2: When publishing a feature RFC, should the issue of its model release be opened at the same time?

Option 1: Yes, and the model issue should be linked with the feature issue.

Pros: If the audience of the feature RFC is curious about the model, they can track the model release as well. Cons: More workload to write duplicate context for both issues; meanwhile, describing one thing in two places can lead to information loss.

Option 2: No, just include the model information in the feature RFC.

Pros: Less workload; readers only need to read one issue and won't miss anything. Cons: No place to track all the model releases together. When reading the issue, engineers may get lost if there are too many scientific descriptions of the model.

dhrubo-os commented 7 months ago

Currently we release models in the release team's S3 bucket using opensearch-py-ml's model release workflow.
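For reference, a hedged sketch of that flow through opensearch-py-ml's MLCommonClient (the argument names are from memory and may differ across versions; the file paths are placeholders):

```python
# Sketch of the existing opensearch-py-ml based registration flow.
# Treat the exact signature as an assumption; the paths below are placeholders.
from opensearchpy import OpenSearch
from opensearch_py_ml.ml_commons import MLCommonClient

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
ml_client = MLCommonClient(client)

# Registers a local model zip plus its config with ml-commons; the released
# artifacts themselves are published to the release team's S3 bucket by CI.
model_id = ml_client.register_model(
    model_path="all-MiniLM-L6-v2.zip",   # placeholder local artifact
    model_config_path="config.json",     # placeholder model config
    isVerbose=True,
)
print(model_id)
```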

Could you please add another point describing why we can't extend our existing workflow to support all our needs?

> Currently, the OpenSearch team releases its model artifacts on a single page of the documentation site, and those artifacts cover only neural search and neural sparse. We need a better place to publish these model artifacts as well as other materials about the models.

This is not completely correct. The OpenSearch release team is pushing us to create/upgrade our automated model publishing workflow so that we don't need to depend on the release team. So in theory we can easily enhance this workflow in an automated fashion so that we can release models based on customer needs.

xinyual commented 7 months ago

> Could you please add another point describing why we can't extend our existing workflow to support all our needs?

From my understanding, the current workflow can only publish model weights. But sometimes we need to publish some code for training and inference, as a Python script or Jupyter notebook, and the current workflow cannot do that. However, I notice we already have some Jupyter examples here: https://github.com/opensearch-project/opensearch-py-ml/tree/main/docs/source/examples. Can we directly use this repo for option 1?