opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related features within OpenSearch.
Apache License 2.0
94 stars 129 forks source link

[RFC][FEATURE] Model alias #1217

Open b4sjoo opened 1 year ago

b4sjoo commented 1 year ago

Introduction:

Right now our model versions are identified by random IDs. Meanwhile, most of our ML-related behaviors at this time require user to refer the model ID directly, which are hard to remember and use due to its abstractness. This can lead to confusion and errors when the users want to refer to a specific model or compare different models. To address this issue, we propose to design and implement a model alias feature for our ml-commons. This feature will allow users to assign a custom alias to each model they create or use, and then use this alias as a reference instead of the model ID. For example, a user can name their model "text_embedding" or "sentiment_analysis" and then use these alias in their queries or commands.

Benefits of this feature:

Where model id is being used now:

  1. The get/delete/deploy/undeploy/predict model API required model ID right now.
  2. In Neural Search Plugin, in order to ingest vectorized documents, you need to create a Neural Search pipeline. A pipeline consists of a series of processors that manipulate documents during ingestion, allowing the documents to be vectorized. The following example API operation creates a Neural Search pipeline:
PUT _ingest/pipeline/nlp-pipeline
{
  "description": "An example neural search pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "bxoDJ7IHGM14UqatWc_2j",
        "field_map": {
           "passage_text": "passage_embedding"
        }
      }
    }
  ]
}

Neural search query also needs model id, for example

            "query": {
              "neural": {
                "passage_vector": {
                  "query_text": "Hello world",
                  "model_id": "xzy76xswsd",
                  "k": 100
                }
              }
            },

Pain point

Suppose the user have a new model version and they want to use it to replace the old version. They need to update the pipeline by changing the model_id. User also need to change model id in neural search query.

Another example: User build some application with OpenAI remote model, and later they prefer to move to another model like Claude model, then user have to change the model id in their application code and redeploy. That will have service unavailable window. With model alias, user can simply move alias from one model to another, then all request will be routed to the new model, no need to redeploy, no service unavailable window.

Solution HLD:

Option 1: The model alias is globally unique

Plan:

  1. Model alias is an optional field when the user register a model (null is a legal value)
  2. User can give or modify(including remove) model alias anytime through model update API
  3. Every time when user create or modify the model alias, we will initiate a global uniqueness checking towards the model alias given by the user. If it does not meet the global uniqueness requirement, an exception will be thrown.
  4. User can re-link model alias to a different model through model update API

pros:

  1. This solution does not require too much changes in the backend
  2. This solution requires less checks

cons:

  1. Generally user intends to create intuitive and meaningful names, which are very likely to be similar and can easily cause collision. This indicates user frustration when alias number grows.
  2. The user may need to keep update model alias once they update their model.

Solution LLD:

Plan:

  1. This solution mainly involved in three methods: create-alias, update-alias, and delete-alias
  2. Global uniqueness check: we check both on model ID and model aliases.
  3. Create-alias: API endpoints use model ID to locate the model. We first initiate a global uniqueness checking. If it does not meet the global uniqueness requirement, an exception will be thrown. If global uniqueness check passed, we return a success message.
  4. Update-alias: API endpoints use model ID to locate the model. We first check if the model group has the alias. If yes we relink the alias. If not we throw an exception.
  5. Delete-alias: We check if user can access the model, if yes then the corresponding alias can be removed and be released to public

Security

  1. DDOS
  2. Abuse of update-alias method (Use update-alias without communicating with collaborators)

Backward Compatibility

Testability

HenryL27 commented 1 year ago

@b4sjoo This is a great idea, love it!

In the case where the alias namespace gets crowded, it would be very helpful to have some kind of search-aliases API to see what names are taken, and perhaps where they point so I don't end up duplicating these unnecessarily.

Another option to consider could be to have alias names be unique on some kind of access-control basis: it could be very frustrating to try to create aliases but then be unable because someone's already done so in an access level above you. I'm not sure that's a real thing, but something to think about.

Can you explain the LLD update-alias in more detail? I'm having a little trouble understanding. Can a model have multiple aliases associated with it?

Thanks

austintlee commented 1 year ago

Will connectors get aliases, too?

saratvemulapalli commented 1 year ago

+1 this will be a great user experience and will help with model versioning and MLClient interactions.

Couple of suggestions and questions:

Global uniqueness check: we check both on model ID and model aliases.

Option 1: The model alias is globally unique

Where model id is being used now

[1] https://github.com/opensearch-project/ml-commons/blob/2.x/docs/model_access_control.md

ylwu-amzn commented 1 year ago

Will connectors get aliases, too?

I think we can add alias for connector too. @austintlee , Will it be useful? Any concern?

dswitzer commented 10 months ago

What if the model_id could just be manually specified instead of randomly generated?

Instead of creating alias functionality around the ID, you could just specify the ID of your choosing when you create the various objects. If the ID attribute is left off, then it's auto-generated as it is today, but when specified the ID is used. Users could then decide if they want to use a "text-friendly" ID (e.g. "neural_search_text_embedding_model"), a UID of their own or the one OpenSearch creates.

Would that simplify things?

There's probably some technical reasons why this might not be feasible (and why it wasn't done from the start), but I'm just wondering if this wouldn't simplify things if there are no technical reasons why the ID cannot just be specified during creation.

dswitzer commented 10 months ago

+1 for this feature.

I'm trying to work on scripts so we can provision our servers. We deploy to varies different environments, so having IDs that are consistent across deployments makes the chain much easier to manage, especially since our indices are created on-the-fly.

Right now, there's no simple solution to referencing the model workflow, since the IDs will be different on every instance. The application stack needs to search of the registered models by name and find their IDs and then store that for use in the application and then we need to look it up any time we need to run some one-off queries.

Having the ability to use an "alias" (or define our own ID) would definitely simplify our workflow.

If you need more details on our provisioning workflow issue, there's a thread here: https://forum.opensearch.org/t/provisioning-models-for-deployment-in-application/17036

rbpasker commented 4 months ago

+1 for the problem -1 for the solution

I just posted something like this on the slack channel: https://opensearch.slack.com/archives/C05BGJ1N264/p1718823006253339

why not make model name unique in the same way that an index is unique?

adding yet another layer of indirection -- alias -- seems to be going in the wrong direction, which is ease-of-use

if you're going to add aliases, why not just use @ notation for them like "model" : "@mymodelalias"or mustache "{{{mymodelalias}}}". then you can use them anywhere, like for connections and indexes.

ylwu-amzn commented 3 months ago

Thanks @rbpasker ,

why not make model name unique in the same way that an index is unique?

This is some historical issue. The model name was designed not as identifier (no uniqueness check) at first. Considering we have many users are using this and they may have models which same name. If we change the model name design to be an identifier, will be challenging to support backward compatible (for example, user have two models with same name, if they use same model nam, we don't know which model should be used). Adding an alias is easier for supporting backward compatible.

IanMenendez commented 3 months ago

I agree with @rbpasker the current proposed solution will add more friction and complexity.

will be challenging to support backward compatible (for example, user have two models with same name, if they use same model nam, we don't know which model should be used). Adding an alias is easier for supporting backward compatible.

easier != the best solution

Why not something like this?: if models.get(name).size > 1 then fail("there are 2 models with the same name please update accordingly) else do whatever

rbpasker commented 3 months ago

What purpose does a model name even serve, if it is not unique, is never used by the system anywhere else,and there is a separate model description?

Alternatively, make model name optionally unique. With no change to the model definition, it works exactly as now. By adding "model-unique":true to the model definition, system would do a uniqueness check, and return HTTP 409 Conflict if it already exists https://www.rfc-editor.org/rfc/rfc7231#section-6.5.8

This has the benefits of backward compatibility and having the smallest surface area for implementation and testing hi

ylwu-amzn commented 2 months ago

@rbpasker Thanks for your suggestion. Sounds an option.

Want to clarify some details

By adding "model-unique":true , system would do a uniqueness check, and return HTTP 409 Conflict if it already exists

Do you mean add another field model-unique in model or add a cluster setting to check uniqueness for all models?

Adding a field to model will be "smallest surface area for implementation and testing".

rbpasker commented 2 months ago

@ylwu-amzn i updated the comment to refer specifically to the model definition.