Open Xompute opened 2 years ago
Thanks for the suggestion @Xompute! This is totally something we want to make possible.
Making an API that is compatible with every possible infrastructure provider is likely going to get messy and hard to maintain. One thing we've been thinking about is having some concept of an adapter for using Cog models with different providers, kind of like the Redis queue server we have. These adapters could either be part of Cog core, or maintained as separate libraries somehow. You could imagine installing a cog-sagemaker Python package in your model, which could then somehow be configured as the interface.
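To make the adapter idea concrete, here is a minimal sketch of what such an interface might look like. Everything here is hypothetical: ProviderAdapter, SageMakerAdapter, and the mapping logic are illustrative assumptions, not real Cog or cog-sagemaker APIs.

```python
# Hypothetical sketch of a provider adapter interface for Cog.
# ProviderAdapter and SageMakerAdapter are illustrative, not real Cog APIs.
from abc import ABC, abstractmethod


class ProviderAdapter(ABC):
    """Translates between a provider's request/response schema and Cog's."""

    @abstractmethod
    def to_cog_request(self, provider_body: dict) -> dict:
        """Turn the provider's request body into Cog's {"input": ...} shape."""

    @abstractmethod
    def to_provider_response(self, cog_body: dict) -> dict:
        """Turn Cog's response body into what the provider expects."""


class SageMakerAdapter(ProviderAdapter):
    # Placeholder mapping: wrap the raw provider body as Cog's "input",
    # and pass Cog's "output" field back through.
    def to_cog_request(self, provider_body: dict) -> dict:
        return {"input": provider_body}

    def to_provider_response(self, cog_body: dict) -> dict:
        return {"output": cog_body.get("output")}
```

A cog-sagemaker-style package could then ship one such adapter class, and Cog's HTTP server could apply it on every request/response pair.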
I'd assume that cloud providers won't change their APIs anytime soon (Azure most likely will, as it is still in beta), since all customers would need to migrate/upgrade, so there is some stability guarantee. But anyway, based on my limited research, I'd instead suggest (if at all viable) aiming for full customizability on Cog's end for, at a minimum:

- the predict.py script's function names, setup() and predict();
- the HTTP request/response JSON schemas (current situation elaborated below).

I decided to give Cog a spin, at least on GCP, as it seemed the most compatible provider for testing whether Cog could already be used there. Alas, I had no success.
In general, Cog was simple to build, import into GCP's model registry, and deploy to a Vertex AI endpoint (deployment is not as viable on AWS/Azure due to conflicts between their endpoint route names and Cog's, and I'd suspect the problems outlined below also apply to AWS and Azure).
In summary, the hardcoded predict API on Cog's side seems to be incompatible with all the cloud providers, including GCP.
After reading through and following the end-to-end Vertex AI docs for custom container deployment and prediction requirements (one useful example: https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#prediction), I found that Vertex Model Endpoints require the prediction API requests/responses to follow a specific JSON schema, which is of course incompatible with Cog's.
Vertex AI prediction requests use a predefined schema object. I've tested two different schemas as follows:
# Vertex AI requires:
{
"instances": [
{ "instance_key_1": "value", ... }, ...
]
}
# COG #1 attempt:
{
"input": {
"image": "https://mypic.jpg"
}
}
# COG #2 - another attempt to adhere to the GCP's expected schema rather bluntly:
{
"instances": [{
"input": {
"image": "https://mypic.jpg"
}
}]
}
Full details on this API's requirements: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints/predict
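For illustration, the gap between the two schemas above could in principle be bridged by a thin translation layer in front of Cog. The following is a pure sketch based only on the request shapes shown above (the "predictions" response wrapper is Vertex AI's documented response format); none of this exists in Cog today.

```python
# Sketch of the request/response translation a hypothetical Vertex AI
# adapter would need, based on the two JSON schemas shown above.

def vertex_to_cog(vertex_body: dict) -> list:
    # Vertex wraps inputs in an "instances" array; Cog expects a single
    # {"input": {...}} object per prediction request.
    return [{"input": instance} for instance in vertex_body["instances"]]

def cog_to_vertex(cog_outputs: list) -> dict:
    # Vertex expects responses wrapped in a "predictions" array.
    return {"predictions": cog_outputs}

# Example: the Vertex-style request body from above becomes one Cog request.
body = {"instances": [{"image": "https://mypic.jpg"}]}
cog_requests = vertex_to_cog(body)
# cog_requests == [{"input": {"image": "https://mypic.jpg"}}]
```

Whether such a shim lives inside Cog, in an adapter package, or in a sidecar container is exactly the design question raised above.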
Next, to give an idea of the errors, below are the responses to those two example attempts. The path to the request file (COG #1 and COG #2 shown above) is exported to the variable INPUT_DATA_FILE and then sent to a GCP endpoint with a deployed Cog model via curl as follows:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict \
-d "@${INPUT_DATA_FILE}"
Outcomes:
# Response for Cog #1 schema from above
{
"error": {
"code": 400,
"message": "Invalid JSON payload received. Unknown name \"input\": Cannot find field.",
"status": "INVALID_ARGUMENT",
"details": [
{
"@type": "type.googleapis.com/google.rpc.BadRequest",
"fieldViolations": [
{
"description": "Invalid JSON payload received. Unknown name \"input\": Cannot find field."
}
]
}
]
}
}
# Response for Cog #2 schema from above
Internal Server Error
Note that using cog predict is not viable either, as it still violates Vertex AI's expected request/response JSON schemas.
Nevertheless, I hope these evaluations of the current state of Cog are at least somewhat helpful for reasoning about the future directions of Cog's development: for example, whether support for full compatibility with the current set of cloud providers' requirements will ever be implemented (e.g. to reach the critical level of interest from business users to start leveraging Cog in production), or whether Cog should instead become the best usable tool for non-cloud deployments.
It would be nice to get pinged once development toward cloud-provider compatibility is underway, as I suppose I'd be able to help out, but currently I think Cog still has a few other bridges to cross first.
Hi @bfirsh 👋 are there any updates on this topic? I'd be interested in deploying Cog images on SageMaker.
Same!
Cog's (1) endpoint names (e.g. /predictions in URLs) and (2) predict.py entry function names (setup() and predict()) must be customizable, i.e. possible to rename (names only; currently they are hardcoded). Otherwise Cog is unusable with some cloud providers due to their own deployment requirements, as each differs in its ML Docker inference implementation design.
This could be achieved via some refactoring, but it must be thought through properly. Do you plan to address this within the scope of the Cog project? If so, what would be the roadmap/plan/priority?
This functionality is a deal-breaker for production/business usage on hyperscaler AI infrastructure outside of Replicate or custom-infrastructure deployments.
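Until such renaming is supported in Cog itself, one conceivable workaround is a tiny in-container shim that remaps a provider's required route onto Cog's hardcoded /predictions endpoint. The sketch below assumes Cog's HTTP server is listening locally on port 5000 and uses SageMaker's /invocations route as the example; the route table and ports are illustrative, and this is not an endorsed or tested deployment pattern.

```python
# Hypothetical sketch: a shim HTTP server that forwards a provider's
# required route (e.g. SageMaker's /invocations) to Cog's hardcoded
# /predictions endpoint. Route table and addresses are assumptions.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

ROUTE_MAP = {"/invocations": "/predictions"}  # provider path -> Cog path
COG_SERVER = "http://localhost:5000"          # where Cog's server listens

def remap(path: str) -> str:
    """Translate a provider route to Cog's route, if a mapping exists."""
    return ROUTE_MAP.get(path, path)

class ShimHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the provider's request body and forward it to Cog unchanged,
        # only rewriting the URL path.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        upstream = Request(COG_SERVER + remap(self.path), data=body,
                           headers={"Content-Type": "application/json"})
        with urlopen(upstream) as resp:
            payload = resp.read()
        self.send_response(resp.status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To run the shim on the port the provider expects:
# HTTPServer(("", 8080), ShimHandler).serve_forever()
```

This only papers over the endpoint-name issue; the function names inside predict.py and the JSON schemas discussed above would still need changes in Cog proper.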
Legend: ✅ = compatible; ❌ = incompatible with Cog as of today

GCP: endpoint names ✅, predict.py function names ✅ (GCP supports a configurable predictRoute; see this API for confirmation and details: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/ModelContainerSpec)

Azure: endpoint names ✅, predict.py function names ❌ (Azure requires its own fixed function names in the inference script)

AWS: endpoint names ❌, predict.py function names ✅