Open Xompute opened 2 years ago
Thanks for the suggestion @Xompute! This is totally something we want to make possible.
Making an API that is compatible with every possible infrastructure provider is likely going to get messy and hard to maintain. One thing we've been thinking about is having some concept of an adapter for using Cog models with different providers, kind of like the Redis queue server we have. These adapters could either be part of Cog core, or maintained as separate libraries somehow. You could imagine installing a cog-sagemaker Python package in your model, which could then somehow be configured as the interface.
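To make the adapter idea concrete, here is a minimal sketch of what such an interface might look like. Everything here is hypothetical: ProviderAdapter, SageMakerAdapter, and the mapping logic are illustrative assumptions, not real Cog or cog-sagemaker APIs.

```python
# Hypothetical sketch of a provider adapter interface for Cog.
# ProviderAdapter and SageMakerAdapter are illustrative, not real Cog APIs.
from abc import ABC, abstractmethod


class ProviderAdapter(ABC):
    """Translates between a provider's request/response schema and Cog's."""

    @abstractmethod
    def to_cog_request(self, provider_body: dict) -> dict:
        """Turn the provider's request body into Cog's {"input": ...} shape."""

    @abstractmethod
    def to_provider_response(self, cog_body: dict) -> dict:
        """Turn Cog's response body into what the provider expects."""


class SageMakerAdapter(ProviderAdapter):
    # Placeholder mapping: wrap the raw provider body as Cog's "input",
    # and pass Cog's "output" field back through.
    def to_cog_request(self, provider_body: dict) -> dict:
        return {"input": provider_body}

    def to_provider_response(self, cog_body: dict) -> dict:
        return {"output": cog_body.get("output")}
```

A cog-sagemaker-style package could then ship one such adapter class, and Cog's HTTP server could apply it on every request/response pair.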
I'd assume that cloud providers won't change their APIs anytime soon (Azure most likely will, as it is still in beta), since all customers would need to migrate/upgrade, so there is some stability guarantee. But anyway, based on my limited research, I'd instead suggest (if at all viable) aiming for full customizability on Cog's end for, at a minimum:

- the predict.py script's function names, setup() and predict();
- the HTTP request/response JSON schemas (current situation elaborated below).

I decided to give Cog a spin, at least on GCP, as it seemed the most compatible provider for testing whether Cog could already be used there. Alas, I had no success.
In general, Cog was simple to build, import into GCP's model registry, and deploy to a Vertex AI endpoint (deployment is not as viable on AWS/Azure due to conflicts between their endpoint route names and Cog's, and I'd suspect the problems outlined below also apply to AWS and Azure).
In summary, the hardcoded predict API on Cog's side seems to be incompatible with all the cloud providers, including GCP.
After reading through and following the end-to-end Vertex AI docs for custom container deployment and prediction requirements (one useful example: https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#prediction), I found that Vertex Model Endpoints require the prediction API requests/responses to follow a specific JSON schema, which is of course incompatible with Cog's.
Vertex AI prediction requests use a predefined schema object. I've tested two different schemas as follows:
# Vertex AI requires:
{
"instances": [
{ "instance_key_1": "value", ... }, ...
]
}
# COG #1 attempt:
{
"input": {
"image": "https://mypic.jpg"
}
}
# COG #2 - another attempt to adhere to the GCP's expected schema rather bluntly:
{
"instances": [{
"input": {
"image": "https://mypic.jpg"
}
}]
}
Full details on this API's requirements: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints/predict
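For illustration, the gap between the two schemas above could in principle be bridged by a thin translation layer in front of Cog. The following is a pure sketch based only on the request shapes shown above (the "predictions" response wrapper is Vertex AI's documented response format); none of this exists in Cog today.

```python
# Sketch of the request/response translation a hypothetical Vertex AI
# adapter would need, based on the two JSON schemas shown above.

def vertex_to_cog(vertex_body: dict) -> list:
    # Vertex wraps inputs in an "instances" array; Cog expects a single
    # {"input": {...}} object per prediction request.
    return [{"input": instance} for instance in vertex_body["instances"]]

def cog_to_vertex(cog_outputs: list) -> dict:
    # Vertex expects responses wrapped in a "predictions" array.
    return {"predictions": cog_outputs}

# Example: the Vertex-style request body from above becomes one Cog request.
body = {"instances": [{"image": "https://mypic.jpg"}]}
cog_requests = vertex_to_cog(body)
# cog_requests == [{"input": {"image": "https://mypic.jpg"}}]
```

Whether such a shim lives inside Cog, in an adapter package, or in a sidecar container is exactly the design question raised above.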
Next, to give an idea of the errors, below are the responses to those two example attempts. The path to the request file (COG #1 and COG #2 shown above) is exported to the variable INPUT_DATA_FILE and then sent to a GCP endpoint with a deployed Cog model via curl as follows:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict \
-d "@${INPUT_DATA_FILE}"
Outcomes:
# Response for Cog #1 schema from above
{
"error": {
"code": 400,
"message": "Invalid JSON payload received. Unknown name \"input\": Cannot find field.",
"status": "INVALID_ARGUMENT",
"details": [
{
"@type": "type.googleapis.com/google.rpc.BadRequest",
"fieldViolations": [
{
"description": "Invalid JSON payload received. Unknown name \"input\": Cannot find field."
}
]
}
]
}
}
# Response for Cog #2 schema from above
Internal Server Error
Note that using cog predict is not viable either, as it still violates Vertex AI's expected request/response JSON schemas.
Nevertheless, I hope these evaluations of the current state of Cog are at least somewhat helpful for reasoning about the future directions of Cog's development: for example, whether support for full compatibility with the current set of cloud providers' requirements will ever be implemented (e.g. to reach the critical level of interest from business users to start leveraging Cog in production), or whether Cog should instead become the best usable tool for non-cloud deployments.
It would be nice to get pinged once development toward cloud-provider compatibility is underway, as I suppose I'd be able to help out, but currently I think Cog still has a few other bridges to cross first.
Hi @bfirsh 👋 are there any updates on this topic? I'd be interested in deploying Cog images on SageMaker.
Same!
Cog's (1) endpoint names (e.g. /predictions in URLs) and (2) predict.py entry function names (setup() and predict()) must be customizable, i.e. possible to rename (names only; currently they are hardcoded). Otherwise Cog is unusable with some cloud providers due to their own deployment requirements, as each differs in its ML Docker inference implementation design.
This could be achieved via some refactoring, but it must be thought through properly. Do you plan to address this within the scope of the Cog project? If so, what would be the roadmap/plan/priority?
This functionality is a deal-breaker for production/business usage on hyperscaler AI infrastructure outside of Replicate or custom-infrastructure deployments.
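Until such renaming is supported in Cog itself, one conceivable workaround is a tiny in-container shim that remaps a provider's required route onto Cog's hardcoded /predictions endpoint. The sketch below assumes Cog's HTTP server is listening locally on port 5000 and uses SageMaker's /invocations route as the example; the route table and ports are illustrative, and this is not an endorsed or tested deployment pattern.

```python
# Hypothetical sketch: a shim HTTP server that forwards a provider's
# required route (e.g. SageMaker's /invocations) to Cog's hardcoded
# /predictions endpoint. Route table and addresses are assumptions.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

ROUTE_MAP = {"/invocations": "/predictions"}  # provider path -> Cog path
COG_SERVER = "http://localhost:5000"          # where Cog's server listens

def remap(path: str) -> str:
    """Translate a provider route to Cog's route, if a mapping exists."""
    return ROUTE_MAP.get(path, path)

class ShimHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the provider's request body and forward it to Cog unchanged,
        # only rewriting the URL path.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        upstream = Request(COG_SERVER + remap(self.path), data=body,
                           headers={"Content-Type": "application/json"})
        with urlopen(upstream) as resp:
            payload = resp.read()
        self.send_response(resp.status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To run the shim on the port the provider expects:
# HTTPServer(("", 8080), ShimHandler).serve_forever()
```

This only papers over the endpoint-name issue; the function names inside predict.py and the JSON schemas discussed above would still need changes in Cog proper.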
Legend: ✅ = compatible; ❌ = incompatible with Cog as of today

GCP: endpoint names ✅, predict.py function names ✅ (GCP supports a configurable predictRoute; see this API for confirmation and details: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/ModelContainerSpec)

Azure: endpoint names ✅, predict.py function names ❌ (Azure requires its own fixed function names in the inference script)

AWS: endpoint names ❌, predict.py function names ✅