entrpn / serving-model-cards

Collection of OSS models that are containerized into a serving container
Apache License 2.0

Creating a Vertex AI endpoint from a pre-trained model for inference #9

Open · StateGovernment opened this issue 1 year ago

StateGovernment commented 1 year ago

Labels: How to / Suggestions

I am looking for suggestions on how to create a Vertex AI endpoint from a trained DreamBooth Stable Diffusion model, in order to run inference on the model through the endpoint. How do I go about this?

entrpn commented 1 year ago

@StateGovernment take a look at this example. It is PyTorch only, so you'll have to convert the model to PyTorch as demonstrated in the training DreamBooth repo.

In your case, since you are using a local model, you need to modify the Dockerfile to copy your model into the image. Add a new line here:

COPY dreambooth-folder .
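
For context, the modified Dockerfile might look roughly like this. This is a sketch: the ARG names are taken from the build command below, and the surrounding instructions (working directory, etc.) are assumptions, not the repo's exact Dockerfile.

# Sketch of the relevant Dockerfile section; surrounding lines are illustrative.
ARG model_name
ARG use_xformers
ARG model_revision
WORKDIR /app
# New line: bake the local DreamBooth weights into the image.
COPY dreambooth-folder .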

Then call the docker build like:

PROJECT_ID=<project-id>
docker build -t gcr.io/$PROJECT_ID/serving-sd:latest --build-arg model_name=dreambooth-folder --build-arg use_xformers=1 --build-arg model_revision=main
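
After the build succeeds, you'd push the image to the registry so Vertex AI can pull it. These are standard Docker and gcloud commands; the tag matches the build above:

gcloud auth configure-docker
docker push gcr.io/$PROJECT_ID/serving-sd:latest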

use_xformers only works with fp16 weights. If you have an fp32 model, use use_xformers=0.
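
For example, an fp32 build would be the same command with xformers disabled (same assumed build args as above):

docker build -t gcr.io/$PROJECT_ID/serving-sd:latest --build-arg model_name=dreambooth-folder --build-arg use_xformers=0 --build-arg model_revision=main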

I think that should work.

StateGovernment commented 1 year ago

Thank you for the suggestion.

I see. Following the code, I believe a REST endpoint is being deployed using FastAPI through Docker. But our use case actually involves creating a proper Vertex AI Endpoint that can serve inference requests. Is there a way to deploy the model to a Vertex AI Endpoint instead of a self-hosted FastAPI server?

Ref: https://console.cloud.google.com/vertex-ai/endpoints

entrpn commented 1 year ago

This is a deployment pattern for Vertex AI endpoints using custom containers. The README describes how to upload the container as a model and deploy it to an endpoint, following the Vertex AI documentation: https://cloud.google.com/vertex-ai/docs/predictions/use-custom-container
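
As a sketch of those steps with the gcloud CLI: upload the container as a Vertex AI model, create an endpoint, and deploy the model to it. The display names, machine type, accelerator, and the predict/health routes and port below are assumptions; use whatever routes and port your serving container actually exposes.

PROJECT_ID=<project-id>
REGION=us-central1

# Upload the custom serving container as a Vertex AI model.
gcloud ai models upload \
  --region=$REGION \
  --display-name=serving-sd \
  --container-image-uri=gcr.io/$PROJECT_ID/serving-sd:latest \
  --container-predict-route=/predict \
  --container-health-route=/health \
  --container-ports=8080

# Create an endpoint and deploy the uploaded model to it.
gcloud ai endpoints create --region=$REGION --display-name=serving-sd-endpoint
gcloud ai endpoints deploy-model <endpoint-id> \
  --region=$REGION \
  --model=<model-id> \
  --display-name=serving-sd-deployment \
  --machine-type=n1-standard-8 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --traffic-split=0=100

Once deploy-model finishes, the endpoint you see at https://console.cloud.google.com/vertex-ai/endpoints serves online prediction requests against the container's predict route.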