replicate / replicate-python

Python client for Replicate
https://replicate.com
Apache License 2.0

Deployment functionality #222

Open vishnubob opened 6 months ago

vishnubob commented 6 months ago

I would like to be able to spin up and shut down deployments from the API. From looking over the API and Python client, this doesn’t seem possible. Am I missing something, or would it be possible to add this functionality?

Thanks!

mattt commented 5 months ago

Hi, @vishnubob. You're correct that Replicate doesn't currently expose any APIs for managing deployments. However, you can configure your deployment with a min / max number of concurrent predictions to handle, and the autoscaler will spin model instances up and down based on inbound requests.

vishnubob commented 5 months ago

Hi @mattt, thanks for your response. I am using Replicate for an interactive photobooth, so my use case is a bit unusual. Since the installation is temporary, I only need the deployment while the installation is available. To reduce latency, I stand up a single-node deployment while the installation is available, and spin down the nodes when I strike. However, it's a complicated installation, and I sometimes forget to spin down the deployments during strike, so I end up paying for idle deployments. Being able to automate the deployment from the software would be a huge win.

For now, I have transitioned this part of the project to Tailscale, which lets me use my own server at home, but if I could automate the deployment, I would switch back to using Replicate.
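
To make the request concrete, here is a minimal sketch of the kind of automation described above: scale a deployment up when the installation opens and guarantee it is scaled back down at strike, even if the software crashes. As noted in this thread, Replicate does not expose a deployment management API, so the `scale` callable, its `(name, min_instances, max_instances)` signature, and the deployment name used below are all hypothetical placeholders, not part of the `replicate` client.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

# Hypothetical signature: (deployment name, min instances, max instances) -> None.
# Nothing like this exists in the replicate client per this thread; it stands in
# for whatever scaling call a future API might offer.
ScaleFn = Callable[[str, int, int], None]


@contextmanager
def warm_deployment(name: str, scale: ScaleFn, instances: int = 1) -> Iterator[None]:
    """Keep `instances` nodes warm inside the block, then scale to zero."""
    scale(name, instances, instances)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, so a forgotten strike step
        # cannot leave idle instances running.
        scale(name, 0, 0)
```

With a real scaling function plugged in, the photobooth's main loop would run inside `with warm_deployment("example/photobooth", scale): ...`, and teardown would no longer depend on remembering a manual step.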