iterative / mlem

🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞
https://mlem.ai
Apache License 2.0

Serve models trained on GPU on CPU, and vice versa #658

Open aguschin opened 1 year ago

aguschin commented 1 year ago

Right now, if you train a model on GPU, save it with MLEM, and then try to load or serve it on CPU, it simply breaks. The only workaround that exists now is to move the model to CPU before saving it. We need to make this work without manual intervention.
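A minimal sketch of the current workaround, assuming a PyTorch model and MLEM's `save` API (the `nn.Linear` stand-in and the artifact name are illustrative):

```python
import torch
import torch.nn as nn
from mlem.api import save

# Stand-in for any torch.nn.Module trained on GPU.
model = nn.Linear(4, 2)
if torch.cuda.is_available():
    model = model.to("cuda")  # parameters now live in GPU memory

# Workaround today: move the model back to CPU *before* saving, so the
# serialized checkpoint contains CPU tensors and loads anywhere.
sample = torch.zeros(1, 4)
save(model.to("cpu"), "linear-model", sample_data=sample)
```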

This applies not only to serving the model locally, but also to deploying it: for example, fly.io doesn't offer GPUs, so even if you managed to deploy the model there, it would break at load time.
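For reference, this is the plain-PyTorch failure and remedy that a fix would likely have to wrap; assuming MLEM's torch plugin deserializes with `torch.load`, it would need to pass something like `map_location` through:

```python
import torch

# On a CPU-only machine, loading a checkpoint saved from GPU raises
# "RuntimeError: Attempting to deserialize object on a CUDA device
# but torch.cuda.is_available() is False" ...
# model = torch.load("model.pt")  # breaks without a GPU

# ...unless storages are explicitly remapped to the CPU:
model = torch.load("model.pt", map_location=torch.device("cpu"))
```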

Vice versa: if the model was trained on CPU but you want to serve it on GPU, MLEM should provide a way to do this. A special case is when you load_meta your model (along with its pre/post-processors): you then work with a MlemModel object (not the PyTorch model you get from load), and you need a way to specify the device it should run on, as sketched below.
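A hedged sketch of the gap: `load` returns the raw model, which you can move yourself, while `load_meta` returns a `MlemModel` wrapper; the `device` argument below is hypothetical, not an existing MLEM parameter:

```python
import torch
from mlem.api import load, load_meta

# With plain `load` you get the underlying PyTorch model back and can
# move it to a device yourself:
model = load("linear-model")
if torch.cuda.is_available():
    model = model.to("cuda")

# With `load_meta(..., load_value=True)` you get a MlemModel wrapper
# (model plus pre/post-processors) and currently no way to pick the
# device. Something like this hypothetical argument could close the gap:
# meta = load_meta("linear-model", load_value=True, device="cuda")
meta = load_meta("linear-model", load_value=True)
```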