triton-inference-server / pytriton

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
https://triton-inference-server.github.io/pytriton/
Apache License 2.0
687 stars, 45 forks

Support Mac installation #44

Open zbloss opened 7 months ago

zbloss commented 7 months ago

It would be great to be able to install pytriton on Macs for ease-of-development. Even with the lack of CUDA support for Macs, being able to develop using only the CPU would be a real time saver.

As a far-reaching stretch goal, being able to use MPS on Macs would also be great.

As of now, pip install nvidia-pytriton fails with an error because cuda-python cannot be installed on my machine.

My Machine:

sricke commented 7 months ago

As far as I understand, that update should come from triton-inference-server, but since it's supposed to run on an NVIDIA GPU I don't think it'll happen.

What you can do is run an emulated Docker image by running it with the flag --platform linux/amd64. It will run on the CPU and be slower, but you'll be able to debug.
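A minimal sketch of that workaround, written as a small Python helper that only assembles the docker command line. The helper name docker_run_command, the ubuntu:22.04 image, the bash shell and the --rm / -it flags are assumptions chosen for illustration; only --platform linux/amd64 comes from the comment above.

```python
import shlex


def docker_run_command(image="ubuntu:22.04", platform="linux/amd64", shell="bash"):
    """Build the argument list for an interactive, emulated Linux container.

    The image and shell defaults are illustrative assumptions, not values
    prescribed by PyTriton.
    """
    return [
        "docker", "run",
        "--rm",                  # remove the container when the shell exits
        "-it",                   # interactive session with a terminal
        "--platform", platform,  # ask Docker for this platform (emulated on Apple silicon)
        image,
        shell,
    ]


if __name__ == "__main__":
    # Print the command so it can be copied into a terminal.
    print(shlex.join(docker_run_command()))
```

Inside the resulting shell you would still need to install Python and pip before trying pip install nvidia-pytriton.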

Hope that helps

jkosek commented 7 months ago

The limitation is that installation is supported only on Linux machines, because that is how the binaries and libraries are prepared. As @sricke mentioned, you can run an Ubuntu 22.04 system using Docker and install the recently published wheel with aarch64 support, which you can find here: https://github.com/triton-inference-server/pytriton/releases/download/v0.4.1/nvidia_pytriton-0.4.1-py3-none-manylinux_2_35_aarch64.whl (publishing to PyPI is in progress).

zbloss commented 7 months ago

Thanks for the quick fix, that's what I'll start with.

Is it possible to decouple pytriton from the cuda-python library?

jkosek commented 7 months ago

The cuda-python package is required by the tritonclient package, which is a PyTriton dependency. For now we are not able to remove it, but if you run the installation inside a Docker container with Ubuntu 22.04, there should be no issue even though CUDA is not available in the system.

zbloss commented 7 months ago

Thanks all!

martin-liu commented 2 months ago

@jkosek It seems cuda-python is optional for tritonclient, is it possible to make it optional in pytriton?

jkosek commented 2 months ago

@martin-liu you are correct. We are going to fix the dependencies on our side to avoid the CUDA dependency. Thanks!
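For readers unfamiliar with the idea of an optional dependency, here is a generic sketch of how code can detect whether a package is present without failing when it is missing. This is not PyTriton's or tritonclient's actual implementation; the helper names are hypothetical, and the assumption that cuda-python is importable as the top-level module cuda is labeled in the code.

```python
import importlib.util


def is_available(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def describe_cuda_support(has_cuda=None):
    """Report which code path an application would take.

    Assumption: the cuda-python distribution is importable as "cuda".
    """
    if has_cuda is None:
        has_cuda = is_available("cuda")
    if has_cuda:
        return "cuda-python found: CUDA features can be enabled"
    return "cuda-python missing: falling back to CPU-only features"


if __name__ == "__main__":
    print(describe_cuda_support())
```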

martin-liu commented 2 months ago

@jkosek, fantastic! I truly appreciate your efforts to resolve this issue. This adjustment will significantly enhance the local development experience on Mac. Thank you!

jkosek commented 2 months ago

@martin-liu the change has been applied on the main branch and will be available with the next wheel release.

martin-liu commented 2 months ago

@jkosek Great! Thank you very much!

martin-liu commented 1 month ago

@jkosek may I know the ETA for the next release?

jkosek commented 3 weeks ago

@martin-liu the change has just been released.

martin-liu commented 3 weeks ago

@jkosek Thank you for the update, I'll check it out!

martin-liu commented 3 weeks ago

@jkosek It seems there are no wheels available that support macOS. Are there any specific issues or blockers preventing the creation of macOS wheels?

jkosek commented 3 weeks ago

Hey @martin-liu. At the moment we are not able to provide a macOS wheel. PyTriton contains Triton binaries, which are bound to the Linux OS and a specific version of glibc. In that case, the only option we can suggest is to use a Docker image, e.g. Ubuntu, on macOS.
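To make the Linux and glibc constraint concrete, here is a rough sketch of a pre-install check. The function names are hypothetical, and the minimum glibc of 2.35 is an assumption read off the manylinux_2_35 tag in the wheel filename linked earlier in this thread; pip's real tag matching is more detailed than this.

```python
import platform

# Assumption: taken from the "manylinux_2_35" tag of the wheel linked earlier.
MIN_GLIBC = (2, 35)


def parse_version(text):
    """Turn a dotted version such as '2.35' into a tuple of ints, e.g. (2, 35)."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    return tuple(parts)


def wheel_is_compatible(system, libc_name, libc_version, min_glibc=MIN_GLIBC):
    """Rough check: Linux with glibc at or above the assumed minimum version."""
    if system != "Linux":
        return False  # macOS and Windows cannot load the bundled Linux binaries
    if libc_name != "glibc":
        return False  # e.g. musl-based distributions
    return parse_version(libc_version) >= min_glibc


if __name__ == "__main__":
    libc_name, libc_version = platform.libc_ver()
    print(wheel_is_compatible(platform.system(), libc_name, libc_version))
```

On macOS the check fails at the first condition, which is why the suggestion above is to run a Linux container instead.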

martin-liu commented 3 weeks ago

@jkosek got it, thanks