Closed by neubig 8 months ago
For locally hosted models from Hugging Face, it would be good to support multi-GPU inference.
Currently, inference is handled by the Hugging Face provider: https://github.com/zeno-ml/zeno-build/blob/23d30803bf27d5669ab666b5f05c95f5283b780b/zeno_build/models/providers/huggingface_utils.py#L13

Any code supporting multi-GPU inference would need to be added there. Contributions are welcome!
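As a rough starting point, here is one way such support could be sketched. This is not the zeno-build API itself; the helper name `multi_gpu_kwargs` is hypothetical. It builds the extra keyword arguments for `transformers`' `from_pretrained`, where `device_map="auto"` (backed by Accelerate) shards a model's weights across all visible GPUs:

```python
import os

def multi_gpu_kwargs() -> dict:
    """Hypothetical helper: extra from_pretrained kwargs for multi-GPU use.

    With more than one visible GPU, device_map="auto" asks the
    `accelerate` library to shard the model's weights across devices.
    """
    # Infer GPU count from CUDA_VISIBLE_DEVICES, e.g. "0,1" -> 2 GPUs;
    # if unset, conservatively assume a single device.
    visible = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    num_gpus = len([d for d in visible.split(",") if d.strip()]) if visible else 1
    if num_gpus > 1:
        return {"device_map": "auto"}  # requires `pip install accelerate`
    return {}
```

A caller in `huggingface_utils.py` could then load the model with something like `AutoModelForCausalLM.from_pretrained(model_name, **multi_gpu_kwargs())`, leaving single-GPU behavior unchanged when only one device is visible.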