davmacario / MDI-LLM

Implementation of Model-Distributed Inference for Large Language Models, built on top of LitGPT
MIT License

Add possibility to assign different devices to different nodes running on the same host #9

Closed by davmacario 7 months ago

davmacario commented 7 months ago

This would enable using MDI to "parallelize" inference on a single host, or at least to use multiple GPUs at the inference stage. I'm not sure about the actual performance improvement (torch's "distributed" should be faster in any case), but it shouldn't be too difficult to implement.
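One way the per-node device assignment could look is a simple mapping from node identifier to device string, with a CPU fallback when the requested GPU is unavailable. This is a hypothetical sketch, not the actual MDI-LLM configuration schema: the names `node_devices` and `resolve_device`, and the `cuda_available` parameter (which would come from `torch.cuda.is_available()` in real code), are all illustrative.

```python
def resolve_device(node_id: str, node_devices: dict, cuda_available: bool) -> str:
    """Return the device string assigned to a node on this host.

    Falls back to "cpu" when a CUDA device is requested but CUDA
    is unavailable, so every node can still start.
    """
    requested = node_devices.get(node_id, "cpu")
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


# Hypothetical config: two nodes on the same host, each pinned to its own GPU.
node_devices = {"node0": "cuda:0", "node1": "cuda:1"}

print(resolve_device("node0", node_devices, cuda_available=True))   # cuda:0
print(resolve_device("node1", node_devices, cuda_available=False))  # cpu
```

Each node would then move its model shard with `model.to(resolve_device(...))`, so two nodes on one host no longer contend for the same device.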