mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

Support of heterogeneous devices #3018

Closed. musram closed this issue 1 week ago.

musram commented 1 week ago

🚀 Feature

Support for heterogeneous devices

Motivation

Run inference across heterogeneous GPUs (e.g., shard a model over Metal and Vulkan devices simultaneously), which would let me make use of my household devices.

Is it possible to share some direction on how to implement this in MLC LLM?

@tqchen @junrushao @MasterJH5574

MasterJH5574 commented 1 week ago

Hi @musram, thank you for the question. For now we don't support running on heterogeneous GPUs. We plan to work on introducing REST API-level protocols to bring different servers (which may run on different GPUs) together.
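
To make the REST-level direction concrete, here is a minimal sketch of what coordinating independent servers could look like today. It assumes each machine runs its own full copy of the model via `mlc_llm serve` (e.g. `--device metal` on one machine and `--device vulkan` on another) and exposes the OpenAI-compatible `/v1/chat/completions` endpoint; the addresses, port, and model id below are hypothetical. Note this only routes whole requests between servers, it does not shard a single model across heterogeneous GPUs.

```python
# Sketch (not part of MLC LLM): round-robin routing between two
# independently launched MLC LLM servers, e.g.:
#   machine A:  mlc_llm serve <model> --device metal  --port 8000
#   machine B:  mlc_llm serve <model> --device vulkan --port 8000
# Each server holds the full model; this script only distributes requests.

import itertools
import requests

# Hypothetical addresses of the two home machines.
SERVERS = [
    "http://192.168.1.10:8000",  # Metal machine
    "http://192.168.1.11:8000",  # Vulkan machine
]
_next_server = itertools.cycle(SERVERS)


def chat(prompt: str) -> str:
    """Send one chat-completion request to the next server in round-robin order."""
    base_url = next(_next_server)
    resp = requests.post(
        f"{base_url}/v1/chat/completions",
        json={
            "model": "default",  # served model id; adjust to your deployment
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=120,
    )
    resp.raise_for_status()
    # OpenAI-compatible response shape: choices[0].message.content
    return resp.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Hello from a heterogeneous home cluster!"))
```

A future protocol as described above would presumably move this coordination (and potentially model sharding) into the serving engine itself rather than a client-side router like this.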