Is Subgraph Heterogeneous Compute Available in MLLM?

UbiquitousLearning / mllm

Fast Multimodal LLM on Mobile Devices

https://ubiquitouslearning.github.io/mllm_website

MIT License

Is Subgraph Heterogeneous Compute Available in MLLM? #97

Closed by MaTwickenham 1 week ago

MaTwickenham commented 3 months ago

Hello author, I am doing research on heterogeneous computation for LLMs. While browsing the code, I noticed that MLLM's Net class contains some subgraph-related logic. My question is: since MLLM already supports different computing backends such as CPU/GPU/NPU, can users assign different backend delegates to different subgraphs of a Net? If so, do you have any docs or scripts I can refer to? Thanks!

yirongjie commented 3 months ago

Thank you for your interest in mllm. I'm sorry, but mllm does not currently support heterogeneous compute.

oreomaker commented 2 months ago

We now have preliminary support for CPU-NPU compute; the modeling code is in examples/main_qwen_npu.hpp. For now, we manually assign different nets to specific backends. Automatic backend assignment and memory management across backends are still to come.
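
For readers wondering what "manually assigning nets to backends" could look like in principle, here is a minimal, self-contained C++ sketch. It does not use mllm's actual API: `Backend`, `SubNet`, `buildManualPlan`, and `countBackendSwitches` are hypothetical names invented for illustration, and the choice of which sub-network goes to which backend is an assumption, not taken from examples/main_qwen_npu.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical backend tags; NOT mllm's real enum.
enum class Backend { CPU, NPU };

// Hypothetical stand-in for one sub-network of a model.
struct SubNet {
    std::string name;
    Backend backend;  // picked by hand, not by an automatic partitioner
};

// Build a hand-written execution plan for a model with `numLayers` blocks.
// Assumption for illustration only: linear layers go to the NPU,
// everything else stays on the CPU.
std::vector<SubNet> buildManualPlan(int numLayers) {
    assert(numLayers >= 0);
    std::vector<SubNet> plan;
    plan.push_back({"embedding", Backend::CPU});
    for (int i = 0; i < numLayers; ++i) {
        const std::string prefix = "block" + std::to_string(i);
        plan.push_back({prefix + ".linear", Backend::NPU});
        plan.push_back({prefix + ".attention", Backend::CPU});
    }
    plan.push_back({"lm_head", Backend::CPU});
    return plan;
}

// Count how often consecutive sub-networks run on different backends.
// Each switch implies a tensor hand-off between backends, which is where
// cross-backend memory management would come into play.
int countBackendSwitches(const std::vector<SubNet>& plan) {
    int switches = 0;
    for (std::size_t i = 1; i < plan.size(); ++i) {
        if (plan[i].backend != plan[i - 1].backend) {
            ++switches;
        }
    }
    return switches;
}
```

Calling `buildManualPlan(2)` yields six sub-networks with four backend switches. Every such switch is a point where data has to move between CPU and NPU memory, which gives a rough idea of why automatic assignment and cross-backend memory management are the harder, still-pending part.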