Closed MaTwickenham closed 1 week ago
Thank you for your interest in mllm. Unfortunately, mllm does not currently support heterogeneous compute.
We have preliminary support for CPU-NPU compute; the modeling code is in examples/main_qwen_npu.hpp. For now, we manually assign different nets to specific backends. Automatic backend assignment and memory management across backends are still to come.
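To illustrate what manually assigning nets to backends can look like, here is a minimal, self-contained sketch. It does not use mllm's actual API: `Backend`, `SubNet`, `build_plan`, and `count_backend_switches` are hypothetical names made up for this example, and the real code in examples/main_qwen_npu.hpp may be organized quite differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical backend tags; these are NOT mllm's real types.
enum class Backend { CPU, NPU };

// Hypothetical stand-in for one sub-network (e.g. a group of layers).
struct SubNet {
    std::string name;
    Backend backend;
};

// Manual assignment: each sub-network is pinned to a backend by hand.
// The split below is an assumption chosen purely for illustration.
std::vector<SubNet> build_plan() {
    return {
        {"embedding", Backend::CPU},
        {"decoder_blocks", Backend::NPU},
        {"lm_head", Backend::CPU},
    };
}

// Count the boundaries where consecutive sub-networks run on different
// backends. Each such boundary is a point where tensors would have to be
// handed from one backend to the other.
int count_backend_switches(const std::vector<SubNet>& plan) {
    int switches = 0;
    for (std::size_t i = 1; i < plan.size(); ++i) {
        if (plan[i].backend != plan[i - 1].backend) {
            ++switches;
        }
    }
    return switches;
}
```

With this hand-written plan, `count_backend_switches(build_plan())` reports two backend boundaries (CPU to NPU, then NPU back to CPU). Those cross-backend hand-offs are the kind of thing automatic assignment and cross-backend memory management would eventually have to handle.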
Hello author, I am doing research on heterogeneous computation for LLMs. While browsing the code, I noticed that mllm's Net class contains some subgraph-related code. My question is: since mllm already supports different computing backends such as CPU/GPU/NPU, can users assign a different backend delegate to each subgraph of a Net? If so, do you have any docs or scripts I could refer to? Thanks!