vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Doc]: Urgent MoE question #5413

Closed ymmm-4 closed 1 month ago

ymmm-4 commented 1 month ago

📚 The doc issue

I am confused about the MoE layer in the Jamba block. There are many versions of MoE, and the paper does not define the mathematics in detail or provide diagrams for understanding the expert system. Could you please point me to, or share, the exact paper that Jamba follows?

Suggest a potential alternative/fix

No response

mgoin commented 1 month ago

Hi @ymmm-4 Jamba is not supported in vLLM yet, so we aren't a great resource for this detail. Maybe you could take a look at the draft PR in progress to implement Jamba? https://github.com/vllm-project/vllm/pull/4115

ymmm-4 commented 1 month ago

I would like to know about the MoE layer only. There is no description of the experts, equations, gating mechanism, sparsity, etc. I want to understand the MoE. Could you please explain it to me?
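For readers arriving at this issue with the same question: below is a minimal sketch of a sparse top-k gated MoE layer in the general Mixtral/Switch-Transformer style, intended only to illustrate what "experts", "gating" and "sparsity" refer to. It is an assumption for illustration, not the Jamba paper's exact formulation and not vLLM's implementation (see PR #4115 for that); all class and parameter names here are hypothetical.

```python
# Illustrative sketch of a sparse top-k gated MoE layer (Mixtral/Switch style).
# NOT the Jamba or vLLM implementation; names and shapes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    def __init__(self, hidden_size: int, ffn_size: int, num_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        # Router (gate): a linear map from hidden states to per-expert logits.
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        # Each expert is an ordinary feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_size)
        logits = self.router(x)                      # (num_tokens, num_experts)
        weights, indices = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)         # normalize over the chosen experts
        out = torch.zeros_like(x)
        # Sparsity: each token is processed only by its top-k experts,
        # and the expert outputs are mixed with the gate weights.
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out
```

The key idea is that the learned router scores every expert per token, only the top-k experts actually run on that token (everything else is skipped, which is the sparsity), and their outputs are combined with the softmax-normalized gate weights. Jamba's MoE layers follow this general pattern, but for the exact configuration please refer to the Jamba paper and the draft PR linked above.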
