hao-ai-lab / LookaheadDecoding

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://arxiv.org/abs/2402.02057
Apache License 2.0

Tensor parallel #58

Closed: wangyuwen1999 closed this issue 3 months ago

wangyuwen1999 commented 5 months ago
Screenshot 2024-04-15 11 13 09

I want to use tensor parallelism with lookahead decoding, but I cannot find the config option to enable it. Can you give me an example?

Viol2000 commented 4 months ago

Tensor parallelism is not compatible with lookahead decoding yet. You can use tensor parallelism on its own (without lookahead). L28-L35 in your figure are an example of that if you set GPUS to a larger value.