bd-iaas-us / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Misc]: Finding possible more interesting areas #17

Closed chizhang118 closed 2 months ago

chizhang118 commented 3 months ago

Anything you want to discuss about vllm.

Checking recent paper lists to figure out possible interesting areas to work on.

chizhang118 commented 3 months ago

https://github.com/chizhang118/Awesome-LLM-Inference

thesues commented 3 months ago

Prefill and decoding disaggregation requires good pipeline parallelism (PP) performance. Other projects such as SwiftTransformer already support PP. Maybe we could try to tune PP performance.

chizhang118 commented 3 months ago

> Prefill and decoding disaggregation requires good pipeline parallelism (PP) performance. Other projects such as SwiftTransformer already support PP. Maybe we could try to tune PP performance.

I remember vLLM has not implemented PP support yet. Do you mean adding PP support to vLLM?
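
The pipeline parallelism being discussed can be sketched as follows: a model's layers are partitioned into contiguous stages, and micro-batches flow through the stages so that different stages work on different micro-batches at the same tick. This is a minimal, forward-only simulation in plain Python for illustration; it is not vLLM's or SwiftTransformer's actual API, and the layer/stage names are hypothetical.

```python
# Illustrative sketch of pipeline parallelism (PP), not vLLM's API.
# Layers are split into contiguous stages; micro-batches advance one
# stage per tick, so stages overlap work the way devices would.

def split_into_stages(layers, num_stages):
    """Partition layers into contiguous stages of near-equal size."""
    base, rem = divmod(len(layers), num_stages)
    stages, start = [], 0
    for i in range(num_stages):
        size = base + (1 if i < rem else 0)
        stages.append(layers[start:start + size])
        start += size
    return stages

def run_pipeline(stages, micro_batches):
    """Naive single-process simulation of a forward-only PP schedule.

    Returns (outputs, timeline); timeline records which (stage, micro_batch)
    pairs are active at each tick, showing the fill/steady/drain phases a
    real PP runtime exploits across devices.
    """
    num_stages = len(stages)
    outputs = list(micro_batches)
    timeline = []
    # Micro-batch mb reaches stage s at tick = mb + s.
    for tick in range(num_stages + len(micro_batches) - 1):
        active = []
        for s in range(num_stages):
            mb = tick - s
            if 0 <= mb < len(micro_batches):
                for layer in stages[s]:
                    outputs[mb] = layer(outputs[mb])
                active.append((s, mb))
        timeline.append(active)
    return outputs, timeline

# Toy "layers": each just adds 1 to its input.
layers = [lambda x: x + 1 for _ in range(8)]
stages = split_into_stages(layers, num_stages=4)   # 2 layers per stage
outs, timeline = run_pipeline(stages, micro_batches=[0, 10, 20])
print(outs)       # every micro-batch has passed through all 8 layers
print(timeline[1])  # tick 1: stage 0 on batch 1 while stage 1 handles batch 0
```

Tuning PP in practice is about keeping that timeline dense: balancing layer counts per stage and overlapping stage-to-stage communication with compute so the fill and drain bubbles stay small.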