issues
AutonomicPerfectionist/PipeInfer
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
MIT License · 10 stars · 3 forks
#1 Do you support multiple GPUs to run pipeline speculative decoding?
opened by chenwenyan 2 months ago · 1 comment