outlines-dev / outlines

Structured Text Generation
https://outlines-dev.github.io/outlines/
Apache License 2.0

Handle Medusa speculative decoding with Outlines as well #845

Open jqueguiner opened 2 months ago

jqueguiner commented 2 months ago

What behavior of the library made you think about the improvement?

As of now, Medusa generates hallucinations because its speculative multi-head decoding does not respect the Outlines decoding grammar.

How would you like it to behave?

Support speculative decoding alongside structured generation, for performance reasons.
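
To illustrate the grammar side of this request, here is a minimal sketch of how speculated tokens could be checked against a grammar before being accepted. It is not Outlines, TGI, or Medusa code: the `Fsm` type, `TOY_FSM`, and `accept_constrained_prefix` are hypothetical names invented for this example, and the grammar is a hand-written toy state machine.

```python
from typing import Dict, List, Tuple

# Hypothetical toy grammar: state -> {allowed token: next state}.
# A real grammar index would be compiled from a regex or JSON schema.
Fsm = Dict[int, Dict[str, int]]


def accept_constrained_prefix(
    draft: List[str], state: int, fsm: Fsm
) -> Tuple[List[str], int]:
    """Keep draft tokens until the first one the grammar disallows."""
    accepted: List[str] = []
    for token in draft:
        transitions = fsm.get(state, {})
        if token not in transitions:
            break  # everything from here on is discarded
        accepted.append(token)
        state = transitions[token]
    return accepted, state


# Toy grammar accepting: { "ok" : true }  or  { "ok" : false }
TOY_FSM: Fsm = {
    0: {"{": 1},
    1: {'"ok"': 2},
    2: {":": 3},
    3: {"true": 4, "false": 4},
    4: {"}": 5},
}

# The speculative heads proposed an invalid value ("maybe").
draft = ["{", '"ok"', ":", "maybe", "}"]
accepted, state = accept_constrained_prefix(draft, 0, TOY_FSM)
print(accepted, state)  # ['{', '"ok"', ':'] 3
```

This only covers the grammar check. In actual speculative decoding the target model would still have to verify the surviving tokens, and generation would resume from the returned state.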

Note: for now only TGI supports Medusa; vLLM does not yet, but support is planned.

furlat commented 1 month ago

Do you know if n-gram speculation is working? I think that would be even more impactful and simpler to handle, since a lot of structured tasks are rewrites.
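
For readers unfamiliar with the idea, here is a rough sketch of n-gram speculation under simple assumptions: the draft is produced by finding the most recent earlier occurrence of the last `n` tokens in the context and proposing whatever followed it. The function name `ngram_draft` and its parameters are hypothetical and do not correspond to any TGI, vLLM, or Outlines API.

```python
from typing import List


def ngram_draft(tokens: List[str], n: int = 2, max_draft: int = 4) -> List[str]:
    """Propose draft tokens by matching the last n tokens earlier in the context."""
    if len(tokens) <= n:
        return []
    suffix = tokens[-n:]
    # Scan backwards so the most recent earlier match wins; the range
    # excludes the position of the suffix itself.
    for start in range(len(tokens) - n - 1, -1, -1):
        if tokens[start:start + n] == suffix:
            return tokens[start + n:start + n + max_draft]
    return []


context = ["the", "cat", "sat", "on", "the", "mat", ".", "the", "cat"]
print(ngram_draft(context))  # ['sat', 'on', 'the', 'mat']
```

Because rewrite-style tasks copy long spans from the input, drafts like this tend to match often. The proposed tokens would still need to pass both the target model's verification and the grammar constraint before being accepted.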