dottxt-ai / outlines

Structured Text Generation
https://dottxt-ai.github.io/outlines/
Apache License 2.0

Are DirectMerge and CartesianMerge implemented in outlines? #1084

Open youkaichao opened 3 months ago

youkaichao commented 3 months ago

They are mentioned in this blog post (https://vivien000.github.io/blog/journal/llm-decoding-with-regex-constraints.html), and they look very helpful.

Dan-wanna-M commented 2 months ago

@youkaichao I think this might not be very relevant to popular sampling strategies? Suppose the logits are x. After filtering, the new probability of a remaining token i is exp(x_i) / (sum_j exp(x_j) - sum_k exp(x_k)), where j ranges over all tokens in the vocabulary and k ranges over the filtered tokens. While the new probability of improper tokens will increase, the new probability of proper tokens will increase as well. This means that with top-p sampling, improper tokens will still be filtered out eventually, so this does not affect the final result.
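
To make the renormalization in the formula above concrete, here is a minimal Python sketch. It is not code from outlines or from the blog post; the function names `softmax` and `masked_softmax` and the example logits are made up for illustration. It only checks that removing a set of tokens rescales every remaining token's probability by the same factor, 1 / (1 - filtered mass).

```python
import math

def softmax(logits):
    # Standard softmax; subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def masked_softmax(logits, filtered):
    # Hypothetical helper: softmax over the tokens NOT in `filtered`,
    # i.e. exp(x_i) / (sum_j exp(x_j) - sum_k exp(x_k)).
    m = max(x for i, x in enumerate(logits) if i not in filtered)
    exps = [0.0 if i in filtered else math.exp(x - m)
            for i, x in enumerate(logits)]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values only.
logits = [2.0, 1.0, 0.5, -1.0]
filtered = {1}  # index of the token removed by the constraint

p = softmax(logits)
q = masked_softmax(logits, filtered)

# Every remaining token is scaled by the same factor.
ratios = [q[i] / p[i] for i in range(len(logits)) if i not in filtered]
expected = 1.0 / (1.0 - sum(p[k] for k in filtered))
print(ratios, expected)
```

Because the scaling factor is shared, the relative order of the remaining tokens is unchanged. Whether that makes the merge strategies irrelevant under top-p sampling is the commenter's argument and is not something this sketch tests.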