This PR moves away from outlines and back to guidance for the underlying constrained decoding. This improves the runtime of BlendSQL queries and lets us execute ingredients with stronger guarantees on the output format, since we can interleave fixed text with generations.
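To show what interleaving text + generations buys us, here is a toy, library-free sketch. It does not use the guidance API or any BlendSQL code: `Select`, `run_template`, and `toy_score` are hypothetical names made up for this example, and `toy_score` only stands in for a real model's scores. Fixed text is copied verbatim and the "model" is asked only to pick among allowed options, so the output format holds by construction.

```python
import json
from typing import Callable, Dict, List, Sequence, Tuple, Union


class Select:
    """Hypothetical slot: the model may only emit one of `options`."""

    def __init__(self, name: str, options: Sequence[str]) -> None:
        self.name = name
        self.options = list(options)


# (text so far, candidate continuation) -> score; higher is better.
Scorer = Callable[[str, str], float]


def run_template(
    parts: List[Union[str, Select]], score: Scorer
) -> Tuple[str, Dict[str, str]]:
    """Walk the template, copying fixed text and filling each slot."""
    text = ""
    captured: Dict[str, str] = {}
    for part in parts:
        if isinstance(part, str):
            # Fixed text is appended as-is; the model never generates it.
            text += part
        else:
            # Constrained step: choose the best-scoring allowed option.
            best = max(part.options, key=lambda option: score(text, option))
            captured[part.name] = best
            text += best
    return text, captured


def toy_score(prompt: str, candidate: str) -> float:
    # Stand-in for a language model score (assumption for this sketch):
    # it simply prefers shorter candidates.
    return -float(len(candidate))


template: List[Union[str, Select]] = [
    '{"is_capital": ',
    Select("is_capital", ["true", "false"]),
    ', "country": "',
    Select("country", ["France", "Germany"]),
    '"}',
]

output, values = run_template(template, toy_score)
parsed = json.loads(output)  # always parses: the JSON skeleton is fixed text
```

Because the braces, keys, and quotes come from the template rather than from the model, a weak model (such as a 135M-parameter one) can only affect the slot values, not the validity of the result.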
Additional changes:
Moved packaging to pyproject.toml (away from the setuptools setup.py)
Added AnthropicLLM, removed LlamaCpp
From now on, I use the HuggingFaceTB/SmolLM-135M model for benchmarks.