berniwal opened 1 month ago
🚀 The feature, motivation and pitch
Problem
I am currently working with structured outputs and have been experimenting with vLLM + Outlines. Since our JSON schemas can get quite complex, generating the FSM can take around two minutes per schema. It would be great to have a feature where you can provide a schema store that persists the compiled FSMs to a local file over time and reloads them when you restart your deployment. Ideally this would be exposed as a flag in the vllm serve arguments:
https://docs.vllm.ai/en/latest/models/engine_args.html
Current Implementation
I assume that this is currently not supported and that avoiding schema recomputation is handled only in memory, via the @cache() decorator here:
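For illustration, here is a minimal sketch of the kind of persistent store I mean; this is not vLLM's or Outlines' actual code, and the compile function, cache directory, and the assumption that compiled FSM objects are picklable are all hypothetical:

```python
import hashlib
import pickle
from pathlib import Path

# Hypothetical: in the real feature this path would come from a vllm serve flag.
CACHE_DIR = Path("/var/cache/fsm-store")

def disk_cached(compile_fn):
    """Wrap an expensive schema -> FSM compile function with a file-backed cache."""
    def wrapper(schema: str):
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        key = hashlib.sha256(schema.encode("utf-8")).hexdigest()
        path = CACHE_DIR / f"{key}.pkl"
        if path.exists():
            # Hit: reload an FSM compiled by an earlier deployment.
            with path.open("rb") as f:
                return pickle.load(f)
        fsm = compile_fn(schema)  # miss: the expensive ~2 minute step
        with path.open("wb") as f:
            pickle.dump(fsm, f)
        return fsm
    return wrapper
```

Whether the compiled FSM objects are actually safely picklable is an open question and part of what would need investigating.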
Alternatives
An alternative would be to write custom Python code to handle this for my use case and call the vLLM Python functions for generation instead of the vllm serve command. However, I am not sure how that could be handled with an API deployment.
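For context, the Python-API route I mean would look roughly like this, assuming an Outlines release that exposes models.vllm and generate.json (the Outlines API has changed between versions, so the exact names are an assumption):

```python
import outlines

SCHEMA = '{"type": "object", "properties": {"name": {"type": "string"}}}'

# Load the model through Outlines' vLLM backend (offline, in-process).
model = outlines.models.vllm("mistralai/Mistral-7B-Instruct-v0.2")

# The expensive FSM compilation happens once here; the generator can then
# be reused for many requests within this process...
generator = outlines.generate.json(model, SCHEMA)
result = generator("Extract the name from: my name is Ada.")

# ...but the compiled FSM is lost when the process exits, which is why this
# only sidesteps the problem rather than solving it for vllm serve.
```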
Additional context
PS: Happy to contribute to this feature if it would be useful to other people and also makes sense to those who know the code base better.
Comment:
Yes, contributions are welcome. However, I believe Outlines already has a schema cache nowadays; it might be a better idea to first investigate why that didn't work, or how to get that schema cache working with a configurable path.
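If the Outlines cache mentioned above is the diskcache-based one, its location can, if I recall correctly, be redirected with the OUTLINES_CACHE_DIR environment variable; treat this as an assumption to verify, but it would let the cache live on a volume that survives restarts:

```python
import os

# Assumption: Outlines resolves OUTLINES_CACHE_DIR at import time, so it must
# be set before outlines (or vllm, which imports it) is first imported.
os.environ["OUTLINES_CACHE_DIR"] = "/mnt/persistent/outlines-cache"

import outlines  # deliberate late import, after the env var is set
```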