Describe the issue as clearly as possible:
Using the dockerized example from https://dottxt-ai.github.io/outlines/latest/reference/serve/vllm/:
Started container with:
docker run --gpus all -p 8000:8000 outlinesdev/outlines --model="microsoft/Phi-3-mini-4k-instruct"
Running the example prompt of:
curl http://127.0.0.1:8000/generate \
-d '{
"prompt": "What is the capital of France?",
"schema": {"type": "string", "maxLength": 5}
}'
The response that comes back is:
{"text":["What is the capital of France?\", \""]}
I also tried updating the schema to be a full object:
curl http://127.0.0.1:8000/generate \
-d '{
"prompt": "What is the capital of France?",
"schema": {"type": "object", "properties": {"capital": {"type": "string", "description": "The capital that was requested"}}}
}'
This returns the expected answer in the expected JSON structure, except that the prompt is echoed back and the JSON is stringified inside an object holding an array of text elements. This is the output data:
{"text":["What is the capital of France?{ \"capital\": \"Paris\" }"]}
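As a client-side workaround (not a fix for the server behavior), the echoed prompt can be stripped from the returned text element before parsing. A minimal sketch, where `extract_answer` is a hypothetical helper and the completion is assumed to start with the exact prompt:

```python
import json

def extract_answer(response: dict, prompt: str) -> str:
    # The server wraps completions in {"text": [...]} and echoes the
    # prompt at the start of each entry; strip it before parsing.
    completion = response["text"][0]
    if completion.startswith(prompt):
        completion = completion[len(prompt):]
    return completion

# Response from the object-schema example above:
resp = {"text": ["What is the capital of France?{ \"capital\": \"Paris\" }"]}
raw = extract_answer(resp, "What is the capital of France?")
print(json.loads(raw))  # {'capital': 'Paris'}
```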
Steps/code to reproduce the bug:
Started container with:
docker run --gpus all -p 8000:8000 outlinesdev/outlines --model="microsoft/Phi-3-mini-4k-instruct"
Running the example prompt of:
curl http://127.0.0.1:8000/generate \
-d '{
"prompt": "What is the capital of France?",
"schema": {"type": "string", "maxLength": 5}
}'
Result:
{"text":["What is the capital of France?\", \""]}
Alternatively, reproduce it with a JSON object:
curl http://127.0.0.1:8000/generate \
-d '{
"prompt": "What is the capital of France?",
"schema": {"type": "object", "properties": {"capital": {"type": "string", "description": "The capital that was requested"}}}
}'
This returns:
{"text":["What is the capital of France?{ \"capital\": \"Paris\" }"]}
Expected result:
Expected to see `Paris`, or in the second example:
{"capital": "Paris"}
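The gap between expected and observed output can be stated as a small check (values copied from the responses above; the slicing workaround assumes the prompt is echoed back verbatim):

```python
import json

prompt = "What is the capital of France?"
# What actually comes back: the prompt echoed, then stringified JSON,
# all inside a {"text": [...]} envelope.
observed = {"text": [prompt + "{ \"capital\": \"Paris\" }"]}
# What the schema suggests the caller should receive.
expected = {"capital": "Paris"}

completion = observed["text"][0][len(prompt):]
assert json.loads(completion) == expected
```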
Error message:
No response
Outlines/Python version information:
Version information
```
Docker image outlinesdev/outlines:latest sha256:523381fa1d10b1a9b5124b241952140712af2105fcf841473ff7e1a3a125bad7
Maps to version 0.1.3
ENV PYTHON_VERSION=3.10.15
Startup log: Initializing an LLM engine (v0.5.1)
```
Tried this both on Windows 11 with WSL 2 and Docker Desktop, and on a clean Debian 12 VM.
Context for the issue:
No response