**Closed** — pmbaumgartner closed this 7 months ago
Just adding one more piece of context here: I notice the characters `won\'t` are included in the prompt through the template. The phrase that gets injected is: "While you can limit the sharing with other banks/insurance companies/service providers so that you won\'t get offers from them based on the data shared by the bank, you cannot limit the credit reports themselves." I'm guessing something at the LLM level is replicating this sequence of characters, and then something is failing at the JSON generation.
In case it's additional help, here's the output of `json.dumps` on this phrase:
```python
In [27]: print(json.dumps(r"While you can limit the sharing with other banks/insurance companies/service providers so that you won\'t get offers from them based on the data shared by the bank, you cannot limit the credit reports themselves."))
"While you can limit the sharing with other banks/insurance companies/service providers so that you won\\'t get offers from them based on the data shared by the bank, you cannot limit the credit reports themselves."
```
I'm also seeing the same issue (mostly with Mistral models). I think this is a hard one to solve, but ideally JSON decoding should prevent incorrect use of escape characters.
I also see this the most with Mistral models. The others I've evaluated against are `Hermes-2-Pro-Mistral-7B`, `alphamonarch-7b`, `openhermes-2.5-neural-chat-v3-3-slerp`, and `mistral-7b-instruct-v0.2`, with the last one having this issue most frequently.
Have you tried different white space patterns?
I haven't with this specific problem, but I will give it a shot. Though I have to say it's not clear to me how it would help with this specific issue, since that would modify the whitespace but not prevent it from generating JSON with invalid escape characters - unless I'm missing something.
Here is a smaller reproducible example.

```python
import outlines
from pydantic import BaseModel


class Input(BaseModel):
    value: str


kwargs = {"n_ctx": 0, "max_tokens": 0, "n_gpu_layers": -1, "verbose": False}
model = outlines.models.llamacpp(
    "models/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    **kwargs,
)
generator = outlines.generate.json(model, Input)

prompt = r"""You are a helpful assistant. Your task is to return a given input word in JSON format.
Return the following value in JSON:
{"value": "won\\'t"}
"""

for _ in range(20):
    result = generator(prompt)
```
It should fail with the following exception:

```
ValueError: Error formatting sequences: 1 validation error for Input
__root__
  Invalid \escape: line 2 column 16 (char 17) [type=value_error.jsondecode, input_value='{\n "value": "won\\\'t"\n}', input_type=str]
```
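As a stopgap until generation itself is fixed, one workaround is to post-process the model's output and drop backslashes that do not begin a valid JSON escape before parsing. This is a hypothetical helper sketch, not part of outlines, and it does not cover every edge case:

```python
import json
import re


def strip_invalid_escapes(text: str) -> str:
    """Remove the backslash from escape sequences JSON does not define
    (e.g. \' becomes '), leaving valid escapes like \n and \" intact.
    Crude workaround sketch, not a complete fix."""
    return re.sub(r'\\([^"\\/bfnrtu])', r'\1', text)


# The kind of output the model emits in this issue:
broken = '{\n    "value": "won\\\'t"\n}'
repaired = strip_invalid_escapes(broken)
print(json.loads(repaired))  # parses once the stray backslash is removed
```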
This is a more general problem with the regexes we use I think.
Is there any update on this yet? I've also encountered this problem with structured generation using pydantic. All of the models we use ("mistral-7b-instruct-v0.2", quantized "mistralai/Mixtral-8x7B-v0.1", and quantized llama-2-7Bf, llama-2-13Bf, llama-2-70Bf) are affected. So far, I have not been able to track down any inputs that definitely lead to a faulty output.
Also for me, same error.
Same error for me. Any non-trivial generation is likely to fail with Mistral 7B Instruct v0.2.
Here's an example that failed for me after 5 retries.
I'm using outlines via the vLLM OpenAI server.
The regex that is used to describe valid characters allows the generation of an odd number of escape characters. This should be fixed by #829.
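The difference can be illustrated with two toy patterns for a JSON string literal (hypothetical sketches, not outlines' actual regexes): a permissive one that accepts any character after a backslash, and a strict one that only accepts the escapes JSON defines.

```python
import re

# Permissive: any character may follow a backslash, so \' slips through.
permissive = re.compile(r'"(?:[^"\\]|\\.)*"')

# Strict: a backslash must start one of JSON's defined escape sequences.
strict = re.compile(r'"(?:[^"\\]|\\["\\/bfnrt]|\\u[0-9a-fA-F]{4})*"')

sample = r'"won\'t"'  # string literal containing the invalid escape \'
print(bool(permissive.fullmatch(sample)))  # matches, although the JSON is invalid
print(bool(strict.fullmatch(sample)))      # rejected by the strict pattern
```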
Thanks a lot @rlouf! Do you know when you'll release this? Want to open a PR in vLLM to update the deps.
Did you try the code in `main`?
No, I haven't; we use outlines via the vLLM OpenAI server. I can set up a repro script with outlines directly, but that might have to wait until next week. I'll report back when I have done that.
Describe the issue as clearly as possible:
Occasionally when I use outlines, it will return a string containing invalid JSON. This happens most often when the model generates an invalid escape character.
This is fairly hard to replicate because the frequency of the issue depends on the model and the prompt. The example code below generated this error on the 3rd iteration of the loop when I ran it, but now I'm trying to replicate it again and can't get it to happen.
I monkey-patched `models/llamacpp.py` to print out the offending string when there's a parse error. Here is an example of JSON that fails to parse:

Using the traceback and an online JSON parser, I think the issue occurs with the generation of the substring `\\\'` starting around character 343.

You can replicate this specific example (with the models in the code snippet below) by attempting to parse an object like this:
This results in the same validation error I get with the escape character.
And a valid version, just for reference:
My apologies for the long replication code below; obviously not all of it is required to trigger this specific issue, but I wanted to include everything I am doing in this instance that generates invalid JSON.
Steps/code to reproduce the bug:
Expected result:
Error message:
Outlines/Python version information:
```
python -c "import sys; print('Python', sys.version)" pip freeze 0.0.36 Python 3.10.10 (main, Jun 19 2023, 11:34:34) [Clang 14.0.0 (clang-1400.0.29.202)] accelerate==0.28.0 aiohttp==3.9.3 aiosignal==1.3.1 altair==5.2.0 annotated-types==0.6.0 anyio==4.3.0 appdirs==1.4.4 appnope==0.1.4 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 asttokens==2.4.1 async-lru==2.0.4 async-timeout==4.0.3 attrs==23.2.0 Babel==2.14.0 beautifulsoup4==4.12.3 bleach==6.1.0 boto3==1.34.63 botocore==1.34.63 bpemb==0.3.4 certifi==2024.2.2 cffi==1.16.0 charset-normalizer==3.3.2 click==8.1.7 cloudpickle==3.0.0 comm==0.2.2 conllu==4.5.3 contourpy==1.2.0 cycler==0.12.1 dataclasses-json==0.6.4 datasets==2.18.0 debugpy==1.8.1 decorator==5.1.1 defusedxml==0.7.1 Deprecated==1.2.14 dill==0.3.8 diskcache==5.6.3 distro==1.9.0 docstring-parser==0.15 exceptiongroup==1.2.0 executing==2.0.1 fastapi==0.110.0 fastjsonschema==2.19.1 filelock==3.13.1 flair==0.13.1 fonttools==4.49.0 fqdn==1.5.1 frozenlist==1.4.1 fsspec==2024.2.0 ftfy==6.1.3 gdown==5.1.0 gensim==4.3.2 h11==0.14.0 httpcore==1.0.4 httpx==0.27.0 huggingface-hub==0.21.4 idna==3.6 instructor==0.6.4 interegular==0.3.3 ipykernel==6.29.3 ipython==8.22.2 ipywidgets==8.1.2 isoduration==20.11.0 Janome==0.5.0 jedi==0.19.1 Jinja2==3.1.3 jmespath==1.0.1 joblib==1.3.2 json5==0.9.24 jsonpatch==1.33 jsonpointer==2.4 jsonschema==4.21.1 jsonschema-specifications==2023.12.1 jupyter==1.0.0 jupyter-console==6.6.3 jupyter-events==0.9.1 jupyter-lsp==2.2.4 jupyter_client==8.6.1 jupyter_core==5.7.2 jupyter_server==2.13.0 jupyter_server_terminals==0.5.3 jupyterlab==4.1.5 jupyterlab_pygments==0.3.0 jupyterlab_server==2.25.4 jupyterlab_widgets==3.0.10 kiwisolver==1.4.5 langchain==0.1.12 langchain-community==0.0.28 langchain-core==0.1.32 langchain-openai==0.0.8 langchain-text-splitters==0.0.1 langdetect==1.0.9 langsmith==0.1.27 lark==1.1.9 llama_cpp_python==0.2.56 llvmlite==0.42.0 lxml==5.1.0 markdown-it-py==3.0.0 MarkupSafe==2.1.5 marshmallow==3.21.1
matplotlib==3.8.3 matplotlib-inline==0.1.6 mdurl==0.1.2 mistune==3.0.2 more-itertools==10.2.0 mpld3==0.5.10 mpmath==1.3.0 multidict==6.0.5 multiprocess==0.70.16 mypy-extensions==1.0.0 nbclient==0.10.0 nbconvert==7.16.2 nbformat==5.10.3 nest-asyncio==1.6.0 networkx==3.2.1 notebook==7.1.2 notebook_shim==0.2.4 numba==0.59.0 numpy==1.26.4 openai==1.13.3 orjson==3.9.15 outlines==0.0.36 overrides==7.7.0 packaging==23.2 pandas==2.2.1 pandocfilters==1.5.1 parso==0.8.3 pexpect==4.9.0 pillow==10.2.0 platformdirs==4.2.0 pptree==3.1 prometheus_client==0.20.0 prompt-toolkit==3.0.43 protobuf==5.26.0 psutil==5.9.8 ptyprocess==0.7.0 pure-eval==0.2.2 pyarrow==15.0.1 pyarrow-hotfix==0.6 pycparser==2.21 pydantic==2.6.3 pydantic-settings==2.2.1 pydantic_core==2.16.3 Pygments==2.17.2 pyparsing==3.1.2 pysbd==0.3.4 PySocks==1.7.1 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-json-logger==2.0.7 pytorch_revgrad==0.2.0 pytz==2024.1 PyYAML==6.0.1 pyzmq==25.1.2 qtconsole==5.5.1 QtPy==2.4.1 ragas==0.1.4 referencing==0.33.0 regex==2023.12.25 requests==2.31.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==13.7.1 rpds-py==0.18.0 ruff==0.3.2 s3transfer==0.10.1 safetensors==0.4.2 scikit-learn==1.4.1.post1 scipy==1.12.0 segtok==1.5.11 semver==3.0.2 Send2Trash==1.8.2 sentence-transformers==2.5.1 sentencepiece==0.1.99 seqeval==1.2.2 six==1.16.0 smart-open==7.0.1 sniffio==1.3.1 soupsieve==2.5 SQLAlchemy==2.0.28 sqlitedict==2.1.0 sse-starlette==2.0.0 stack-data==0.6.3 starlette==0.36.3 starlette-context==0.3.6 sympy==1.12 tabulate==0.9.0 tenacity==8.2.3 terminado==0.18.1 threadpoolctl==3.3.0 tiktoken==0.6.0 tinycss2==1.2.1 tokenizers==0.15.2 tomli==2.0.1 toolz==0.12.1 torch==2.2.1 tornado==6.4 tqdm==4.66.2 traitlets==5.14.1 transformer-smaller-training-vocab==0.3.3 transformers==4.38.2 typer==0.9.0 types-python-dateutil==2.9.0.20240316 typing-inspect==0.9.0 typing_extensions==4.10.0 tzdata==2024.1 uri-template==1.3.0 urllib3==1.26.18 uvicorn==0.28.0 wcwidth==0.2.13 webcolors==1.13
webencodings==0.5.1 websocket-client==1.7.0 widgetsnbextension==4.0.10 Wikipedia-API==0.6.0 wrapt==1.16.0 xxhash==3.4.1 yarl==1.9.4
```
Context for the issue:
No response