Closed lemig closed 1 month ago
I believe this is #1173. vLLM has a relatively low max_tokens
default, so generation can stop before the JSON object is complete,
and the truncated output then fails to decode.
You should be able to run:
from pydantic import BaseModel
from outlines import models, generate

class User(BaseModel):
    name: str
    last_name: str
    id: int

model = models.vllm(
    "microsoft/Phi-3-mini-4k-instruct",
    tensor_parallel_size=4
)
generator = generate.json(model, User)
print("generator OK")
result = generator(
    "Create a user profile with the fields name, last_name and id",
    max_tokens=30000  # this determines your maximum tokens
)
print(result)
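To see why early termination surfaces as a JSONDecodeError, here is a minimal sketch that uses only the standard library. It simulates a cut-off generation by slicing a string by characters (a real model stops on a token boundary, so this is only an approximation), and the sample values and the `try_parse` helper are made up for illustration.

```python
import json

# A complete object matching the User schema (sample values are invented).
complete = '{"name": "John", "last_name": "Doe", "id": 42}'


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Simulate generation stopping early, as a low max_tokens limit would cause.
truncated = complete[:16]

print(try_parse(complete))   # parsed dict
print(try_parse(truncated))  # None: the object was never closed
```

The structured generator still produces a valid prefix of the JSON in this situation. The parse fails only because the closing part of the object was never generated.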
Thanks, that works!
Setting the max_tokens parameter solves the issue.
FYI, when using the vLLM server, you can add {"max_tokens": 1024} to the body of the curl request or the Python requests call.
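As a rough sketch of the server-side variant, the snippet below builds a request with max_tokens in the JSON body. It assumes the vLLM OpenAI-compatible server and its /v1/completions endpoint running at http://localhost:8000. The thread does not say which server or endpoint was used, so the URL, the model name in the payload, and the `build_completion_request` helper are all assumptions for illustration.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    max_tokens=1024,
    model="microsoft/Phi-3-mini-4k-instruct",
    base_url="http://localhost:8000",  # assumed address of a local vLLM server
):
    """Build a POST request whose JSON body sets max_tokens explicitly."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request(
    "Create a user profile with the fields name, last_name and id"
)
print(request.full_url)
print(request.data.decode("utf-8"))

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The same payload can be passed to curl with -d or to requests.post(..., json=payload). The important part is that max_tokens is set in the body rather than left at the server default.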
Describe the issue as clearly as possible:
I have tried your Pydantic example from: https://dottxt-ai.github.io/outlines/latest/reference/generation/json/
It works OK, as is, with:
model = models.transformers("microsoft/Phi-3-mini-4k-instruct")
Also OK:
model = models.transformers("microsoft/Phi-3-mini-4k-instruct", device="cuda")
But JSONDecodeError with:
model = models.vllm("microsoft/Phi-3-mini-4k-instruct", tensor_parallel_size=4)
Steps/code to reproduce the bug:
Expected result:
Error message:
Outlines/Python version information:
python -c "from outlines import _version; print(_version.version)"
python -c "import sys; print('Python', sys.version)"
pip freeze

0.0.46
Python 3.11.10 (main, Oct 3 2024, 07:29:13) [GCC 11.2.0]
accelerate==1.0.1
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
airportsdata==20241001
annotated-types==0.7.0
anyio @ file:///home/conda/feedstock_root/build_artifacts/anyio_1728935693959/work
argon2-cffi @ file:///home/conda/feedstock_root/build_artifacts/argon2-cffi_1692818318753/work
argon2-cffi-bindings @ file:///home/conda/feedstock_root/build_artifacts/argon2-cffi-bindings_1725356582126/work
arrow @ file:///home/conda/feedstock_root/build_artifacts/arrow_1696128962909/work
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1698341106958/work
async-lru @ file:///home/conda/feedstock_root/build_artifacts/async-lru_1690563019058/work
attrs @ file:///home/conda/feedstock_root/build_artifacts/attrs_1722977137225/work
Babel @ file:///home/conda/feedstock_root/build_artifacts/babel_1702422572539/work
beautifulsoup4 @ file:///home/conda/feedstock_root/build_artifacts/beautifulsoup4_1705564648255/work
bleach @ file:///home/conda/feedstock_root/build_artifacts/bleach_1696630167146/work
Brotli @ file:///home/conda/feedstock_root/build_artifacts/brotli-split_1725267488082/work
cached-property @ file:///home/conda/feedstock_root/build_artifacts/cached_property_1615209429212/work
certifi @ file:///home/conda/feedstock_root/build_artifacts/certifi_1725278078093/work/certifi
cffi @ file:///home/conda/feedstock_root/build_artifacts/cffi_1725560564262/work
charset-normalizer @ file:///home/conda/feedstock_root/build_artifacts/charset-normalizer_1728479282467/work
click==8.1.7
cloudpickle==3.1.0
comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1710320294760/work
datasets==3.0.1
debugpy @ file:///home/conda/feedstock_root/build_artifacts/debugpy_1728594126643/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work
defusedxml @ file:///home/conda/feedstock_root/build_artifacts/defusedxml_1615232257335/work
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
dnspython==2.7.0
einops==0.8.0
email_validator==2.2.0
entrypoints @ file:///home/conda/feedstock_root/build_artifacts/entrypoints_1643888246732/work
exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1720869315914/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1725214404607/work
fastapi==0.115.2
fastjsonschema @ file:///home/conda/feedstock_root/build_artifacts/python-fastjsonschema_1718477020893/work/dist
filelock==3.16.1
fqdn @ file:///home/conda/feedstock_root/build_artifacts/fqdn_1638810296540/work/dist
frozenlist==1.4.1
fsspec==2024.6.1
gguf==0.10.0
guidance==0.1.16
h11 @ file:///home/conda/feedstock_root/build_artifacts/h11_1664132893548/work
h2 @ file:///home/conda/feedstock_root/build_artifacts/h2_1634280454336/work
hpack==4.0.0
httpcore @ file:///home/conda/feedstock_root/build_artifacts/httpcore_1727820890233/work
httptools==0.6.2
httpx @ file:///home/conda/feedstock_root/build_artifacts/httpx_1724778349782/work
huggingface-hub==0.25.2
hyperframe @ file:///home/conda/feedstock_root/build_artifacts/hyperframe_1619110129307/work
idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1726459485162/work
importlib_metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1726082825846/work
importlib_resources @ file:///home/conda/feedstock_root/build_artifacts/importlib_resources_1725921340658/work
interegular==0.3.3
ipykernel @ file:///croot/ipykernel_1728665589812/work
ipython @ file:///home/conda/feedstock_root/build_artifacts/ipython_1727944696411/work
ipython_genutils @ file:///home/conda/feedstock_root/build_artifacts/ipython_genutils_1716278396992/work
ipywidgets @ file:///home/conda/feedstock_root/build_artifacts/ipywidgets_1724334859652/work
isoduration @ file:///home/conda/feedstock_root/build_artifacts/isoduration_1638811571363/work/dist
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1696326070614/work
Jinja2 @ file:///home/conda/feedstock_root/build_artifacts/jinja2_1715127149914/work
jiter==0.6.1
json5 @ file:///home/conda/feedstock_root/build_artifacts/json5_1712986206667/work
jsonpointer @ file:///home/conda/feedstock_root/build_artifacts/jsonpointer_1725302941992/work
jsonschema @ file:///home/conda/feedstock_root/build_artifacts/jsonschema_1720529478715/work
jsonschema-specifications @ file:///tmp/tmpvslgxhz5/src
jupyter==1.1.1
jupyter-console==6.6.3
jupyter-contrib-core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_contrib_core_1657548529421/work
jupyter-contrib-nbextensions @ file:///home/conda/feedstock_root/build_artifacts/jupyter_contrib_nbextensions_1670068802953/work
jupyter-events @ file:///home/conda/feedstock_root/build_artifacts/jupyter_events_1710805637316/work
jupyter-highlight-selected-word @ file:///home/conda/feedstock_root/build_artifacts/jupyter_highlight_selected_word_1695322379939/work
jupyter-latex-envs @ file:///home/conda/feedstock_root/build_artifacts/jupyter_latex_envs_1614852190293/work
jupyter-lsp @ file:///home/conda/feedstock_root/build_artifacts/jupyter-lsp-meta_1712707420468/work/jupyter-lsp
jupyter-nbextensions-configurator @ file:///home/conda/feedstock_root/build_artifacts/jupyter_nbextensions_configurator_1670793770953/work
jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1673615989977/work
jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1727163409502/work
jupyter_server @ file:///home/conda/feedstock_root/build_artifacts/jupyter_server_1720816649297/work
jupyter_server_terminals @ file:///home/conda/feedstock_root/build_artifacts/jupyter_server_terminals_1710262634903/work
jupyterlab @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_1724745148804/work
jupyterlab_pygments @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_pygments_1707149102966/work
jupyterlab_server @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_server-split_1721163288448/work
jupyterlab_widgets @ file:///home/conda/feedstock_root/build_artifacts/jupyterlab_widgets_1724331334887/work
lark==1.2.2
llvmlite==0.43.0
lm-format-enforcer==0.10.6
lxml @ file:///croot/lxml_1722882187815/work
MarkupSafe @ file:///home/conda/feedstock_root/build_artifacts/markupsafe_1728489060918/work
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1713250518406/work
mistral_common==1.4.4
mistune @ file:///home/conda/feedstock_root/build_artifacts/mistune_1698947099619/work
mpmath==1.3.0
msgpack==1.1.0
msgspec==0.18.6
multidict==6.1.0
multiprocess==0.70.16
nbclassic @ file:///home/conda/feedstock_root/build_artifacts/nbclassic_1716838762700/work
nbclient @ file:///home/conda/feedstock_root/build_artifacts/nbclient_1710317608672/work
nbconvert @ file:///home/conda/feedstock_root/build_artifacts/nbconvert-meta_1718135430380/work
nbformat @ file:///home/conda/feedstock_root/build_artifacts/nbformat_1712238998817/work
nest_asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1705850609492/work
networkx==3.4.1
notebook @ file:///home/conda/feedstock_root/build_artifacts/notebook_1715848908871/work
notebook_shim @ file:///home/conda/feedstock_root/build_artifacts/notebook-shim_1707957777232/work
numba==0.60.0
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-ml-py==12.560.30
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.6.77
nvidia-nvtx-cu12==12.1.105
openai==1.51.2
opencv-python-headless==4.10.0.84
ordered-set==4.1.0
outlines==0.0.46
outlines_core==0.1.14
overrides @ file:///home/conda/feedstock_root/build_artifacts/overrides_1706394519472/work
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1718189413536/work
pandas==2.2.3
pandocfilters @ file:///home/conda/feedstock_root/build_artifacts/pandocfilters_1631603243851/work
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1712320355065/work
partial-json-parser==0.2.1.1.post4
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1706113125309/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
pillow==10.4.0
pkgutil_resolve_name @ file:///home/conda/feedstock_root/build_artifacts/pkgutil-resolve-name_1694617248815/work
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1726613481435/work
prometheus-fastapi-instrumentator==7.0.0
prometheus_client @ file:///home/conda/feedstock_root/build_artifacts/prometheus_client_1726901976720/work
prompt_toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1727341649933/work
propcache==0.2.0
protobuf==5.28.2
psutil @ file:///home/conda/feedstock_root/build_artifacts/psutil_1728965152023/work
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure_eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1721585709575/work
py-cpuinfo==9.0.0
pyairports==2.1.1
pyarrow==17.0.0
pycountry==24.6.1
pycparser @ file:///home/conda/feedstock_root/build_artifacts/pycparser_1711811537435/work
pydantic==2.9.2
pydantic_core==2.23.4
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1714846767233/work
PySocks @ file:///home/conda/feedstock_root/build_artifacts/pysocks_1661604839144/work
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger @ file:///home/conda/feedstock_root/build_artifacts/python-json-logger_1677079630776/work
pytz @ file:///home/conda/feedstock_root/build_artifacts/pytz_1726055524169/work
PyYAML @ file:///home/conda/feedstock_root/build_artifacts/pyyaml_1725456139051/work
pyzmq @ file:///home/conda/feedstock_root/build_artifacts/pyzmq_1728642222605/work
ray==2.37.0
referencing @ file:///home/conda/feedstock_root/build_artifacts/referencing_1714619483868/work
regex==2024.9.11
requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1717057054362/work
rfc3339-validator @ file:///home/conda/feedstock_root/build_artifacts/rfc3339-validator_1638811747357/work
rfc3986-validator @ file:///home/conda/feedstock_root/build_artifacts/rfc3986-validator_1598024191506/work
rpds-py @ file:///home/conda/feedstock_root/build_artifacts/rpds-py_1725327039958/work
safetensors==0.4.5
Send2Trash @ file:///home/conda/feedstock_root/build_artifacts/send2trash_1712584999685/work
sentencepiece==0.2.0
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
sniffio @ file:///home/conda/feedstock_root/build_artifacts/sniffio_1708952932303/work
soupsieve @ file:///home/conda/feedstock_root/build_artifacts/soupsieve_1693929250441/work
stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work
starlette==0.40.0
sympy==1.13.3
terminado @ file:///home/conda/feedstock_root/build_artifacts/terminado_1710262609923/work
tiktoken==0.7.0
tinycss2 @ file:///home/conda/feedstock_root/build_artifacts/tinycss2_1713974937325/work
tokenizers==0.20.1
tomli @ file:///home/conda/feedstock_root/build_artifacts/tomli_1727974628237/work
torch==2.4.0
torchvision==0.19.0
tornado @ file:///home/conda/feedstock_root/build_artifacts/tornado_1724956126282/work
tqdm==4.66.5
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1713535121073/work
transformers==4.45.2
triton==3.0.0
types-python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/types-python-dateutil_1727940235703/work
typing-utils @ file:///home/conda/feedstock_root/build_artifacts/typing_utils_1622899189314/work
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1717802530399/work
tzdata==2024.2
uri-template @ file:///home/conda/feedstock_root/build_artifacts/uri-template_1688655812972/work/dist
urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1726496430923/work
uvicorn==0.32.0
uvloop==0.21.0
vllm==0.6.3
watchfiles==0.24.0
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1704731205417/work
webcolors @ file:///home/conda/feedstock_root/build_artifacts/webcolors_1723294704277/work
webencodings @ file:///home/conda/feedstock_root/build_artifacts/webencodings_1694681268211/work
websocket-client @ file:///home/conda/feedstock_root/build_artifacts/websocket-client_1713923384721/work
websockets==13.1
widgetsnbextension @ file:///home/conda/feedstock_root/build_artifacts/widgetsnbextension_1724331337528/work
xformers==0.0.27.post2
xxhash==3.5.0
yarl==1.15.3
zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1726248574750/work
zstandard==0.23.0
Context for the issue:
No response