dottxt-ai / outlines

Structured Text Generation
https://dottxt-ai.github.io/outlines/
Apache License 2.0
9.68k stars, 495 forks

Generate.json and generate.regex failing on 0.1.4 #1274

Closed (sigjhl closed this issue 3 days ago)

sigjhl commented 6 days ago

Describe the issue as clearly as possible:

On 0.1.4, generate.regex and generate.json fail to execute. This is a fresh conda env with only python==3.11, transformers, and outlines==0.1.4 installed.

Rolling back to 0.1.3 works as intended.

Steps/code to reproduce the bug:

Python 3.11.10 | packaged by conda-forge | (main, Oct 16 2024, 01:26:25) [Clang 17.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from outlines import models, generate
>>> model = models.mlxlm("meta-llama/Llama-3.2-1B-Instruct")
Traceback (most recent call last):
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/models/mlxlm.py", line 231, in mlxlm
    import mlx.core as mx
ModuleNotFoundError: No module named 'mlx'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/models/mlxlm.py", line 234, in mlxlm
    raise ImportError(
ImportError: The `mlx_lm` library needs to be installed in order to use `mlx_lm` models.
>>> model = models.transformers("meta-llama/Llama-3.2-1B-Instruct")
>>> class User(BaseModel):
...     name:str
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'BaseModel' is not defined
>>> from pydantic import BaseModel
>>> class User(BaseModel):
...     name:str
...
>>> generator = generate.json(model,User)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/generate/json.py", line 49, in json
    generator = regex(model, regex_str, sampler)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/generate/regex.py", line 34, in regex
    logits_processor = RegexLogitsProcessor(regex_str, tokenizer=model.tokenizer)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/processors/structured.py", line 152, in __init__
    guide = RegexGuide.from_regex(regex_string, tokenizer)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/fsm/guide.py", line 94, in from_regex
    return super().from_regex(
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines_core/fsm/guide.py", line 212, in from_regex
    ) = _create_states_mapping(
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/caching.py", line 123, in wrapper
    wrapper.__memory__.set(cache_key, result, expire, retry=True)
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/diskcache/core.py", line 772, in set
    size, mode, filename, db_value = self._disk.store(value, read, key=key)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/outlines/caching.py", line 29, in store
    value = cloudpickle.dumps(value)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1529, in dumps
    cp.dump(obj)
  File "/Users/sigjhl/miniforge3/envs/ol/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1295, in dump
    return super().dump(obj)
           ^^^^^^^^^^^^^^^^^
TypeError: cannot pickle 'builtins.Index' object
>>>

Expected result:

A generator object.

Error message:

No response

Outlines/Python version information:

```
0.1.4
```

Context for the issue:

No response
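The traceback above ends inside a caching wrapper that serializes its result before writing it to disk, and the serializer rejects a `builtins.Index` object. The following is a minimal sketch of that failure mode using only the standard library. It is not the outlines implementation: `FakeIndex`, `cached_store`, and `try_store` are hypothetical names, `pickle` stands in for `cloudpickle`, and a `threading.Lock` member stands in for whatever makes the real object unpicklable.

```python
import pickle
import threading


class FakeIndex:
    """Hypothetical stand-in for an object that cannot be pickled."""

    def __init__(self):
        # A lock is a convenient stdlib example of an unpicklable member.
        self._lock = threading.Lock()


def cached_store(cache: dict, key: str, value):
    """Serialize `value` before caching it, as a disk-backed cache must."""
    cache[key] = pickle.dumps(value)
    return value


def try_store(cache: dict, key: str, value) -> str:
    """Attempt the store and report whether serialization succeeded."""
    try:
        cached_store(cache, key, value)
        return "stored"
    except TypeError as exc:
        return f"TypeError: {exc}"


cache = {}
ok_result = try_store(cache, "plain", {"states": [0, 1, 2]})
bad_result = try_store(cache, "native", FakeIndex())
print(ok_result)
print(bad_result)
```

In this sketch the plain dictionary is stored, while the second call fails before anything is written to the cache. That mirrors the report, where the error is raised while the result is being cached rather than while it is being computed.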

cpfiffer commented 6 days ago

Same issue here, using:

# Best import of all time
from outlines import models, generate
import torch

# Load a language model into memory
model = models.transformers(
    "microsoft/Phi-3-mini-4k-instruct",
    device="cuda",
    model_kwargs={"torch_dtype": torch.bfloat16},
)

# Choices
choices = ["Smith", "Jones"]

# Generate a response
response = generate.choice(model, choices)
print(response)

Version info:

outlines==0.1.4
outlines_core==0.1.17
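Since the failure appears tied to a specific pair of package versions, it can help to confirm what is actually installed in the active environment before comparing reports. A small sketch using the standard library's `importlib.metadata` follows; the helper name `installed_versions` and the list of packages checked are illustrative choices, not part of outlines.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(["outlines", "outlines_core", "transformers"])
for name, ver in report.items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running this in both the failing environment and one rolled back to 0.1.3 shows which versions differ between the two.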