Describe the issue as clearly as possible:
mlx-community/Phi-3-mini-4k-instruct-4bit and mlx-community/Phi-3-mini-4k-instruct-8bit generate text without spaces when using outlines. Interestingly, mlx-community/Meta-Llama-3-8B-Instruct-8bit generates text with spaces.
When loading with mlx_lm, all models generate responses as expected.
The implication is that structured generation based on a Pydantic model fails, because the generated text (which has no spaces) fails Pydantic validation.
Is this an issue with how outlines handles phi3 responses, or is it an issue with the phi3 models from mlx-community?
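The validation failure can be reproduced without any model involved: the enum values contain spaces, so de-spaced output can never match them. A minimal sketch, using a hand-written str-based Enum as a stand-in for the `FruitEnum` in the repro script below:

```python
# Minimal sketch of the failure mode: the enum values contain spaces, so
# any de-spaced generation can never validate. FruitEnum here is a
# hand-written stand-in mirroring the values from the repro script.
from enum import Enum

from pydantic import BaseModel, ValidationError


class FruitEnum(str, Enum):
    BANANAS = "yellow ripe bananas"
    GRAPES = "purple juicy grapes"


class Fruit(BaseModel):
    fruits: list[FruitEnum]


# Output with spaces validates fine…
Fruit(fruits=["yellow ripe bananas"])

# …but the de-spaced variant (what the phi3 MLX models emit) is rejected.
try:
    Fruit(fruits=["yellowripebananas"])
except ValidationError as exc:
    print(type(exc).__name__)  # ValidationError
```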
Some basic testing I did:
Using phi3 MLX models directly with mlx_lm works fine:
```python
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Phi-3-mini-4k-instruct-8bit")
response = generate(model, tokenizer, prompt="What is a good apple?", verbose=False)
print(response)
```

```
'\n<|assistant|> A good apple can vary depending on personal preferences, but generally, a good apple is one that is:\n\n1. Firm: A good apple should have a firm texture, which indicates that it is fresh and not overripe.\n2. Colorful: A good apple should have a vibrant and consistent color, which can indicate ripeness and flavor.\n3. Taste: A good apple should have a balanced and pleasant taste, with a'
```
However, using the same model with outlines returns text without spaces:
```python
from outlines import generate, models

model = models.mlxlm("mlx-community/Phi-3-mini-4k-instruct-8bit")
generator = generate.text(model)
response = generator("What is a good apple?")
print(response)
```
Steps/code to reproduce the bug:

```python
import json
import textwrap
from enum import StrEnum
from typing import List

from outlines import generate, models
from pydantic import BaseModel, Field

model = models.mlxlm("mlx-community/Phi-3-mini-4k-instruct-8bit")

FRUITS = [
    "Red delicious apples",
    "Purple juicy grapes",
    "Orange sweet tangerines",
    "Yellow ripe bananas",
]
FruitEnum = StrEnum("FruitEnum", FRUITS)


class Fruit(BaseModel):
    """Return fruit description."""

    fruits: List[FruitEnum] = Field(
        description="A list of appetizing fruits",
        min_length=1,
    )


generator = generate.json(
    model,
    Fruit,
    # whitespace_pattern=r"[\n\t ]*",  # also tried: None, r"[ ]", r"[ ]*", r"[\n\t ]"
)  # only call once per schema, not per generation

# heal invalid json
invalid_json = """{fruits: ["bananas", "apples", "oranges"]}"""
result = generator(
    textwrap.dedent(
        f"""
        Fix this JSON by enforcing the following schema:
        {json.dumps(Fruit.model_json_schema())}
        ---
        '{invalid_json}'
        """
    ).strip()
)
print(json.dumps(json.loads(result.json()), indent=2))
```
Expected result:
```python
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Phi-3-mini-4k-instruct-8bit")
response = generate(model, tokenizer, prompt="What is a good apple?", verbose=False)
print(response)
```

```
'\n<|assistant|> A good apple can vary depending on personal preferences, but generally, a good apple is one that is:\n\n1. Firm: A good apple should have a firm texture, which indicates that it is fresh and not overripe.\n2. Colorful: A good apple should have a vibrant and consistent color, which can indicate ripeness and flavor.\n3. Taste: A good apple should have a balanced and pleasant taste, with a'
```
Error message:
```
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
File /Users/alex.graber/_code/is-pmiaivolution-datascienceanalytics/notebooks/infinity-client/structured_generation.py:4
      1 # %%
      2 # heal invalid json
      3 invalid_json = """{fruits: ["bananas", "apples", "oranges"]}"""
----> 4 result = generator(
      5     textwrap.dedent(
      6         f"""
      7         Fix this JSON by enforcing the following schema:
      8
      9
     10         {json.dumps(Fruit.model_json_schema())}
     11
     12
     13         ---
     14
     15         '{invalid_json}'
     16         """
     17     ).strip()
     18 )
     19 print(json.dumps(json.loads(result.json()), indent=2))

File ~/micromamba/envs/infinity/lib/python3.11/site-packages/outlines/generate/api.py:511, in SequenceGeneratorAdapter.__call__(self, prompts, max_tokens, stop_at, seed, **model_specific_params)
    499 generation_params = self.prepare_generation_parameters(
    500     max_tokens, stop_at, seed
    501 )
    503 completions = self.model.generate(
    504     prompts,
    505     generation_params,
        (...)
    508     **model_specific_params,
    509 )
--> 511 return format(completions)

File ~/micromamba/envs/infinity/lib/python3.11/site-packages/outlines/generate/api.py:497, in SequenceGeneratorAdapter.__call__.<locals>.format(sequences)
    495     return [format(sequence) for sequence in sequences]
...
Input should be 'red delicious apples', 'purple juicy grapes', 'orange sweet tangerines' or 'yellow ripe bananas' [type=enum, input_value='yellowripebananas', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/enum
fruits.2
  Input should be 'red delicious apples', 'purple juicy grapes', 'orange sweet tangerines' or 'yellow ripe bananas' [type=enum, input_value='purplejuicygrapes', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/enum
```
Outlines/Python version information:
Version information
```
(command output here)
```
Context for the issue:
I want to use the smallest model possible with structured generation to "heal" invalid JSON. Phi3-mini-4k generally fits the bill, except that I get validation errors due to the lack of spaces in the generated response when using the MLX model variant.
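For reference, the input really is invalid JSON (the key is unquoted), and a healed output only needs quoted keys plus enum-conformant values. A quick sketch, with a hypothetical healed string for illustration:

```python
import json

# The malformed input from the report: the unquoted key makes it invalid JSON.
invalid_json = """{fruits: ["bananas", "apples", "oranges"]}"""

try:
    json.loads(invalid_json)
    parsed = True
except json.JSONDecodeError:
    parsed = False
print(parsed)  # False

# A hypothetical healed output: quoted key, value drawn from the schema's enum.
healed = '{"fruits": ["yellow ripe bananas"]}'
print(json.loads(healed))  # {'fruits': ['yellow ripe bananas']}
```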