Closed aw632 closed 1 month ago
The FSM produced by interegular cannot produce any complete strings. This is likely caused by interegular's incomplete negative lookaround implementation:
>>> import interegular
>>> pattern = r"(\[TOOL_CALLS\] \{\}|(?!\[TOOL_CALLS\]).*)"
>>> fsm = interegular.parse_pattern(pattern).to_fsm()
>>> ["".join(s) for s in fsm.strings(100)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "/home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py", line 684, in strings
    raise ValueError(f"Couldn't find an example within {max_iterations} iterations")
ValueError: Couldn't find an example within 100 iterations
The pattern is valid with re, but not with interegular:
>>> import re
>>> re.match(pattern, "[TOOL_CALLS] {}")
<re.Match object; span=(0, 15), match='[TOOL_CALLS] {}'>
>>> fsm.accepts("[TOOL_CALLS] {}")
False
>>> re.match(pattern, '[TOOL_CALLS] {}, {"name": "toolname"}')
<re.Match object; span=(0, 15), match='[TOOL_CALLS] {}'>
>>> fsm.accepts('[TOOL_CALLS] {}, {"name": "toolname"}')
False
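One caveat when comparing the two: re.match only requires a match at the start of the string, while an FSM accepts or rejects the whole string. A minimal sketch using only the standard library (the helper name prefix_and_full is made up for illustration) shows which of the results above really disagree with whole-string matching:

```python
import re

pattern = r"(\[TOOL_CALLS\] \{\}|(?!\[TOOL_CALLS\]).*)"


def prefix_and_full(text):
    """Return (prefix_match, whole_string_match) for `text` (hypothetical helper)."""
    return (
        re.match(pattern, text) is not None,      # anchored at the start only
        re.fullmatch(pattern, text) is not None,  # anchored at both ends
    )


exact = "[TOOL_CALLS] {}"
trailing = '[TOOL_CALLS] {}, {"name": "toolname"}'

exact_result = prefix_and_full(exact)
trailing_result = prefix_and_full(trailing)
print(exact_result, trailing_result)
```

Under whole-string matching the second input is rejected by re as well, so only the first fsm.accepts result (False for "[TOOL_CALLS] {}") actually contradicts re.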
You might consider a simpler pattern. Please let me know if you have any other questions.
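As one possible direction for a simpler pattern, the negative lookahead can be expanded into plain alternations: a string does not start with a literal if it is a strict prefix of that literal, or if it diverges from it at some position. The sketch below is an assumption on my part, not something from interegular or outlines; the helper not_starting_with is hypothetical, and it is only checked against re here, not against interegular's FSM.

```python
import re


def not_starting_with(literal):
    """Build a lookaround-free regex for strings that do not start with `literal`.

    Hypothetical helper. Matches the behaviour of `(?!literal).*` under
    whole-string matching for inputs without newlines.
    """
    alternatives = []
    for i in range(len(literal)):
        prefix = re.escape(literal[:i])
        bad = re.escape(literal[i])
        # Either the string ends after a strict prefix of the literal,
        # or it continues with a character that differs from literal[i].
        alternatives.append(f"{prefix}(?:[^{bad}\\n].*)?")
    return "(?:" + "|".join(alternatives) + ")"


lookahead = r"(?!\[TOOL_CALLS\]).*"
expanded = not_starting_with("[TOOL_CALLS]")
simplified = r"\[TOOL_CALLS\] \{\}|" + expanded

samples = ["", "hello", "[", "[TOOL", "[TOOL_CALLS", "[TOOL_CALLS]",
           "[TOOL_CALLS] {}", "[TOOL_CALLX] {}", "x[TOOL_CALLS]"]

# Compare both formulations under whole-string matching.
agree = all(
    (re.fullmatch(lookahead, s) is None) == (re.fullmatch(expanded, s) is None)
    for s in samples
)
print(agree)
```

The expanded pattern is longer and harder to read, so it only makes sense if the lookahead is what the FSM conversion trips over.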
Describe the issue as clearly as possible:
Regex generation with certain regex strings will produce strings that don't match the regex.
I have this regex string:
Since interegular has implicit anchoring, I use this instead:
However, with this, I am getting outputs that don't match the original regex (with anchoring). Instead, the outputs are consistent with the regex without anchoring. See this test online, and try to remove/add the ^ and $: https://regex101.com/r/EZIPmV/2. For instance, I'm getting outputs like
The interegular maintainer confirmed that anchoring is implicit in that dependency, so I've narrowed the problem down to outlines itself.
Note: generate will strip out special tokens like [TOOL_CALLS], but you can see them if you modify generate or if you use a different inference engine like vLLM (I was able to produce the same issues there as well).
Steps/code to reproduce the bug:
Expected result:
Error message:
No response
Outlines/Python version information:
Version information
Context for the issue:
No response