Closed border-b closed 2 months ago
Can you upgrade outlines to 0.0.43 and try again?
@rlouf Upgrading to 0.0.43 solves this error, but it produces another one:
Traceback (most recent call last)
File /usr/local/lib/python3.10/site-packages/lark/lexer.py:673, in ContextualLexer.lex(self, lexer_state, parser_state)
672 last_token = lexer_state.last_token # Save last_token. Calling root_lexer.next_token will change this to the wrong token
--> 673 token = self.root_lexer.next_token(lexer_state, parser_state)
674 raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)
File /usr/local/lib/python3.10/site-packages/lark/lexer.py:598, in BasicLexer.next_token(self, lex_state, parser_state)
597 allowed = {"<END-OF-FILE>"}
--> 598 raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,
599 allowed=allowed, token_history=lex_state.last_token and [lex_state.last_token],
600 state=parser_state, terminals_by_name=self.terminals_by_name)
602 value, type_ = res
UnexpectedCharacters: No terminal matches 'e' in the current parser context, at line 1 col 193
98*0.5000000000000022*2.2204460492503131e-
^
Expected one of:
* RPAR
* STAR
* NUMBER
* SLASH
* PLUS
* MINUS
* LPAR
Previous tokens: Token('NUMBER', '2.2204460492503131')
During handling of the above exception, another exception occurred:
UnexpectedCharacters Traceback (most recent call last)
Cell In[2], line 19
17 model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
18 generator = outlines.generate.cfg(model, arithmetic_grammar)
---> 19 sequence = generator("Alice had 4 apples and Bob ate 2. Write an expression for Alice's apples:")
21 print(sequence)
22 # (8-2)
File /usr/local/lib/python3.10/site-packages/outlines/generate/api.py:207, in SequenceGenerator.__call__(self, prompts, max_tokens, stop_at, rng)
205 while True:
206 try:
--> 207 last_state = next(states)
208 if max_tokens or stop_sequences:
209 token_ids = last_state.token_ids
File /usr/local/lib/python3.10/site-packages/outlines/generate/generator.py:80, in sequence_generator(model, sampler, fsms, token_ids, sequence_weights, attention_masks, fsm_states, rng)
75 except IndexError: # Exceeding the context length
76 raise ContextLengthExceededError(
77 "The input length exceeds the context length of the model."
78 )
---> 80 allowed_tokens = get_allowed_tokens(fsms, fsm_states)
81 biased_logits = bias_logits(logits, allowed_tokens)
82 next_token_ids, ancestors, sequence_weights = sampler(
83 biased_logits, sequence_weights, rng
84 )
File /usr/local/lib/python3.10/site-packages/outlines/generate/generator.py:155, in get_allowed_tokens(fsms, fsm_states)
138 def get_allowed_tokens(
139 fsms: List["Guide"], fsm_states: List[int]
140 ) -> List[Optional[Iterable[int]]]:
141 """Get the new instructions for each sequence from the finite-state machine.
142
143 Parameters
(...)
153
154 """
--> 155 return [
156 fsm.get_next_instruction(state).tokens for fsm, state in zip(fsms, fsm_states)
157 ]
File /usr/local/lib/python3.10/site-packages/outlines/generate/generator.py:156, in <listcomp>(.0)
138 def get_allowed_tokens(
139 fsms: List["Guide"], fsm_states: List[int]
140 ) -> List[Optional[Iterable[int]]]:
141 """Get the new instructions for each sequence from the finite-state machine.
142
143 Parameters
(...)
153
154 """
155 return [
--> 156 fsm.get_next_instruction(state).tokens for fsm, state in zip(fsms, fsm_states)
157 ]
File /usr/local/lib/python3.10/site-packages/outlines/fsm/guide.py:349, in CFGGuide.get_next_instruction(self, state)
346 self.regex_fsm_last = proposer
348 interactive = self.parser.parse_interactive(self.generation)
--> 349 interactive.exhaust_lexer()
351 options = {self.terminal_regexps[x] for x in interactive.accepts()}
352 # add %ignore terminals
File /usr/local/lib/python3.10/site-packages/lark/parsers/lalr_interactive_parser.py:52, in InteractiveParser.exhaust_lexer(self)
47 def exhaust_lexer(self) -> List[Token]:
48 """Try to feed the rest of the lexer state into the interactive parser.
49
50 Note that this modifies the instance in place and does not feed an '$END' Token
51 """
---> 52 return list(self.iter_parse())
File /usr/local/lib/python3.10/site-packages/lark/parsers/lalr_interactive_parser.py:43, in InteractiveParser.iter_parse(self)
35 def iter_parse(self) -> Iterator[Token]:
36 """Step through the different stages of the parse, by reading tokens from the lexer
37 and feeding them to the parser, one per iteration.
38
(...)
41 When the parse is over, the resulting tree can be found in ``InteractiveParser.result``.
42 """
---> 43 for token in self.lexer_thread.lex(self.parser_state):
44 yield token
45 self.result = self.feed_token(token)
File /usr/local/lib/python3.10/site-packages/lark/lexer.py:676, in ContextualLexer.lex(self, lexer_state, parser_state)
674 raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)
675 except UnexpectedCharacters:
--> 676 raise e
File /usr/local/lib/python3.10/site-packages/lark/lexer.py:665, in ContextualLexer.lex(self, lexer_state, parser_state)
663 while True:
664 lexer = self.lexers[parser_state.position]
--> 665 yield lexer.next_token(lexer_state, parser_state)
666 except EOFError:
667 pass
File /usr/local/lib/python3.10/site-packages/lark/lexer.py:598, in BasicLexer.next_token(self, lex_state, parser_state)
596 if not allowed:
597 allowed = {"<END-OF-FILE>"}
--> 598 raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,
599 allowed=allowed, token_history=lex_state.last_token and [lex_state.last_token],
600 state=parser_state, terminals_by_name=self.terminals_by_name)
602 value, type_ = res
604 ignored = type_ in self.ignore_types
UnexpectedCharacters: No terminal matches 'e' in the current parser context, at line 1 col 193
98*0.5000000000000022*2.2204460492503131e-
^
Expected one of:
* RPAR
* STAR
* SLASH
* PLUS
* MINUS
Previous tokens: Token('NUMBER', '2.2204460492503131')
It seems the function is executing now, but the output is not following the grammar?
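One possible reading of the traceback, sketched with the standard library only: the text ends in the middle of a number written in scientific notation, so a lexer that takes the longest complete NUMBER is left with a dangling 'e-'. The NUMBER regex and the helper longest_number_prefix below are assumptions made for illustration. They are not taken from the grammar used in the example or from lark.

```python
import re

# Assumed NUMBER pattern (illustrative stand-in, not the grammar's real terminal):
# digits, an optional fraction, and an optional signed exponent.
NUMBER = re.compile(r"\d+(\.\d+)?([eE][+-]?\d+)?")


def longest_number_prefix(text: str) -> tuple[str, str]:
    """Return the longest NUMBER match at the start of text and the remainder."""
    match = NUMBER.match(text)
    if match is None:
        return "", text
    return match.group(0), text[match.end():]


# Tail of the generated text shown in the error message.
partial = "2.2204460492503131e-"
# Hypothetical completion, used only to show that the full token would match.
complete = partial + "16"

# The exponent is incomplete, so the match stops before 'e' and leaves "e-" over.
print(longest_number_prefix(partial))
# With the digits of the exponent present, the whole string is one NUMBER.
print(longest_number_prefix(complete))
```

If this reading is right, the generated text may still be a valid prefix of an expression, and the error would come from lexing an unfinished token rather than from a token the grammar forbids.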
This seems to be the same as the issue I ran into in https://github.com/outlines-dev/outlines/issues/796
I'll be working on getting CFG and the parser in a good state over the coming weeks. You can track progress by subscribing to this issue: https://github.com/outlines-dev/outlines/issues/684
Describe the issue as clearly as possible:
I was trying to run the example of using cfg to guide generation. But it seems there is some issue with CFGGuide. Just running the example without any change produces an error.
Steps/code to reproduce the bug:
Expected result:
Error message:
Outlines/Python version information:
Version information
Context for the issue:
I was trying to test the output with a custom grammar, but the provided example fails to generate any output.