ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Bug: Unable to load grammar from `json.gbnf` example #7991

Closed vecorro closed 2 months ago

vecorro commented 5 months ago

What happened?

I tried to load the json.gbnf grammar example but could not. The following code fails:

from llama_cpp.llama import Llama, LlamaGrammar
import httpx
grammar_text = httpx.get("https://raw.githubusercontent.com/ggerganov/llama.cpp/master/grammars/json.gbnf").text
grammar = LlamaGrammar.from_string(grammar_text)

This throws the following error:

ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty

I'm not sure if the problem resides in the grammar definition file or in the LlamaGrammar class. The problem shows up when I use the .from_file method as well.

Name and Version

Ubuntu 22.04, Python 3.11 (Anaconda), llama_cpp_python 0.2.78

What operating system are you seeing the problem on?

Linux

Relevant log output

parse: error parsing grammar: expecting ')' at {4}) # escapes
  )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)? ([eE] [-+]? [0-9] [1-9]{0,15})? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= | " " | "\n" [ \t]{0,20}

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[14], line 4
      2 import httpx
      3 grammar_text = httpx.get("https://raw.githubusercontent.com/ggerganov/llama.cpp/master/grammars/json.gbnf").text
----> 4 grammar = LlamaGrammar.from_string(grammar_text)

File ~/miniconda3/envs/llama-cpp/lib/python3.11/site-packages/llama_cpp/llama_grammar.py:71, in LlamaGrammar.from_string(cls, grammar, verbose)
     69 parsed_grammar = parse(const_char_p(grammar))  # type: parse_state
     70 if parsed_grammar.rules.empty():
---> 71     raise ValueError(
     72         f"{cls.from_string.__name__}: error parsing grammar file: parsed_grammar.rules is empty"
     73     )
     74 if verbose:
     75     print(f"{cls.from_string.__name__} grammar:", file=sys.stderr)

ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty

matteoserva commented 5 months ago

(I am not a developer.) It looks like a problem in a downstream project. I suggest opening an issue there: https://github.com/abetlen/llama-cpp-python

jabberjabberjabber commented 5 months ago

Look at the older versions of the json.gbnf file, find the one that ggerganov made, and use that.

C0deMunk33 commented 5 months ago

Just ran into this same problem. The older file works; the compiler doesn't seem to like the {4} part of it. I also reverted to the latest version before this change.

TheMrCodes commented 5 months ago

Same problem here: *, ?, and + work for repetition, but curly-bracket forms such as {0,5}, {4}, or {1,16} do not.
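
Since `*`, `?`, and `+` are reported to work, one possible workaround is to rewrite a bounded repetition by hand using only those operators. The sketch below is illustrative only: `expand_repetition` is a hypothetical helper (not part of llama.cpp or llama-cpp-python), it assumes `item` is a single atom such as a character class, a literal, or a parenthesized group, and it has not been checked against the Python parser.

```python
from typing import Optional


def expand_repetition(item: str, min_count: int, max_count: Optional[int] = None) -> str:
    """Rewrite item{min,max} using only concatenation, grouping, '?' and '*'.

    `item` is assumed to be a single GBNF atom, e.g. "[0-9]" or '"a"'.
    max_count=None stands for the open-ended form item{min,}.
    """
    if min_count < 0:
        raise ValueError("min_count must be non-negative")
    if max_count is not None and max_count < min_count:
        raise ValueError("max_count must be >= min_count")

    # The mandatory part is simply min_count copies of the item.
    required = [item] * min_count

    if max_count is None:
        # item{m,} becomes m copies followed by item*
        return " ".join(required + [item + "*"])

    # The optional part is nested so that each extra copy is only
    # allowed after the previous one: (x (x (x)?)?)?
    optional = ""
    for _ in range(max_count - min_count):
        optional = f"({item} {optional})?" if optional else f"({item})?"

    parts = required + ([optional] if optional else [])
    return " ".join(parts)


if __name__ == "__main__":
    print(expand_repetition("[0-9]", 4, 4))
    print(expand_repetition("[0-9]", 0, 2))
    print(expand_repetition("[ \\t]", 1))
```

The nested form keeps the expansion linear in `max_count`. For large bounds such as {0,15} the result is long but still uses only operators that the thread reports as working.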

TheMrCodes commented 5 months ago

OK, after some quick debugging it seems to be a problem with the llama-cpp-python library. They translated the parsing logic into Python code, and that code doesn't support repetition with curly brackets. Reference: https://github.com/abetlen/llama-cpp-python/blob/01bddd669ca1208f1844ce8d0ba9872532641c9d/llama_cpp/llama_grammar.py#L837
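
To check whether a grammar file relies on curly-bracket repetition before handing it to a parser that may not support it, a rough scan along these lines could help. This is a minimal sketch, not the library's logic: `find_brace_repetitions` is a hypothetical name, and the scanner assumes that string literals and character classes do not span lines and that `#` starts a comment outside of them.

```python
import re
from typing import List, Tuple

# Matches {4}, {0,15} and {1,} style repetition suffixes.
_BRACE = re.compile(r"\{\d+(?:,\d*)?\}")


def find_brace_repetitions(grammar_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, operator) for each {m}, {m,} or {m,n} found
    outside string literals, character classes and comments."""
    hits = []
    for lineno, line in enumerate(grammar_text.splitlines(), start=1):
        i = 0
        closer = None  # '"' inside a literal, ']' inside a char class
        while i < len(line):
            ch = line[i]
            if closer is not None:
                if ch == "\\":
                    i += 2  # skip the escaped character
                    continue
                if ch == closer:
                    closer = None
            elif ch == "#":
                break  # rest of the line is a comment
            elif ch == '"':
                closer = '"'
            elif ch == "[":
                closer = "]"
            elif ch == "{":
                match = _BRACE.match(line, i)
                if match:
                    hits.append((lineno, match.group()))
                    i = match.end()
                    continue
            i += 1
    return hits
```

Running it on the grammar text downloaded in the original report would list each line that uses the newer syntax, which makes it easier to decide between an older json.gbnf and a hand-rewritten one.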

TheMrCodes commented 5 months ago

I also tested my grammar file with the llama.cpp CLI and it works as expected.

TheMrCodes commented 5 months ago

Library Issue Reference: https://github.com/abetlen/llama-cpp-python/issues/1547

HanClinto commented 5 months ago

> Just ran into this same problem. The older file works; the compiler doesn't seem to like the {4} part of it. I also reverted to the latest version before this change.

Support for discrete repetition operators was only added about 3 weeks ago in #6640 -- so I'm curious to know exactly where the mismatch is.

taellinglin commented 4 months ago

Has this issue been solved? I'd really like to pass my grammar file as an argument to the API request. Is there a specific way to format it?
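
On the question of passing a grammar in a request: as far as I know, the llama.cpp HTTP server accepts the grammar text itself (not a file path) in a `grammar` field of the JSON body sent to `/completion`. The sketch below works on that assumption. The URL, port, and the helper names `build_completion_payload` and `post_completion` are illustrative, and the server documentation for the version in use should be checked before relying on it.

```python
import json
import urllib.request


def build_completion_payload(prompt: str, grammar_text: str, n_predict: int = 128) -> dict:
    """Build a request body that carries the GBNF grammar as plain text.

    Assumption: the server reads the grammar from a "grammar" field.
    """
    return {"prompt": prompt, "grammar": grammar_text, "n_predict": n_predict}


def post_completion(payload: dict, base_url: str = "http://localhost:8080") -> dict:
    """POST the payload to an assumed local llama.cpp server and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Read the grammar from disk and send its contents, not its path.
    with open("json.gbnf", encoding="utf-8") as handle:
        grammar = handle.read()
    payload = build_completion_payload("Describe a cat as JSON:", grammar)
    print(post_completion(payload))
```

No special formatting of the grammar is needed beyond normal JSON string escaping, which `json.dumps` handles.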

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.