abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Emojis in grammar cause crash #1199

Open runarheggset opened 6 months ago

runarheggset commented 6 months ago

Expected Behavior

When using emojis in a grammar, I expect the grammar to work the same way as in llama.cpp. Here's an example of a grammar that works in llama.cpp but not in llama-cpp-python:

root ::= emoji+
emoji ::= [😀-🙏]

Current Behavior

llama-cpp-python is unable to parse the emojis and instead crashes with an IndexError while parsing the grammar.

Environment and Context

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  24
  On-line CPU(s) list:   0-23
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 9 5900X 12-Core Processor
    CPU family:          25
    Model:               33
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            7400.07
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm
Virtualization features:
  Virtualization:        AMD-V
  Hypervisor vendor:     Microsoft
  Virtualization type:   full
Caches (sum of all):
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    6 MiB (12 instances)
  L3:                    32 MiB (1 instance)
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Mitigation; safe RET, no microcode
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
Linux DESKTOP-P7EJA62 5.15.133.1-microsoft-standard-WSL2 #1 SMP Thu Oct 5 21:02:42 UTC 2023 x86_64 GNU/Linux
Python 3.11.2
GNU Make 4.3
g++ (Debian 12.2.0-14) 12.2.0

Failure Information (for bugs)

This issue seems to be the same as https://github.com/ggerganov/llama.cpp/issues/2501

Steps to Reproduce

Here's a short Python snippet that reproduces the issue:

from llama_cpp import LlamaGrammar

grammar = '''
root ::= emoji+
emoji ::= [😀-🙏]
'''

LlamaGrammar.from_string(grammar)  # raises IndexError (traceback below)

Failure Logs

Traceback (most recent call last):
  File "/home/username/documents/heggiz/worker-llama-cpp/repro.py", line 8, in <module>
    LlamaGrammar.from_string(grammar)
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 68, in from_string
    parsed_grammar = parse(const_char_p(grammar))  # type: parse_state
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 1013, in parse
    pos = parse_rule(state, pos)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 984, in parse_rule
    pos = parse_alternates(state, pos, name, rule_id, False)  # type: const_char_p
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 939, in parse_alternates
    pos = parse_sequence(state, src, rule_name, rule, is_nested)  # type: const_char_p
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 774, in parse_sequence
    char_pair = parse_char(pos)  # type: Tuple[int, const_char_p]
                ^^^^^^^^^^^^^^^
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 664, in parse_char
    return decode_utf8(src)
           ^^^^^^^^^^^^^^^^
  File "/home/username/documents/heggiz/worker-llama-cpp/venv/lib/python3.11/site-packages/llama_cpp/llama_grammar.py", line 561, in decode_utf8
    len = lookup[highbits]  # type: int
          ~~~~~~^^^^^^^^^^
IndexError: tuple index out of range
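
The crash point makes sense: 😀 (U+1F600) encodes to four UTF-8 bytes, 0xF0 0x9F 0x98 0x80, so the high nibble of the leading byte is 0xF (15). The IndexError at lookup[highbits] suggests the Python port's length-lookup tuple has fewer than the 16 entries the C++ grammar parser uses, so 4-byte sequences index past its end. Here is a minimal sketch of a complete decode mirroring llama.cpp's C++ decode_utf8; the bytes-based signature is an assumption for illustration (the library's port works on a const_char_p wrapper instead):

from typing import Tuple

def decode_utf8(src: bytes) -> Tuple[int, bytes]:
    # 16 entries, one per value of the leading byte's high nibble;
    # index 0xF (4-byte sequences, i.e. most emoji) must map to 4.
    lookup = (1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 3, 4)
    first_byte = src[0]
    length = lookup[first_byte >> 4]
    if length == 0:
        raise ValueError("invalid UTF-8 leading byte")
    # Mask off the length-marker bits, then fold in 6 bits from each
    # continuation byte.
    value = first_byte & ((1 << (8 - length)) - 1)
    for byte in src[1:length]:
        value = (value << 6) | (byte & 0x3F)
    return value, src[length:]

assert decode_utf8("😀".encode("utf-8"))[0] == 0x1F600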
DavidGOrtega commented 5 months ago

@heggiz in the meantime this works for me; spelling the range with \U escapes keeps the grammar text pure ASCII, so the parser never has to decode a multi-byte character:

from llama_cpp import LlamaGrammar

grammar_file = 'emojis.gbnf'  # the grammar file shown below
with open('grammars/' + grammar_file, 'r') as file:
    grammar_text = file.read()

grammar = LlamaGrammar.from_string(grammar_text)

emojis.gbnf

root    ::= emoji+
emoji   ::= [\U0001F600-\U0001F64F]
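
For what it's worth, the same escaped range also appears to work when passed inline, without a separate file (a minimal sketch using the same LlamaGrammar.from_string entry point; the raw string matters, since otherwise Python itself would expand \U0001F600 into the raw emoji and reintroduce the crash):

from llama_cpp import LlamaGrammar

# Same grammar as emojis.gbnf, but inline. The r-string keeps the \U
# escapes literal, so the grammar source stays ASCII-only and the
# broken UTF-8 decode path in the parser is never exercised.
grammar_text = r'''
root  ::= emoji+
emoji ::= [\U0001F600-\U0001F64F]
'''

grammar = LlamaGrammar.from_string(grammar_text)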