abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

low level examples broken after [feat: Update sampling API for llama.cpp (#1742)] #1803

Open mite51 opened 1 month ago

mite51 commented 1 month ago

I believe the low-level examples broke after the commit "feat: Update sampling API for llama.cpp (#1742)".

```
Traceback (most recent call last):
  File "/home/jwylie/dev/LLMExplorer/llama_cpp/examples/low_level_api/low_level_api_chat_cpp.py", line 761, in <module>
    m.interact()
  File "/home/jwylie/dev/LLMExplorer/llama_cpp/examples/low_level_api/low_level_api_chat_cpp.py", line 697, in interact
    for i in self.output():
             ^^^^^^^^^^^^^
  File "/home/jwylie/dev/LLMExplorer/llama_cpp/examples/low_level_api/low_level_api_chat_cpp.py", line 664, in output
    cur_char = self.token_to_str(id)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jwylie/dev/LLMExplorer/llama_cpp/examples/low_level_api/low_level_api_chat_cpp.py", line 638, in token_to_str
    n = llama_cpp.llama_token_to_piece(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: this function takes at least 6 arguments (4 given)
```
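
The failing call is llama_token_to_piece, which now takes two extra parameters in the updated bindings. For what it's worth, this is roughly how I got token_to_str working again locally; the trailing (lstrip, special) arguments are my reading of the new signature, so verify against your llama_cpp version:

```python
# Sketch of token_to_str updated for the 6-argument signature.
# The two trailing arguments (lstrip, special) are my reading of the
# updated API, not something documented in the examples.
import ctypes
import llama_cpp

def token_to_str(model, token_id: int) -> bytes:
    size = 32
    buffer = (ctypes.c_char * size)()
    n = llama_cpp.llama_token_to_piece(
        model,                             # llama_model_p
        llama_cpp.llama_token(token_id),
        buffer,
        size,
        0,      # lstrip: leading spaces to strip from the piece
        False,  # special: whether to render special/control tokens
    )
    assert n <= size
    return bytes(buffer[:n])
```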

The LlamaSamplingContext class is still present but no longer used; in fact, it can no longer be used, because sample_repetition_penalties was removed.

Also, a request: an example of how to get sorted candidate data after sampling would be appreciated. I had a fork that used LlamaSamplingContext to store and return candidate data, but that no longer seems possible, or at least it's no longer clear to me how to do something similar.
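
To make the request concrete, here is a rough sketch of the kind of helper I mean, written against the new sampler-chain bindings. Every name here (llama_sampler_apply, llama_sampler_chain_init, the llama_token_data_array fields) is my reading of the updated API after #1742, so treat it as a sketch to verify rather than a working recipe:

```python
# Rough sketch: recovering sorted candidate data with the new sampler
# API instead of LlamaSamplingContext. All binding names are assumptions
# to check against llama_cpp.py.
import ctypes
import llama_cpp

def sorted_candidates(ctx, model, smpl):
    """Apply a sampler chain to the last logits and return (id, p)
    pairs as the chain left them (e.g. sorted after top-k + softmax)."""
    n_vocab = llama_cpp.llama_n_vocab(model)
    logits = llama_cpp.llama_get_logits_ith(ctx, -1)

    # Build the candidate array that the samplers mutate in place.
    data = (llama_cpp.llama_token_data * n_vocab)()
    for token_id in range(n_vocab):
        data[token_id].id = token_id
        data[token_id].logit = logits[token_id]
        data[token_id].p = 0.0
    cur_p = llama_cpp.llama_token_data_array(
        data=data, size=n_vocab, selected=-1, sorted=False
    )

    # The chain filters, renormalizes, and sorts cur_p in place.
    llama_cpp.llama_sampler_apply(smpl, ctypes.byref(cur_p))
    return [(cur_p.data[i].id, cur_p.data[i].p) for i in range(cur_p.size)]

# Example chain, so candidates come back truncated and with
# probabilities filled in:
#   sparams = llama_cpp.llama_sampler_chain_default_params()
#   smpl = llama_cpp.llama_sampler_chain_init(sparams)
#   llama_cpp.llama_sampler_chain_add(smpl, llama_cpp.llama_sampler_init_top_k(40))
#   llama_cpp.llama_sampler_chain_add(smpl, llama_cpp.llama_sampler_init_softmax())
```

If something like this (or a supported equivalent) could live in the low-level examples, that would cover my use case.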