ba0gu0 opened 4 weeks ago
Confirming the same issue: `llama_get_logits_ith: invalid logits id -1, reason: no logits`.
When using https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B-GGUF, setting `embedding=False` works (my default configuration uses `True`).
- Python version: 3.9.16
- llama-cpp-python version: 0.3.1
- Model: Hermes-3-Llama-3.1-8B (GGUF format)
llama_get_logits_ith: invalid logits id -1 error when embedding=True
Expected Behavior
When using llama-cpp-python with a Qwen2 model, chat completion should work normally regardless of whether the `embedding` parameter is enabled.
Current Behavior
The model works fine when `embedding=False`, but throws the error `llama_get_logits_ith: invalid logits id -1, reason: no logits` when `embedding=True`.
Working Code Example
Error Reproduction
Environment Info
Steps to Reproduce
- Initialize `Llama` with `embedding=True`
- Call `create_chat_completion`
Additional Context
The error only occurs when the `embedding` parameter is set to `True`. The model works fine for chat completion when `embedding=False`, suggesting this might be related to how the embedding functionality is implemented for this specific model.