antimatter15 / alpaca.cpp

Locally run an Instruction-Tuned Chat-Style LLM
MIT License
10.25k stars, 911 forks

"failed to tokenize string!" #159

Open gdin2015cs21 opened 1 year ago

gdin2015cs21 commented 1 year ago

I ran into a problem: the program prints "failed to tokenize string!", and I don't understand what its answer means. (image attached)

mvsjober commented 1 year ago

I had the same issue. This change to the code seems to fix the problem for me. I haven't yet had time to understand what is happening, let alone make a pull request:

diff --git a/chat.cpp b/chat.cpp
index 22f0a4d..bc2c770 100644
--- a/chat.cpp
+++ b/chat.cpp
@@ -152,11 +152,11 @@ bool llama_model_load(const std::string & fname, llama_model & model, gpt_vocab
             return false;
         }

-        std::string word;
         for (int i = 0; i < n_vocab; i++) {
             uint32_t len;
             fin.read((char *) &len, sizeof(len));

+            std::string word;
             word.resize(len);
             fin.read((char *) word.data(), len);
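
For anyone who wants to try the pattern outside of chat.cpp, here is a minimal, self-contained sketch of the same idea: read length-prefixed words and build a fresh `std::string` on every iteration. The helper names `write_word` and `read_vocab` and the `token_to_id` map are hypothetical, made up for this illustration, and are not the actual chat.cpp code. One possible explanation for why the patch helps, which nobody in this thread has confirmed, is that older reference-counted (copy-on-write) `std::string` implementations can share one buffer between `word` and the copies already stored in the vocabulary, so writing through a cast `word.data()` pointer may also overwrite earlier entries. The sketch sidesteps that by never reusing the string and by writing through `&word[0]`.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from chat.cpp): append one length-prefixed word.
void write_word(std::ostream & out, const std::string & word) {
    const uint32_t len = static_cast<uint32_t>(word.size());
    out.write(reinterpret_cast<const char *>(&len), sizeof(len));
    out.write(word.data(), len);
}

// Hypothetical helper: read n_vocab length-prefixed words into a
// token -> id map. Returns false if the stream ends early.
bool read_vocab(std::istream & in, int n_vocab,
                std::map<std::string, int> & token_to_id) {
    for (int i = 0; i < n_vocab; i++) {
        uint32_t len = 0;
        if (!in.read(reinterpret_cast<char *>(&len), sizeof(len))) {
            return false;
        }

        // A fresh string per iteration, as in the patched loop, so the
        // buffer is never shared with a previously stored word.
        std::string word(len, '\0');

        // &word[0] yields a non-const pointer without casting away const.
        if (len > 0 && !in.read(&word[0], len)) {
            return false;
        }

        token_to_id[word] = i;
    }
    return true;
}
```

To try it, write a few words into a `std::stringstream` with `write_word`, then pass the same stream to `read_vocab` and check that every word maps to the index it was written at.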