b4rtaz / distributed-llama

Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.

dllama-api fails with "what(): Invalid tokenizer file" #86

Closed · unclemusclez closed 1 month ago

unclemusclez commented 1 month ago
terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid tokenizer file
Aborted
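
For context, this error is thrown while parsing the tokenizer file header. A minimal sketch of that kind of check follows; the function name, magic constant, and file layout here are hypothetical, not the project's actual code:

#include <cstdio>
#include <stdexcept>

// Hypothetical sketch: reject a tokenizer file whose header magic does
// not match what this build expects. Because the loader is compiled
// into each binary separately, a stale binary can reject a tokenizer
// file that a freshly rebuilt one accepts.
void loadTokenizer(const char* path) {
    FILE* file = fopen(path, "rb");
    if (file == nullptr)
        throw std::runtime_error("Cannot open tokenizer file");
    unsigned int magic;
    if (fread(&magic, sizeof(magic), 1, file) != 1 || magic != 0x544F4B5A) { // hypothetical magic value
        fclose(file);
        throw std::runtime_error("Invalid tokenizer file");
    }
    // ... read vocab size, token strings, and scores here ...
    fclose(file);
}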

dllama chat works fine:

ubuntu@ubuntu:~/distributed-llama$ sudo nice -n -20 ./dllama chat --model models/TinyLlama-1.1B-Chat-v1.0/dllama_model_TinyLlama-1.1B-Chat-v1.0_q40.m   --tokenizer models/TinyLlama-1.1B-Chat-v1.0//dllama_tokenizer_TinyLlama-1.1B-Chat-v1.0.t  --weights-float-type q40 --buffer-float-type q80 --nthreads 4  --workers 192.168.2.212:9998 192.168.2.213:9998 192.168.2.214:9998
💡 arch: llama
💡 hiddenAct: silu
💡 dim: 2048
💡 hiddenDim: 5632
💡 nLayers: 22
💡 nHeads: 32
💡 nKvHeads: 4
💡 vocabSize: 32000
💡 seqLen: 2048
💡 nSlices: 4
💡 ropeTheta: 10000.0
📄 bosId: 1
📄 eosId: 2
📄 chatEosId: 2
🕒 ropeCache: 4096 kB
⏩ Loaded 824584 kB
⭐ chat template: zephyr
🛑 stop: </s>
💻 System prompt (optional):
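
The failing dllama-api invocation is not shown above, but it would presumably mirror the chat command with the same model and tokenizer flags (hypothetical reconstruction; the exact options accepted by dllama-api may differ by version):

sudo nice -n -20 ./dllama-api --model models/TinyLlama-1.1B-Chat-v1.0/dllama_model_TinyLlama-1.1B-Chat-v1.0_q40.m --tokenizer models/TinyLlama-1.1B-Chat-v1.0/dllama_tokenizer_TinyLlama-1.1B-Chat-v1.0.t --weights-float-type q40 --buffer-float-type q80 --nthreads 4 --workers 192.168.2.212:9998 192.168.2.213:9998 192.168.2.214:9998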
b4rtaz commented 1 month ago

Have you rebuilt both applications?

make dllama
make dllama-api
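
Both binaries are compiled from the same shared sources, so a change to the tokenizer format only reaches dllama-api once it is recompiled as well. A typical recovery sequence, assuming a standard checkout of the repository:

git pull
make dllama
make dllama-api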
unclemusclez commented 1 month ago

... I rebuilt dllama but not dllama-api 🥸