**Closed** · unclemusclez closed this issue 1 month ago
Running `dllama-api` crashes on startup:

```
terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid tokenizer file
Aborted
```
`dllama chat` works fine:

```
ubuntu@ubuntu:~/distributed-llama$ sudo nice -n -20 ./dllama chat --model models/TinyLlama-1.1B-Chat-v1.0/dllama_model_TinyLlama-1.1B-Chat-v1.0_q40.m --tokenizer models/TinyLlama-1.1B-Chat-v1.0//dllama_tokenizer_TinyLlama-1.1B-Chat-v1.0.t --weights-float-type q40 --buffer-float-type q80 --nthreads 4 --workers 192.168.2.212:9998 192.168.2.213:9998 192.168.2.214:9998
💡 arch: llama
💡 hiddenAct: silu
💡 dim: 2048
💡 hiddenDim: 5632
💡 nLayers: 22
💡 nHeads: 32
💡 nKvHeads: 4
💡 vocabSize: 32000
💡 seqLen: 2048
💡 nSlices: 4
💡 ropeTheta: 10000.0
📄 bosId: 1
📄 eosId: 2
📄 chatEosId: 2
🕒 ropeCache: 4096 kB
⏩ Loaded 824584 kB
⭐ chat template: zephyr
🛑 stop: </s>
💻 System prompt (optional):
```
Have you rebuilt both applications?

```
make dllama
make dllama-api
```
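As an illustrative aside (a toy sketch, not part of the original thread or the distributed-llama build): `make <target>` only rebuilds the target you name, so after updating the sources, rebuilding `dllama` alone leaves `dllama-api` as the old (or missing) binary. A minimal demo with a throwaway Makefile:

```shell
# Toy demo in a temp dir: `make <target>` builds only the named target,
# so rebuilding `dllama` alone leaves `dllama-api` stale or absent.
tmp=$(mktemp -d)
cd "$tmp"
# Hypothetical two-target Makefile standing in for the real build.
printf 'dllama:\n\techo new > dllama\ndllama-api:\n\techo new > dllama-api\n' > Makefile
make dllama >/dev/null            # only this target is (re)built
test -f dllama && echo "dllama: rebuilt"
test -f dllama-api || echo "dllama-api: not rebuilt"
```

This is why both `make dllama` and `make dllama-api` are needed after pulling changes that affect shared code such as the tokenizer loader.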
... I remade dllama, not dllama-api 🥸