Open taiyou2000 opened 1 year ago
I think there is still something wrong with MPT. I also had the larger MPT-30B output weird characters here and there, and yesterday koboldcpp kept crashing with a segmentation fault while generating close to and beyond 2048 tokens. The prompt ingestion worked fine. I'm also not sure how intelligent it's supposed to be: either there is something wrong with it, or it's simply much weaker than a llama-based model.
(EDIT: The segmentation fault was my own fault: I missed the `--contextsize` parameter. Maybe we need to check whether the user gives contradictory values on the CLI and in KoboldAI Lite; see the sketch below.)
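For what it's worth, here is a minimal sketch of the kind of check I mean. The names (`clamp_context`, `cli_contextsize`, `requested_max_context`) are hypothetical for illustration, not actual koboldcpp internals:

```python
# Hypothetical sanity check: if the context size requested by the UI exceeds
# what was allocated via the --contextsize CLI flag, warn and clamp instead
# of letting generation run past the allocated buffer.
def clamp_context(cli_contextsize: int, requested_max_context: int) -> int:
    if requested_max_context > cli_contextsize:
        print(f"Warning: UI requested {requested_max_context} tokens of context, "
              f"but --contextsize is {cli_contextsize}; clamping.")
        return cli_contextsize
    return requested_max_context
```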
Interesting! I think I had a similar problem with TheBloke/PULI-GPT-3SX-GGML (puli-gpt-3sx.ggmlv1.q8_0.bin). It generates garbled characters, gives the unknown token '�' error, and generally produces only nonsense, no matter which temperature or sampling preset I use. I decided it wasn't worth the effort, so I gave up on using it.
I tried to use mpt-7b-ggml-q5_1 (https://huggingface.co/TheBloke/MPT-7B-GGML) with koboldcpp (commit hash: e6ddb15c3a8) on Ubuntu 22.04. It is fine generating English text, but when it comes to characters in languages other than English, it generates garbled output like this:
������
��都市
����都
And the terminal shows: `gpt_tokenize: unknown token '�'`
I also tried to run MPT with PyTorch in Colab and on my own machine, but both ran out of memory, so I can't verify whether this is a ggml-side or a PyTorch/Transformers-side issue; I suspect it is on the ggml side. I also suspected a misconfigured terminal encoding, but my locale is UTF-8 (ja_JP.UTF-8), so the terminal is an unlikely cause. Running the model with https://github.com/ggerganov/ggml directly produced the same result.
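If it is on the ggml side, one plausible mechanism (an assumption on my part, not verified against the ggml code) is that byte-level BPE tokens can end in the middle of a multi-byte UTF-8 sequence; decoding each token's bytes on its own then yields U+FFFD ('�') instead of the intended character. A minimal Python repro of that effect:

```python
# Two CJK characters, 3 UTF-8 bytes each.
text = "都市"
data = text.encode("utf-8")        # b'\xe9\x83\xbd\xe5\xb8\x82'

# Pretend the tokenizer split the byte stream mid-character:
token_a, token_b = data[:4], data[4:]

# Decoding each token independently produces replacement characters,
# matching the garbled output above.
print(token_a.decode("utf-8", errors="replace"))  # '都�' (trailing partial byte)
print(token_b.decode("utf-8", errors="replace"))  # '��'  (two orphaned bytes)
```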
A similar issue seems to have been discussed early on in the llama.cpp repository: https://github.com/ggerganov/llama.cpp/pull/73
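I haven't checked exactly what that PR does, but the usual fix for this class of bug is to buffer the raw token bytes and only emit complete UTF-8 sequences. A Python sketch of that pattern (not koboldcpp's actual code):

```python
import codecs

# An incremental decoder holds back incomplete UTF-8 sequences until the
# remaining bytes arrive, instead of decoding each token on its own.
decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

def stream_tokens(token_bytes_list):
    for chunk in token_bytes_list:
        piece = decoder.decode(chunk)
        if piece:
            print(piece, end="")
    # Flush anything still pending at end of generation.
    print(decoder.decode(b"", final=True))

# Using the split from the repro above, this prints '都市' intact:
stream_tokens([b"\xe9\x83\xbd\xe5", b"\xb8\x82"])
```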