The same failure occurs for the Mosaic model. However, I think I've found the problem. The highlighted line here defines `std::string word;` outside of the vocab-loading loop, and that single string is then updated as each word in the vocabulary is loaded. Simply moving the definition of `word` inside the inner loop seems to allow correct tokenization and inference, at least on my platform/compiler; see the sketch below.
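Here is a minimal sketch of the fixed pattern (my own simplification, not the actual ggml source; `Vocab` and `load_vocab` are hypothetical stand-ins for the real structures):

```cpp
#include <cstdint>
#include <fstream>
#include <map>
#include <string>

// Hypothetical stand-in for ggml's vocab structure.
struct Vocab {
    std::map<std::string, int32_t> token_to_id;
    std::map<int32_t, std::string> id_to_token;
};

void load_vocab(std::ifstream & fin, Vocab & vocab, const int32_t n_vocab) {
    for (int32_t i = 0; i < n_vocab; i++) {
        uint32_t len = 0;
        fin.read((char *) &len, sizeof(len));

        // Declaring `word` here, inside the loop, guarantees a fresh,
        // correctly sized buffer for every token, instead of recycling
        // one string across all iterations.
        std::string word(len, '\0');
        fin.read(&word[0], len);   // &word[0] yields a mutable char *

        vocab.token_to_id[word] = i;
        vocab.id_to_token[i]    = word;
    }
}
```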
Also, `std::string::data()` returns a `const char *` in revisions of the C++ specification prior to C++17. The line in question casts away this const-ness before writing to the underlying storage, which is undefined behaviour, so on my compiler replacing `(char *)word.data()` with `&word[0]` also fixed the issue.
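To make the const-ness point concrete, here is a minimal standalone illustration (my own example, not code from the repository):

```cpp
#include <string>

int main() {
    std::string word(2, '\0');

    // Pre-C++17, data() returns `const char *` only; writing through a
    // cast like this is undefined behaviour:
    //   char * p = (char *) word.data();  // casts away const
    //   p[0] = 'x';                       // UB before C++17

    // Since C++11, operator[] returns a mutable `char &`, so taking the
    // address of the first element is a well-defined way to get writable
    // access to the string's storage:
    char * q = &word[0];
    q[0] = 'o';
    q[1] = 'k';

    return 0;
}
```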
I'm hitting an error while running RedPajama. It's likely the result of a misunderstanding on my part, so I'm hoping somebody can shed some light on what I'm doing wrong.
To begin with, I've cloned ggml at commit 74705055853f7922e9622bdd0a1ebde2b8f57431 and built it with gcc 9.4.0 on Linux x86. This completes without error.
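For context, a typical CMake build of the ggml example binaries looks roughly like this (a sketch only; the `gpt-neox` and `gpt-neox-quantize` target names are assumptions and may differ between revisions):

```sh
# Sketch of a typical ggml build; target names are assumed.
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build && cd build
cmake ..
make -j4 gpt-neox gpt-neox-quantize
```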
I've already cloned https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-3B-v1, so I proceed to ggml conversion and then quantize the model.
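Roughly, the conversion and quantization steps look like the following sketch, assuming the gpt-neox example's converter and quantizer (the script path, output filenames, and the ftype/type arguments are all assumptions and may differ by revision):

```sh
# Sketch only: convert the HF checkpoint to ggml f16, then quantize to q4_0.
python3 examples/gpt-neox/convert-h5-to-ggml.py \
    /path/to/RedPajama-INCITE-Base-3B-v1 1        # 1 = f16 (assumed convention)

./build/bin/gpt-neox-quantize \
    /path/to/RedPajama-INCITE-Base-3B-v1/ggml-model-f16.bin \
    /path/to/RedPajama-INCITE-Base-3B-v1/ggml-model-q4_0.bin 2   # 2 = q4_0 (assumed)
```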
And finally I attempt inference.
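A typical invocation would look something like this sketch (the binary name and the `-m`/`-p` flags are assumptions based on the ggml example programs):

```sh
# Sketch of running the gpt-neox example binary against the quantized model.
./build/bin/gpt-neox \
    -m /path/to/RedPajama-INCITE-Base-3B-v1/ggml-model-q4_0.bin \
    -p "The quick brown fox"
```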
In the output, errors of the form `gpt_tokenize: unknown token 'I'` appear and the generated text is nonsensical. I seem to get the same problem whether I use a 32-bit, 16-bit, or 4-bit model.

Does anything look amiss in the steps I've performed, or in the logs generated during conversion/quantization? Any help at all would be appreciated!