balisujohn opened 10 months ago
(intuition tells me it's because this is a LoRA finetune)
Oh weird, for some reason they added 2 additional word tokens: `2 * 5120 * 2 * 4` bytes.
I'll take them out for now, and think about a way to handle it better.
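As a rough sanity check (not project code), here's how two extra vocab entries can account for the 81920-byte offset. The names and the interpretation of the `2 * 5120 * 2 * 4` figure are assumptions on my part: 5120 is the 13B hidden dim, and I'm guessing the two extra rows show up in two tensors stored as 4-byte floats.

```rust
// Illustrative sketch only; the tensor layout is an assumption, not the
// project's actual loading code.
fn main() {
    let extra_tokens: usize = 2;      // 2 added word tokens
    let hidden_dim: usize = 5120;     // 13B model width
    let tensors_affected: usize = 2;  // e.g. embedding + output head (assumption)
    let bytes_per_elem: usize = 4;    // 4-byte floats (assumption)

    let extra_bytes = extra_tokens * hidden_dim * tensors_affected * bytes_per_elem;
    assert_eq!(extra_bytes, 81920); // matches the observed offset
    println!("extra bytes from added tokens: {extra_bytes}");
}
```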
I got https://huggingface.co/TheBloke/Llama-2-13B-GPTQ to work, but using exactly the same strategy for https://huggingface.co/TheBloke/OpenOrca-Platypus2-13B-GPTQ, I get the following error:
The offset seems to always be 81920, which is `40 * 2048`; both 40 and 2048 appear in the `constants.rs` file for the 13b models, so maybe that's relevant.
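For what it's worth, both expressions mentioned in this thread evaluate to the same 81920, so the match with the numbers in `constants.rs` may just be a coincidence with the extra-token byte math above. A quick check (illustrative only; the meaning of 40 and 2048 is my guess):

```rust
fn main() {
    let from_constants = 40 * 2048;          // the two 13b constants.rs values (assumed meaning)
    let from_extra_tokens = 2 * 5120 * 2 * 4; // extra-token byte math from the comment above
    assert_eq!(from_constants, 81920);
    assert_eq!(from_extra_tokens, 81920);
}
```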