Closed jukofyork closed 4 months ago
We fine-tuned CodeLLaMA-70B-base with the same 4k/10k settings as CodeLLaMA-70B-Instruct.
We used the prompt template described in the model card and the dtype is fp16, but we used the config file of CodeLLaMA-70B-Instruct. Sorry for the ambiguity. We have fixed this issue on Hugging Face.
Closing this as it all seems fixed on HF now.
Has there been a mix-up with the files uploaded to HF, with some of the `CodeLlama-70b-Instruct-hf` files used instead of the `CodeLlama-70b-hf` ones? See https://huggingface.co/openbmb/Eurus-70b-nca/discussions/3
None of the float types seem to match, the chat template is the strange one used by `CodeLlama-70b-Instruct` (including the `<step>` token), and the RoPE frequency and context length both match `CodeLlama-70b-Instruct` rather than `CodeLlama-70b`. But the appendix of the paper quite clearly implies `Eurus-70b` was trained on top of the base `CodeLlama-70b`.
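One quick way to check for this kind of mismatch is to diff the relevant fields of the two repos' `config.json` files. Below is a minimal sketch; the field names are the standard Llama ones (`rope_theta`, `max_position_embeddings`, `torch_dtype`), but the values in the demo dicts are placeholders for illustration, not the actual contents of either repo:

```python
import json

# The fields the issue calls out: RoPE frequency, context length, and dtype.
FIELDS = ("rope_theta", "max_position_embeddings", "torch_dtype")

def load_config(path):
    """Load a Hugging Face config.json as a dict."""
    with open(path) as fh:
        return json.load(fh)

def diff_configs(cfg_a, cfg_b, fields=FIELDS):
    """Return {field: (a_value, b_value)} for every field that differs."""
    return {f: (cfg_a.get(f), cfg_b.get(f))
            for f in fields
            if cfg_a.get(f) != cfg_b.get(f)}

if __name__ == "__main__":
    # Placeholder values for illustration only -- download the real
    # config.json from each repo and pass the paths to load_config() instead.
    expected = {"rope_theta": 1_000_000.0, "max_position_embeddings": 16384,
                "torch_dtype": "bfloat16"}
    uploaded = {"rope_theta": 10_000.0, "max_position_embeddings": 4096,
                "torch_dtype": "float16"}
    for field, (a, b) in diff_configs(expected, uploaded).items():
        print(f"{field}: expected {a!r}, found {b!r}")
```

An empty diff would mean the upload's config matches the intended base model on those fields.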
I'm quite interested to try the model as it's the first fine-tune of `CodeLlama-70b` other than `Phind-70b` (which is private/proprietary). I'm downloading the safetensors data now and am going to try copying the rest of the files from the original `CodeLlama-70b-hf`, then editing in the suggested `[INST] <prompt> [/INST]` chat template to see if it works.

There is already one person who has quantized a GGUF from this possibly corrupt upload, so if there has been a mix-up it would be best to tell them to take it down! :)