When selecting the Llama 3.1 70B model in tinychat (with the tinygrad inference engine), it maps to
NousResearch/Meta-Llama-3.1-70B
which appears to be a non-chat (base) model. Evidence is the following warning that comes up when trying to use this model:
No chat template is set for this tokenizer, falling back to a default class-level template. This is very error-prone, because models are often trained with templates different from the class default! Default chat templates are a legacy feature and will be removed in Transformers v4.43, at which point any code depending on them will stop working. We recommend setting a valid chat template before then to ensure that this model continues working without issues.
In addition, the model is clearly confused about special tokens, so the message above is not just cosmetic.
It seems the correct model to use is
NousResearch/Meta-Llama-3.1-70B-Instruct
I verified it, and it produces better conversational responses. To be clear: I can't say whether it is the best Llama 3.1 70B variant out there, only that it doesn't have the chat-template / tokenization issues the default 70B model has.
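For context, here is a minimal sketch (assuming Meta's published Llama 3.1 prompt format; the function name is mine) of roughly what the Instruct model's chat template produces. The base model was never trained to emit or respect these special tokens, which would explain the confusion described above:

```python
def llama31_prompt(messages):
    # Render a message list into the Llama 3.1 Instruct prompt format:
    # each turn is wrapped in header/end-of-turn special tokens.
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant turn so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

print(llama31_prompt([{"role": "user", "content": "Hi"}]))
```

Without a chat template, the tokenizer falls back to a generic class-level format that doesn't use these tokens, so an Instruct-tuned conversation loop and a base model end up speaking different "protocols".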