Closed Dampfinchen closed 1 year ago
Something is wrong with the tokenizer setup in this model. @Dampfinchen, try running this:
from transformers import AutoTokenizer
AutoTokenizer.from_pretrained("Undi95/Nous-Hermes-13B-Code", trust_remote_code=True)
it produces
Traceback (most recent call last):
File "nstest.py", line 3, in <module>
AutoTokenizer.from_pretrained("Undi95/Nous-Hermes-13B-Code", trust_remote_code=True)
File "/home/user/.local/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 702, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/user/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1841, in from_pretrained
return cls._from_pretrained(
File "/home/user/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2053, in _from_pretrained
raise ValueError(
ValueError: Wrong index found for <pad1>: should be 0 but found 32001.
I've never seen anything like this before; it seems to be coming from added_tokens.json, which was curiously deleted in the NH upstream.
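For context, added_tokens.json is a flat mapping from added token strings to vocabulary indices, and transformers validates those indices when loading. A minimal sketch of inspecting and stripping the offending pad entries, using a throwaway file whose contents mimic the tokens named in the traceback (the directory and token values here are illustrative, not taken from the actual repo):

```python
import json
import tempfile
from pathlib import Path

# Illustrative stand-in for a local download of the model repo.
tokenizer_dir = Path(tempfile.mkdtemp())
added_tokens_path = tokenizer_dir / "added_tokens.json"

# added_tokens.json is a flat {token_string: vocab_index} mapping; these
# entries mimic the pad tokens reported in the traceback.
added_tokens_path.write_text(json.dumps({"<pad1>": 32001, "<pad2>": 32002}))

added_tokens = json.loads(added_tokens_path.read_text())

# Drop the extra pad tokens that trip up the index check, keeping any
# entries that are actually needed.
cleaned = {tok: idx for tok, idx in added_tokens.items()
           if not tok.startswith("<pad")}
added_tokens_path.write_text(json.dumps(cleaned, indent=2))

print(f"removed {len(added_tokens) - len(cleaned)} pad tokens")
```

After cleaning (or simply deleting the file, as the upstream repo did), AutoTokenizer.from_pretrained should no longer hit the index check.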
Is there something specific to this merge that needs those extra tokens? They aren't present in either of the upstream models. In the interest of completing an eval, I will use the tokenizer from NousResearch/Nous-Hermes-Llama2-13b.
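The workaround can be sketched as borrowing the tokenizer from the upstream repo; note this requires network access and the transformers package, and pairing the merged model with the upstream tokenizer is the evaluator's own assumption, not something the merged repo documents:

```python
def load_workaround_tokenizer():
    """Load the upstream Nous Hermes tokenizer as a stand-in for the merged
    repo's broken one (requires network access and transformers installed)."""
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained("NousResearch/Nous-Hermes-Llama2-13b")
```

This works because the merge did not change the base vocabulary, so the upstream tokenizer remains compatible with the merged weights.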
This model performed quite well on the whole, preferring Vicuna-1p3-style prompting for Python and airoboros-plain for JavaScript.
Ah yes, we basically just added these to convert the model to GGUF, which is a bit picky about these things. Thanks for evaluating!
This is an experiment to evaluate whether the impressive coding capabilities of Nous Hermes can be further improved by merging it with Jon Durbin's code LoRA adapter.