the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders
https://huggingface.co/spaces/mike-ravkine/can-ai-code-results
MIT License

Evaluate Undi95/Nous-Hermes-13B-Code #87

Closed: Dampfinchen closed this issue 1 year ago

Dampfinchen commented 1 year ago

This is an experiment to evaluate whether the impressive coding capabilities of Nous Hermes can be further improved by merging it with Jon Durbin's code LoRA adapter.
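
For reference, merges like this are commonly done with the peft library. A minimal sketch of that workflow; the adapter repo id below is a placeholder, since this thread doesn't name the exact adapter used:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then layer the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("NousResearch/Nous-Hermes-Llama2-13b")
model = PeftModel.from_pretrained(base, "jondurbin/code-lora")  # hypothetical adapter id

# Fold the adapter weights into the base weights and save a standalone model.
model = model.merge_and_unload()
model.save_pretrained("Nous-Hermes-13B-Code")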

the-crypt-keeper commented 1 year ago

Something is wrong with the tokenizer setup in this model. @Dampfinchen, try running this:

from transformers import AutoTokenizer
# try loading the tokenizer shipped with the merged model
AutoTokenizer.from_pretrained("Undi95/Nous-Hermes-13B-Code", trust_remote_code=True)

It produces:

Traceback (most recent call last):
  File "nstest.py", line 3, in <module>
    AutoTokenizer.from_pretrained("Undi95/Nous-Hermes-13B-Code", trust_remote_code=True)
  File "/home/user/.local/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 702, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1841, in from_pretrained
    return cls._from_pretrained(
  File "/home/user/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2053, in _from_pretrained
    raise ValueError(
ValueError: Wrong index found for <pad1>: should be 0 but found 32001.

I've never seen anything like this before; it seems to be coming from added_tokens.json, which was curiously deleted in the NH upstream.
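
To see exactly what the loader is objecting to, here's a quick sketch that dumps the added-token table (assuming the file is still present in the repo):

import json
from huggingface_hub import hf_hub_download

# added_tokens.json maps extra token strings to vocab indices; transformers
# expects those indices to continue contiguously from the base vocab.
path = hf_hub_download("Undi95/Nous-Hermes-13B-Code", "added_tokens.json")
with open(path) as f:
    added = json.load(f)
for token, index in sorted(added.items(), key=lambda kv: kv[1]):
    print(token, index)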

Is there something specific to this merge that needs those extra tokens? They aren't present in either of the upstream models. In the interest of completing an eval, I will use the tokenizer from NousResearch/Nous-Hermes-Llama2-13b instead.
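
Concretely, that substitution is just:

from transformers import AutoTokenizer

# Substitute the known-good upstream tokenizer; the merged model shares its
# base vocabulary, so this should behave as a drop-in replacement.
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Hermes-Llama2-13b")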

the-crypt-keeper commented 1 year ago

On the whole this model performed quite well, preferring Vicuna-1p3-style prompting for Python and airoboros-plain for JavaScript.

Dampfinchen commented 1 year ago

Ah yes, we basically just added those tokens to convert the model to GGUF, which is a bit picky about these things. Thanks for evaluating!
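
For anyone else hitting this: if extra pad tokens are needed for a GGUF conversion, a sketch of adding them through the tokenizer API rather than editing added_tokens.json by hand (an assumption about the approach, not necessarily what was done here):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Hermes-Llama2-13b")

# add_tokens assigns indices sequentially from the current vocab size, so
# added_tokens.json stays consistent when the tokenizer is saved back out.
tokenizer.add_tokens(["<pad1>"], special_tokens=True)
tokenizer.save_pretrained("Nous-Hermes-13B-Code")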