meta-llama / codellama

Inference code for CodeLlama models

Cannot run HF example codes on all three codeLlama-Python-hf models #166

Open jessyford opened 7 months ago

jessyford commented 7 months ago

I understand this might be a huggingface-related problem, but I cannot find the answer anywhere, so I've come here to ask for help.

On Hugging Face there is an example code snippet for the CodeLlama model:

from transformers import LlamaForCausalLM, CodeLlamaTokenizer

tokenizer = CodeLlamaTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = LlamaForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
PROMPT = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
input_ids = tokenizer(PROMPT, return_tensors="pt")["input_ids"]
generated_ids = model.generate(input_ids, max_new_tokens=128)

filling = tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(PROMPT.replace("<FILL_ME>", filling))

And the output is like:

def remove_non_ascii(s: str) -> str:
    """ Remove non-ASCII characters from a string.

    Args:
        s: The string to remove non-ASCII characters from.

    Returns:
        The string with non-ASCII characters removed.
    """
    result = ""
    for c in s:
        if ord(c) < 128:
            result += c
    return result

This works fine with the original CodeLlama models and the CodeLlama-Instruct models. But all three CodeLlama-Python models show tons of "Assertion srcIndex < srcSelectDimSize failed" errors and fail to complete the run. The second strange thing is that if I delete the '<FILL_ME>' part of the PROMPT when using a CodeLlama-Python model, the error won't show, but there will still be no output.
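That CUDA assertion typically fires when an input token id is greater than or equal to the number of rows in the embedding table it indexes. A minimal sketch of that mechanism, assuming the infilling special tokens (emitted when the tokenizer encodes `<FILL_ME>`) sit above id 32000, and that the base/Instruct checkpoints have a larger embedding table than a Python-only checkpoint; the specific ids and vocab sizes below are illustrative:

```python
# Sketch of the "Assertion srcIndex < srcSelectDimSize" failure mode:
# any token id >= the embedding table's row count indexes out of bounds.

def ids_out_of_range(input_ids, vocab_size):
    """Return the token ids that would trip the CUDA gather assertion."""
    return [i for i in input_ids if i >= vocab_size]

# Simulated prompt ids: ordinary tokens plus infill specials above 32000
# (the exact ids 32007-32009 are illustrative assumptions).
ids = [1, 822, 3349, 32007, 32008, 32009]

assert ids_out_of_range(ids, 32016) == []  # fits a larger embedding table
assert ids_out_of_range(ids, 32000) == [32007, 32008, 32009]  # would crash
```

This would also explain why removing `<FILL_ME>` makes the assertion disappear: without it, the tokenizer never emits the out-of-range special tokens.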

So my questions are:

  1. Why do these "Assertion srcIndex < srcSelectDimSize failed" errors happen on CodeLlama-Python, and why is there no output after I delete the <FILL_ME> token from the PROMPT? From my point of view, CodeLlama-Python is just further trained on Python tasks, and it should not be fundamentally different from the original CodeLlama and CodeLlama-Instruct.

  2. Why does the README on the Hugging Face page say CodeLlama-Python cannot do infilling? Why would specialization on Python tasks make the model unable to infill? Is the problem in my question 1 related to this lack of infilling in CodeLlama-Python?

Thank you so much for your precious time.

humza-sami commented 6 months ago

As far as I know, CodeLlama-Python is not meant for infilling. Please refer to its documentation.

This model is not fine-tuned on an infilling dataset; it is fine-tuned only for next-token prediction.
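Since the Python variants only do left-to-right completion, one workaround is to strip the fill marker and generate from the prefix alone. A minimal sketch, where the `supports_infilling` flag is a hypothetical per-checkpoint setting you would maintain yourself (it is not a transformers attribute):

```python
# Hedged sketch: route a prompt through infilling or plain completion
# depending on whether the checkpoint supports the <FILL_ME> marker.

FILL = "<FILL_ME>"

def build_prompt(prompt: str, supports_infilling: bool) -> str:
    """For completion-only checkpoints, drop the fill marker and everything
    after it, so the model just continues from the prefix."""
    if FILL in prompt and not supports_infilling:
        return prompt.split(FILL, 1)[0]
    return prompt

# Infilling-capable checkpoint: prompt passes through unchanged.
assert FILL in build_prompt("def f():\n    " + FILL + "\n", True)
# Python-only checkpoint: only the prefix before <FILL_ME> is kept.
assert build_prompt("def f():\n    " + FILL + "\n", False) == "def f():\n    "
```

The truncated prompt then contains no infill special tokens, so it avoids the out-of-range ids behind the "srcIndex < srcSelectDimSize" assertion, at the cost of losing the suffix context.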