salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License
2.65k stars, 391 forks

Failed to do inference after done the Instruction Tuning to Align with Natural Language Instructions #131

Open JustinZou1 opened 11 months ago

JustinZou1 commented 11 months ago

I have completed the instruction tuning with code_alpaca_20k.json.

deepspeed instruct_tune_codet5p.py \
  --load /home/ubuntu/ChatGPT/Models/Salesforce/codet5p-6b --save-dir output/instruct_codet5p_6b --instruct-data-path /home/ubuntu/ChatGPT/Data/alpaca-data/CodeAlpaca-20k/code_alpaca_20k.json \
  --fp16 --epochs 5 --deepspeed deepspeed_config.json 

The final model is in the folder "/home/ubuntu/ChatGPT/CodeGen/CodeT5/CodeT5+/output/instruct_codet5p_6b/final_checkpoint". When I tried to run inference, I got the following issue:

[screenshot of the error: 1689729160126]

And this is the inference code:

(codegen) ubuntu@chatbot-a10:~/ChatGPT/CodeGen/CodeT5/CodeT5+$ cat cli1.py

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import torch

checkpoint = "/home/ubuntu/ChatGPT/Models/Salesforce/codet5p-6b"  # base model (overridden by the next line)
checkpoint = "/home/ubuntu/ChatGPT/CodeGen/CodeT5/CodeT5+/output/instruct_codet5p_6b/final_checkpoint"  # fine-tuned checkpoint that is actually loaded
device = "cuda" # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(
    checkpoint,
    trust_remote_code=True
)

model = AutoModelForSeq2SeqLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    trust_remote_code=True).to(device)

inputs = tokenizer(
    "def print_hello_world():",
    return_tensors="pt").to(device)

inputs['decoder_input_ids'] = inputs['input_ids'].clone()

outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

yuewang-cuhk commented 11 months ago

Hi, this may be because the modeling class is stored in a remote Hugging Face repo (see here). You can download modeling_codet5p.py and configuration_codet5p.py to your local environment, then import the model class from those files to load your fine-tuned checkpoint (without trust_remote_code=True).
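A minimal sketch of that suggestion, under several assumptions that are not confirmed in this thread: the two downloaded files sit in a local package directory (hypothetically named `codet5p_local/`, with an empty `__init__.py`), and the model class in `modeling_codet5p.py` is called `CodeT5pEncoderDecoderModel` (check the downloaded file for the actual name). The helper names `load_local_model_class`, `load_finetuned`, and `run_inference` are illustrative only.

```python
import importlib


def load_local_model_class(package="codet5p_local",
                           module="modeling_codet5p",
                           class_name="CodeT5pEncoderDecoderModel"):
    """Import the model class from a local package instead of the Hub.

    `package` is a hypothetical directory holding modeling_codet5p.py and
    configuration_codet5p.py; `class_name` is an assumption to verify
    against the downloaded file.
    """
    mod = importlib.import_module(f"{package}.{module}")
    return getattr(mod, class_name)


def load_finetuned(checkpoint_dir, package="codet5p_local", **kwargs):
    """Load a checkpoint with the local class; no trust_remote_code needed."""
    model_cls = load_local_model_class(package)
    return model_cls.from_pretrained(checkpoint_dir, **kwargs)


def run_inference(checkpoint_dir, prompt="def print_hello_world():",
                  device="cuda"):
    """Mirror the inference script from the question, using the local class."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
    model = load_finetuned(
        checkpoint_dir,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
    ).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    inputs["decoder_input_ids"] = inputs["input_ids"].clone()
    outputs = model.generate(**inputs, max_length=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `run_inference` with the `final_checkpoint` path from the question would then replace the `AutoModelForSeq2SeqLM` call in the original script. Keeping both files in one package lets a relative import between them resolve, if the modeling file uses one. If the tokenizer files were not saved alongside the fine-tuned weights, the tokenizer may need to be loaded from the base model directory instead.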