salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Recreating the performance from the README's gif? #16

Closed: kiwih closed this issue 2 years ago

kiwih commented 2 years ago

Hi there, I am trying to recreate the suggestion shown in the GIF in the README. Using the example code from the README, I have the following:

from transformers import RobertaTokenizer, T5ForConditionalGeneration

tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

text = """
// convert from one currency to another
"""

input_ids = tokenizer(text, return_tensors="pt").input_ids

# simply generate one code span
generated_ids = model.generate(input_ids, max_length=256)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

However, this does not generate the suggested code; it only gets as far as "public static". What am I doing wrong?

yuewang-cuhk commented 2 years ago

Hi, the model behind the AI coding assistant GIF demo is the CodeT5-base model fine-tuned on an Apex code corpus. Generally, if you want to apply CodeT5 to generate code or summaries, you need to fine-tune the model on your specific downstream task rather than using the pretrained models directly. For code summarization, we have released Salesforce/codet5-base-multi-sum, which you can use out of the box to generate summaries for functions in 6 PLs (Ruby/JavaScript/Go/Python/Java/PHP); see the sketch below.
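
For reference, a minimal usage sketch for the summarization checkpoint, following the same loading pattern as the snippet above (the example function passed in is arbitrary, chosen only for illustration):

from transformers import RobertaTokenizer, T5ForConditionalGeneration

# load the released multilingual code-summarization checkpoint
tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base-multi-sum")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base-multi-sum")

# any function in one of the supported languages; this toy Python function is illustrative
code = """def add(a, b):
    return a + b
"""

input_ids = tokenizer(code, return_tensors="pt").input_ids

# generate a short natural-language summary of the function
generated_ids = model.generate(input_ids, max_length=20)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))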

kiwih commented 2 years ago

That's disappointing. I would recommend removing the GIF or replacing it with one that is more representative of the models actually present in this repository (i.e., have it show the performance of one of the released snapshots). Alternatively, could you share the model that was fine-tuned on Apex?