salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Can the CodeT5 model do code autocompletion without fine-tuning? #21

Closed · frankxu2004 closed this issue 2 years ago

frankxu2004 commented 2 years ago

The readme mentions that CodeT5 is used for code autocompletion in VSCode. How can I use CodeT5 without fine-tuning, as a language model, to complete code given some preceding context?

yuewang-cuhk commented 2 years ago

Yes, it can support completing a code span. In general, CodeT5 without fine-tuning is better suited to code span autocompletion with both preceding and following contexts, since that setting is more aligned with the span denoising objective used in pre-training.
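For anyone landing here later: a minimal sketch of this span-infilling usage with the pre-trained `Salesforce/codet5-base` checkpoint on Hugging Face, where the sentinel token `<extra_id_0>` marks the span to complete. This mirrors the example on the model card; it is not a dedicated autocompletion API.

```python
from transformers import RobertaTokenizer, T5ForConditionalGeneration

tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# Mark the span to be completed with the first sentinel token;
# the model sees both the preceding and following context.
text = "def greet(user): print(f'hello <extra_id_0>!')"
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Generate a short replacement for the masked span.
generated_ids = model.generate(input_ids, max_length=10)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```

The decoded output is the model's guess for the masked span only, not a continuation of the whole file, which is why this works without fine-tuning.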

frankxu2004 commented 2 years ago

Thanks for the quick response. Roughly how long (in number of tokens) is the completed code span usually?

yuewang-cuhk commented 2 years ago

We have included these details in the paper: the average span length is 3 tokens (before subword tokenization).
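A practical implication for the sketch above (my inference, not stated in the thread): since pre-training spans average about 3 tokens, a small generation budget is usually enough and keeps the model from rambling past the masked span, e.g.:

```python
# Spans average ~3 tokens, so a tight length budget avoids over-generation.
generated_ids = model.generate(input_ids, max_length=10, num_beams=4)
```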