salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Clone detection for python code snippets? #63

Closed: TejaswiniiB closed this issue 1 year ago

TejaswiniiB commented 2 years ago

Hi, can we also do clone detection on Python code using the clone task? The README describes clone detection on Java data specifically. But after looking at the code in run_clone.py etc., it looks like the same clone task can be used for clone detection on Python code as well, since the underlying models are the same for every task. Am I correct?

yuewang-cuhk commented 1 year ago

Hi, yes, you can fine-tune CodeT5 models on clone detection with Python code, as their pretraining dataset includes Python.
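
For fine-tuning on Python, the main work is likely preparing the pair data. Below is a minimal sketch of one way to do that, assuming the clone task reads a BigCloneBench-style layout: a JSON Lines file with `idx` and `func` fields, plus a tab-separated file of `idx1`, `idx2`, `label` triples. The file names `data.jsonl` and `train.txt`, the helper `write_clone_dataset`, and the toy snippets are all assumptions made for illustration, so please check them against the data loader that run_clone.py actually uses before relying on this.

```python
import json
import os
import tempfile


def write_clone_dataset(snippets, pairs, out_dir):
    """Write Python snippets and labelled pairs to disk.

    Hypothetical helper: the layout (data.jsonl + train.txt) is an
    assumption modelled on BigCloneBench-style data, not a confirmed
    description of what run_clone.py expects.
    """
    # Validate first so that no partial files are written on bad input.
    for idx1, idx2, label in pairs:
        if idx1 not in snippets or idx2 not in snippets:
            raise KeyError(f"pair ({idx1}, {idx2}) refers to an unknown snippet")
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")

    os.makedirs(out_dir, exist_ok=True)
    data_path = os.path.join(out_dir, "data.jsonl")  # assumed file name
    pairs_path = os.path.join(out_dir, "train.txt")  # assumed file name

    # One JSON object per line: the snippet id and its source code.
    with open(data_path, "w", encoding="utf-8") as f:
        for idx, code in snippets.items():
            f.write(json.dumps({"idx": idx, "func": code}) + "\n")

    # One pair per line: idx1 <TAB> idx2 <TAB> label (1 = clone, 0 = not).
    with open(pairs_path, "w", encoding="utf-8") as f:
        for idx1, idx2, label in pairs:
            f.write(f"{idx1}\t{idx2}\t{label}\n")

    return data_path, pairs_path


# Toy Python snippets: "0" and "1" do the same thing, "2" does not.
snippets = {
    "0": "def add(a, b):\n    return a + b\n",
    "1": "def sum_two(x, y):\n    result = x + y\n    return result\n",
    "2": "def read_file(path):\n    with open(path) as f:\n        return f.read()\n",
}
pairs = [("0", "1", 1), ("0", "2", 0)]

out_dir = tempfile.mkdtemp()
data_path, pairs_path = write_clone_dataset(snippets, pairs, out_dir)
```

The snippets are stored once and the pairs refer to them by id, so the same function can appear in many pairs without duplicating its source. You would still need matching validation and test pair files, and a reasonable balance of clone and non-clone pairs.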