salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Code for normalizing variables #58

Closed: pranavsb closed this issue 1 year ago

pranavsb commented 2 years ago

Hi,

I am trying to reproduce the refine task on my own data. I see that the dataset for the refine task has abstracted types and variables, e.g.

private void METHOD_1 ( java.lang.Class VAR_1 )...

Is the code for this abstraction provided in utils.py? If not, how should I go about it?

pranavsb commented 2 years ago

Looks like the tool used is src2abs.

Was the Tufano dataset used directly, or were any changes made, e.g. changing String to java.lang.String?
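
For anyone landing here who just wants to see what this kind of abstraction does, below is a rough Python sketch. It is not src2abs and not code from this repo: src2abs works on a real Java lexer/parser, while this toy version uses a regex tokenizer and naming-convention heuristics. The function name `abstract_tokens`, the keyword list, and the rule for choosing between `METHOD`, `TYPE`, and `VAR` are all my own assumptions for illustration. Fully qualified names such as `java.lang.Class` are left untouched, and string literals and comments are not handled.

```python
import re

# Toy illustration only: NOT src2abs and NOT part of CodeT5.
# A small, incomplete keyword list (assumption) that is never abstracted.
JAVA_KEYWORDS = {
    "private", "public", "protected", "static", "final", "void", "int",
    "long", "double", "float", "boolean", "char", "byte", "short",
    "return", "if", "else", "for", "while", "new", "class", "this",
    "null", "true", "false",
}

# Dotted or plain identifiers, then numbers, then any single symbol.
TOKEN_RE = re.compile(
    r"[A-Za-z_][A-Za-z_0-9]*(?:\.[A-Za-z_][A-Za-z_0-9]*)*"
    r"|\d+(?:\.\d+)?"
    r"|\S"
)
SIMPLE_ID_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def abstract_tokens(code):
    """Replace simple identifiers with numbered placeholders.

    Returns the abstracted, space-separated token string and the
    identifier -> placeholder mapping.
    """
    tokens = TOKEN_RE.findall(code)
    mapping = {}
    counters = {"METHOD": 0, "TYPE": 0, "VAR": 0}
    out = []
    for i, tok in enumerate(tokens):
        # Keep keywords, symbols, numbers, and dotted names as they are.
        if SIMPLE_ID_RE.fullmatch(tok) is None or tok in JAVA_KEYWORDS:
            out.append(tok)
            continue
        if tok not in mapping:
            # Heuristics (assumptions): capitalized -> type,
            # followed by "(" -> method, otherwise -> variable.
            if tok[0].isupper():
                kind = "TYPE"
            elif i + 1 < len(tokens) and tokens[i + 1] == "(":
                kind = "METHOD"
            else:
                kind = "VAR"
            counters[kind] += 1
            mapping[tok] = f"{kind}_{counters[kind]}"
        out.append(mapping[tok])
    return " ".join(out), mapping


abstracted, mapping = abstract_tokens(
    "private void setType(java.lang.Class clazz)"
)
print(abstracted)
print(mapping)
```

The same identifier always maps to the same placeholder within one call, so the mapping can be used to restore the original names after the model produces its output. For real data, running src2abs itself is the safer route, since the heuristics above will mislabel anything that does not follow standard Java naming conventions.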

yuewang-cuhk commented 1 year ago

Hi, sorry for the slow reply. For this dataset, we directly employ the version from the CodeBERT repo here.