salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

Prediction on new data #14

Closed: surtantheta closed this issue 2 years ago

surtantheta commented 2 years ago

Can you provide a small example of CodeT5's predictions on the Concode dataset for the concode task? I am not sure how to make predictions on new data or which command-line arguments to use.

yuewang-cuhk commented 2 years ago

Hi, to fine-tune CodeT5-base on the concode task, go to the sh folder and run: python run_exp.py --model_tag codet5_base --task concode --sub_task none

If you want to fine-tune on your own dataset, add your task in configs.py (here), then add your data path and the function that reads it in utils.py (here and here). The read function itself can be implemented in _utils.py, similar to this one.
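As a rough illustration of the read function mentioned above, here is a minimal sketch. Everything in it is an assumption made for illustration: the name read_my_task_examples, the Example container, and a JSON-lines input file with "nl" and "code" fields are hypothetical and may not match what the repository actually expects.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """Hypothetical stand-in for the example container used by the repo's readers."""
    idx: int
    source: str  # natural-language input
    target: str  # code the model should generate


def read_my_task_examples(filename: str, data_num: int = -1) -> List[Example]:
    """Read one JSON object per line; the "nl" and "code" keys are assumed names.

    A positive data_num caps how many examples are returned; -1 reads everything.
    """
    examples = []
    with open(filename, encoding="utf-8") as f:
        for idx, line in enumerate(f):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            examples.append(
                Example(
                    idx=idx,
                    source=record["nl"].strip(),
                    target=record["code"].strip(),
                )
            )
            if len(examples) == data_num:
                break
    return examples
```

A function like this would then be referenced from utils.py alongside the data path for the new task, and the new task name added in configs.py, as described above.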