salesforce / CodeT5

Home of CodeT5: Open Code LLMs for Code Understanding and Generation
https://arxiv.org/abs/2305.07922
BSD 3-Clause "New" or "Revised" License

How to do token type prediction task? #85

Open oathaha opened 1 year ago

oathaha commented 1 year ago

The paper describes a task in which the model predicts whether each code token is an identifier. Could you explain in more detail how to do this? I would like to know the data format and how to use the data to train a model.
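To make the question concrete, here is a minimal sketch of the kind of data format I have in mind: one binary label per code token, 1 for an identifier and 0 for anything else. This is only my guess, not the authors' pipeline. It assumes Python source, uses the standard `tokenize` and `keyword` modules, and the helper name `tag_identifiers` is made up for illustration. The labels are per lexer token, so they would still need to be aligned with the model's subword tokens before training.

```python
import io
import keyword
import tokenize

# Token types that carry no code content and are dropped from the output.
_SKIPPED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
    tokenize.ENDMARKER, tokenize.COMMENT, tokenize.ENCODING,
}


def tag_identifiers(source: str):
    """Hypothetical helper: return (token, label) pairs for Python source.

    label is 1 when the token is a NAME that is not a reserved keyword,
    and 0 otherwise. This is an assumed labeling rule, not the paper's.
    """
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        is_identifier = tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
        pairs.append((tok.string, int(is_identifier)))
    return pairs


example = tag_identifiers("def add(a, b):\n    return a + b\n")
tokens = [token for token, _ in example]
labels = [label for _, label in example]
print(tokens)
print(labels)
```

Is a parallel pair of sequences like `tokens` and `labels` roughly what the training data looks like, or is the format different?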