microsoft / CodeXGLUE


Concode preprocessing tokens #91

Closed: songyang-dev closed 2 years ago

songyang-dev commented 2 years ago

Hello, I'd like to know how the data is formatted in the Concode Java dataset. According to the docs, the text input contains special tokens. What do con_elem_sep and con_func_sep mean?

celbree commented 2 years ago

We use CONCODE's original dataset. CONCODE includes the class environment as part of the input, and the con_xx_sep tokens are used to separate the different elements in the class environment. Please refer to the CONCODE paper and its GitHub repository for details.
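
For anyone who wants to inspect the inputs, here is a minimal sketch of how a text input could be split on these separator tokens. It assumes the tokens appear as whitespace-delimited words and that the natural-language description comes before the first separator. The helper name `split_on_tokens` and the sample string are made up for illustration, and the token spellings are the ones used in this thread, so check them against the actual dataset files before relying on this.

```python
import re


def split_on_tokens(text, separators):
    """Split `text` on any of the given separator tokens.

    Returns (head, segments), where `head` is the text before the first
    separator and `segments` is a list of (separator, chunk) pairs in the
    order they appear.
    """
    # Match any separator as a whole word, swallowing surrounding whitespace.
    alternatives = "|".join(re.escape(s) for s in separators)
    pattern = r"\s*\b(" + alternatives + r")\b\s*"

    # Because the pattern has one capturing group, re.split returns
    # [head, sep1, chunk1, sep2, chunk2, ...].
    parts = re.split(pattern, text.strip())
    head = parts[0]
    segments = [(parts[i], parts[i + 1]) for i in range(1, len(parts), 2)]
    return head, segments


# Hypothetical input, NOT taken from the dataset.
sample = (
    "adds two numbers con_elem_sep int total con_elem_sep int count "
    "con_func_sep void reset con_elem_sep int getTotal"
)

# Token spellings as written in this thread (assumption).
description, environment = split_on_tokens(
    sample, ["con_elem_sep", "con_func_sep"]
)
```

With this split, `description` holds the leading natural-language text, and each entry of `environment` records which separator introduced a given element of the class environment. What each separator means semantically is described in the CONCODE paper, not inferred here.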