tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License
1.09k stars 285 forks

Training data #162

Open snowlovehang opened 1 year ago

snowlovehang commented 1 year ago

Hello sir, when I run the code2vec model on the preprocessed java-small dataset, the following problem occurs. I changed the suffix of the dataset files from c2s to c2v. (screenshots: 1661616103(1), 1661616111(1))

urialon commented 1 year ago

Hi @snowlovehang, thank you for your interest in our work!

The preprocessing formats of code2vec and code2seq are similar, but not identical, so just renaming *.c2s to *.c2v will not work. You will need to preprocess the raw data using code2vec's preprocessing pipeline. But why don't you just use code2seq?
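To illustrate why a rename is not enough, here is a rough sanity check one could run on a data file before training. It is only a sketch under stated assumptions, not part of either repository: it assumes each line is a target name followed by space-separated contexts of the form `left,path,right`, that code2vec-style data stores the path as an integer hash, and that code2seq-style data stores it as `|`-separated node names. The function name `guess_format` and the sample lines are hypothetical.

```python
import re

# Assumption: a hashed path is a plain (possibly negative) integer.
HASHED_PATH = re.compile(r"^-?\d+$")


def guess_format(line: str) -> str:
    """Guess whether one data line uses hashed paths or node-name paths.

    Returns "hashed-paths", "node-paths", "mixed", or "unknown".
    """
    parts = line.strip().split(" ")
    # The first field is assumed to be the target name; the rest are contexts.
    contexts = [p for p in parts[1:] if p]
    if not contexts:
        return "unknown"

    hashed = 0
    for ctx in contexts:
        fields = ctx.split(",")
        if len(fields) != 3:
            # Not a "left,path,right" triple, so make no guess.
            return "unknown"
        if HASHED_PATH.match(fields[1]):
            hashed += 1

    if hashed == len(contexts):
        return "hashed-paths"
    if hashed == 0:
        return "node-paths"
    return "mixed"


if __name__ == "__main__":
    # Hypothetical sample lines, made up for illustration only.
    print(guess_format("get|name a,123,b c,-45,d"))
    print(guess_format("get|name a,Nm0|Mc|Nm1,b"))
```

If a file renamed to *.c2v reports node-name paths throughout, it was most likely produced by a different preprocessing pipeline and should be regenerated from the raw data.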

snowlovehang commented 1 year ago

Oh, thank you.