
Is the code for the text-to-code seq2seq model available? #139

Open geajack opened 1 year ago

geajack commented 1 year ago

I'm trying to understand exactly how the baseline results on the text-to-code task were achieved. The very first set of results is for a model called "seq2seq":

| Model | EM | BLEU | CodeBLEU |
|---|---|---|---|
| Seq2Seq | 3.05 | 21.31 | 26.39 |

However, I don't see any code in the repo that implements a seq2seq model, or if it's there I can't tell where it is. All I can see are pre-built model classes being loaded from the transformers library, and it's not clear to me which model corresponds to which row in the Results table in text-to-code/README.md. The original CodeXGLUE paper is also a little short on details as to exactly what the "seq2seq" model is.

Is there code around that I can just run and see that 3.05 EM and 21.31 BLEU pop up on my screen for myself? Maybe even in a different repo?

Thanks

celbree commented 1 year ago

Hi, the Seq2Seq model is based on LSTM and we didn't release that code in our repo. If you're interested in reproducing the results reported in our repo, you can start by fine-tuning GPT-2 or CodeGPT, since Transformers are now more popular than LSTMs.
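
For reference, a minimal sketch of what fine-tuning CodeGPT on a CONCODE-style text-to-code dataset with the Hugging Face transformers API could look like. This is not the repo's actual training script: the checkpoint name, data file paths, field names (`nl`, `code`), and hyperparameters below are illustrative assumptions, not the CodeXGLUE baseline settings.

```python
# Hypothetical fine-tuning sketch, NOT the CodeXGLUE baseline script.
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Assumed CodeGPT checkpoint name; any GPT-2-style causal LM works the same way.
model_name = "microsoft/CodeGPT-small-java-adaptedGPT2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical JSON files with CONCODE-style examples:
# a natural-language description ("nl") paired with the target code ("code").
dataset = load_dataset(
    "json",
    data_files={"train": "concode/train.json", "validation": "concode/dev.json"},
)

def to_features(example):
    # Concatenate description and code so the causal LM learns to continue
    # the description with the corresponding code.
    text = example["nl"] + tokenizer.eos_token + example["code"] + tokenizer.eos_token
    enc = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    enc["labels"] = enc["input_ids"].copy()
    return enc

tokenized = dataset.map(to_features, remove_columns=dataset["train"].column_names)

# Illustrative hyperparameters only.
args = TrainingArguments(
    output_dir="codegpt-concode",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=5e-5,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
).train()
```

At inference time you would prompt the fine-tuned model with the natural-language description (plus the separator token) and generate the code continuation, then score the outputs with EM/BLEU/CodeBLEU as in the text-to-code README.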