jose opened this issue 1 year ago
Hi @jose,
Thanks @NougatCA,
We did some exploratory experiments using gpt2 and distilgpt2 and found that their performance was similar, so we used the latter, which is smaller, for efficiency reasons.
Which one is smaller, distilgpt2 or gpt2?
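For reference, the sizes can be compared directly by loading both checkpoints and counting their parameters. Here is a minimal sketch using the Hugging Face transformers library, assuming the standard model IDs gpt2 and distilgpt2:

```python
# Minimal sketch: compare the parameter counts of the two checkpoints.
# Assumes the standard Hugging Face model IDs "gpt2" and "distilgpt2".
from transformers import AutoModelForCausalLM

for name in ["gpt2", "distilgpt2"]:
    model = AutoModelForCausalLM.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")
```

On the standard checkpoints this reports roughly 124M parameters for gpt2 and 82M for distilgpt2, so distilgpt2 is the smaller of the two.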
As for SynCoBERT, I am very sorry if it is not included in the zip file I provided; that is an oversight on our part. Unfortunately, since I am visiting abroad right now and the model is saved on my desktop in China, I cannot get it at the moment. I will update SynCoBERT as soon as it is available and upload the model to HuggingFace.
Any chance you could provide a date for when the model/code will be available? Thanks in advance.
Thanks, I've just updated the table.
Hi @NougatCA,
I'm trying to understand where you got the models/tokenizers from, so here is a breakdown of all the models evaluated in the empirical study and listed in the pre-print.
(Note: SCELMo [52], OSCAR [60], and CodeDisen [61] have been excluded for several reasons. See the pre-print for more details.)
Questions/Comments regarding the table above:
Regarding the GPT-2 [9] model, why did you use the distilgpt2 model instead of gpt2?
According to the pre-print, there was neither a pre-trained model nor source code for GPT-C [14], C-BERT [13], and DeepDebug [16]. Thus, you re-implemented and pre-trained all of them according to the settings (e.g., tokenizer, hyperparameters, and dataset) described in the original papers. Those are kindly provided by you here, thanks for that (I am loading them as sketched below, after this list).
According to the pre-print, there was neither a pre-trained model nor source code for SynCoBERT [63], and therefore you re-implemented and pre-trained it as described in the original paper. Did you by any chance forget to include SynCoBERT in the zip file you kindly provided here?
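For completeness, this is how I am loading the re-implemented checkpoints from the unpacked zip. A minimal sketch, assuming the checkpoints are in the standard transformers format; the local folder name ./c-bert is hypothetical, so adjust it to whatever directory the archive actually contains:

```python
# Minimal sketch: load a re-implemented checkpoint from a local directory.
# The path "./c-bert" is hypothetical; replace it with the actual folder
# unpacked from the provided zip, assuming standard transformers format.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./c-bert")
model = AutoModel.from_pretrained("./c-bert")
print(model.config)
```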
-- Best, Jose