poojitharamachandra opened this issue 2 years ago
You should load CodeBERT as an encoder model. If you want to use CodeBERT to initialize the encoder of a Seq2Seq model, you can refer to our implementation here: https://github.com/microsoft/CodeXGLUE/blob/main/Code-Text/code-to-text/code/run.py#L261
Could you please elaborate on how to use the pre-trained model to generate descriptions for my code? Is fine-tuning necessary?
Yes, it needs fine-tuning. CodeBERT is only an encoder model. If you want to generate descriptions, you need a Seq2Seq model that uses CodeBERT as the encoder to encode the code and a decoder to generate the comments. The decoder needs to be trained from scratch. Follow the instructions at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Text/code-to-text to fine-tune CodeBERT for this purpose. And if you want better performance, you could try the newest SOTA model, UniXcoder.
Thanks. I was able to fine-tune the model and run inference. I will also check UniXcoder. In the paper you say that 15% of the tokens are masked for MLM. Where do you do this in the code? During inference, does the fine-tuned model predict the whole sentence or just parts of it?
In the paper you say that 15% of the tokens are masked for MLM. Where do you do this in the code?
MLM is for model pre-training, while this repo only fine-tunes these models.
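For context, CodeBERT follows the standard BERT/RoBERTa-style MLM objective during pre-training: 15% of token positions are selected, and of those, 80% are replaced with the mask token, 10% with a random token, and 10% left unchanged. A minimal sketch of that scheme on plain token-ID lists (the mask ID, vocabulary size, and function name here are made up for illustration, not taken from the CodeBERT code):

```python
import random

def mlm_mask(tokens, mask_id=4, vocab_size=1000, mask_prob=0.15, rng=None):
    """Return (masked_tokens, labels) following the 80/10/10 MLM scheme.

    labels[i] holds the original token at selected positions and -100
    elsewhere (-100 is the conventional 'ignore' index for the loss).
    """
    rng = rng or random.Random(0)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)
            r = rng.random()
            if r < 0.8:
                masked.append(mask_id)                    # 80%: [MASK]
            elif r < 0.9:
                masked.append(rng.randrange(vocab_size))  # 10%: random token
            else:
                masked.append(tok)                        # 10%: unchanged
        else:
            labels.append(-100)
            masked.append(tok)
    return masked, labels

toks = list(range(100, 120))
masked, labels = mlm_mask(toks, rng=random.Random(42))
print(sum(l != -100 for l in labels), "of", len(toks), "positions selected")
```

The model is trained to predict the original token only at the selected positions; all other positions are ignored by the loss.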
During inference, does the fine-tuned model predict the whole sentence or just parts of it?
The whole sentence.
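To make "the whole sentence" concrete: generation is autoregressive, so the decoder emits one token at a time, conditioned on the encoder output and everything generated so far, until it produces an end-of-sequence token; the final output is the complete comment. A toy sketch of that loop with a stub in place of the real decoder (the token strings and function names are invented for illustration):

```python
def greedy_decode(next_token_fn, bos, eos, max_len=16):
    """Toy autoregressive loop: append one token at a time until EOS
    (or max_len), then return the full generated sequence."""
    seq = [bos]
    while len(seq) < max_len:
        tok = next_token_fn(seq)
        seq.append(tok)
        if tok == eos:
            break
    return seq

# Stub "decoder" that just walks a canned target sentence; a real model
# would score the vocabulary using the CodeBERT encoder output as context.
target = ["<s>", "returns", "the", "sum", "of", "two", "numbers", "</s>"]
def stub_next(seq):
    return target[len(seq)]

print(greedy_decode(stub_next, "<s>", "</s>"))
```

The actual implementation in run.py uses beam search rather than greedy decoding, but the output is the same kind of thing: one complete sentence per input.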
Thanks. Could you direct me to the repo with the model architecture?
If you mean the pre-training code of CodeBERT, it hasn't been released. If you need the detailed model architecture, please refer to Hugging Face's transformers repo: https://github.com/huggingface/transformers. CodeBERT shares the same architecture as RoBERTa, which you can find there.
Thanks. Have you released the fine-tuned UniXcoder model for the code-summarization task, i.e. one fine-tuned on C projects?
Hi, I downloaded the model from https://huggingface.co/microsoft/codebert-base/tree/main and am using it to run inference (without fine-tuning), but I am unable to load the model file pytorch_model.bin because there is a mismatch in the keys.
The Seq2Seq model expects keys in the format "encoder.encoder.layer.1.attention.output.dense.bias", but the saved model has keys in the format "encoder.layer.1.attention.output.dense.bias".
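A likely cause: the Seq2Seq wrapper in run.py stores the pretrained model as its `encoder` submodule, so its state_dict keys carry an extra `encoder.` prefix that a raw `microsoft/codebert-base` checkpoint does not have. The usual fix is to build the model the way run.py does, loading CodeBERT with `RobertaModel.from_pretrained` and passing it into the Seq2Seq constructor, rather than loading the raw checkpoint into the wrapper directly. If you do need to load the raw checkpoint, the keys can be remapped; a minimal sketch on plain dicts (the helper name is made up, and the tensor value is stubbed with a float):

```python
def add_encoder_prefix(state_dict, prefix="encoder."):
    """Re-key a raw CodeBERT checkpoint so its keys match a Seq2Seq
    wrapper that stores the pretrained model under `self.encoder`."""
    return {prefix + k: v for k, v in state_dict.items()}

raw = {"encoder.layer.1.attention.output.dense.bias": 0.0}
remapped = add_encoder_prefix(raw)
# "encoder.layer.1...." becomes "encoder.encoder.layer.1...."
print(list(remapped))
```

Note that remapping only resolves the encoder keys; the decoder and output-projection weights still do not exist in the raw checkpoint, which is another reason fine-tuning (or using a released fine-tuned checkpoint) is required before inference.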