microsoft / CodeBERT

I downloaded a pre-trained model myself #83

Closed konL closed 2 years ago

konL commented 3 years ago

Hi! I am working on a task of checking whether 2 code fragments are semantically similar (perhaps similar to the codeSearch task), and I have made some simple attempts. I first downloaded "pytorch_model.bin" from https://huggingface.co/microsoft/codebert-base/tree/main and converted it into ckpt form, then loaded it with bert4keras as mentioned in #24:

# load the converted checkpoint with bert4keras
from bert4keras.models import build_transformer_model

config_path = 'pretrain_model/bert_config.json'
checkpoint_path = 'pretrain_model/bert_model.ckpt'
bert_model = build_transformer_model(config_path, checkpoint_path,
                                     seq_len=None)

But I found that the result is no better than with the original BERT model, which is very weird. Did I configure it correctly? If not, could you show me an example?

guoday commented 3 years ago

You need to fine-tune the model first: https://github.com/microsoft/CodeBERT/tree/master/GraphCodeBERT/codesearch#model-and-demo
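For context, a hedged sketch of what that fine-tuning invocation looks like (flag names are paraphrased from the linked README and from this thread; paths and hyperparameters are illustrative, so verify against the current README):

# Sketch of GraphCodeBERT codesearch fine-tuning; --lang and
# --data_flow_length exist only in the GraphCodeBERT version of run.py,
# not the CodeBERT one.
lang=java
python run.py \
    --output_dir=./saved_models/$lang \
    --config_name=microsoft/graphcodebert-base \
    --model_name_or_path=microsoft/graphcodebert-base \
    --tokenizer_name=microsoft/graphcodebert-base \
    --lang=$lang \
    --do_train \
    --train_data_file=dataset/$lang/train.jsonl \
    --eval_data_file=dataset/$lang/valid.jsonl \
    --code_length 256 \
    --data_flow_length 64 \
    --train_batch_size 32 \
    --learning_rate 2e-5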

konL commented 3 years ago

> You need to fine-tune the model first: https://github.com/microsoft/CodeBERT/tree/master/GraphCodeBERT/codesearch#model-and-demo

Thank you for your reply! I have run the command in https://github.com/microsoft/CodeBERT/tree/master/GraphCodeBERT/codesearch#fine-tune, but some errors occurred:

run.py: error: unrecognized arguments: --lang=java --data_flow_length 64

so I deleted these 2 arguments. But then it got stuck as shown below; could you let me know how to solve it? Thank you!

####################################
...... - INFO - __main__ - device: cuda, n_gpu: 1
####################################

guoday commented 3 years ago

Please follow the README in model-and-demo.

konL commented 2 years ago

Hi! Sorry for bothering you again! I have finished fine-tuning the model [screenshot of fine-tuning output]. Then I again loaded it with bert4keras as mentioned in #24 (I converted model.bin into xx.ckpt first):

config_path = 'pretrain_model/bert_config.json'
checkpoint_path = 'pretrain_model/bert_model.ckpt'
bert_model = build_transformer_model(config_path, checkpoint_path,
                                     seq_len=None)

Then I load BERT and stack other layers on top of it, like:

bert = build_transformer_model(config_path=config_path, checkpoint_path=checkpoint_path, ......)
cls = ....(bert.model.output)
dense = keras.layers.Dense(...)(cls)
...
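For what it's worth, a runnable sketch of that pattern in bert4keras (assuming the default return_keras_model=True, so the layers hang off bert.output rather than bert.model.output; the head size, loss, and learning rate are illustrative, not from this thread):

# Sketch: a classification head on top of a bert4keras encoder.
from bert4keras.backend import keras
from bert4keras.models import build_transformer_model

bert = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
)

# Use the first ([CLS]-position) token vector as the sequence representation.
cls = keras.layers.Lambda(lambda x: x[:, 0])(bert.output)
output = keras.layers.Dense(2, activation='softmax')(cls)  # e.g. similar / not similar

model = keras.models.Model(bert.input, output)
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=keras.optimizers.Adam(2e-5),  # small LR, typical for fine-tuning
    metrics=['accuracy'],
)
model.summary()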

I intend to train this model on my dataset, but the result is still no better than with the original BERT model. (1) Am I configuring CodeBERT correctly in this way? (2) Or is there some other reason that would explain my results?

guoday commented 2 years ago

Sorry, since I never use Keras, I am not sure what went wrong. Maybe you need to check whether the conversion from the PyTorch model to the Keras model is correct.
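One hedged way to do that check (my sketch, not from this thread): run the same token IDs through the original PyTorch checkpoint and the converted Keras model and compare the outputs. Note that CodeBERT is RoBERTa-based and RoBERTa offsets its position embeddings, so a plain BERT-style conversion can silently mismatch even when loading succeeds.

# Sanity-check sketch: compare the converted Keras model against the
# original PyTorch checkpoint on identical input IDs. Outputs should agree
# to roughly 1e-5 if the conversion is faithful. bert_model is the
# bert4keras model loaded earlier; the token IDs below are illustrative.
import numpy as np
import torch
from transformers import RobertaModel

hf_model = RobertaModel.from_pretrained("microsoft/codebert-base").eval()

token_ids = np.array([[0, 31414, 232, 2]])   # <s> ... </s> in RoBERTa IDs
segment_ids = np.zeros_like(token_ids)

with torch.no_grad():
    hf_out = hf_model(torch.tensor(token_ids)).last_hidden_state.numpy()

keras_out = bert_model.predict([token_ids, segment_ids])
print(np.abs(hf_out - keras_out).max())      # large values => bad conversion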