microsoft / KEAR

Official code for achieving human parity on CommonsenseQA with External Attention

train model #8

Open WQi777 opened 2 years ago

WQi777 commented 2 years ago

Hello, could you tell me how to reproduce the results in your paper? When I train the model following the instructions in the README, the accuracy I get keeps dropping with each round of training. Do you know what the reason might be? Looking forward to your reply.

xycforgithub commented 2 years ago

Hi WQi777, thanks for your interest. We just added a line to https://github.com/microsoft/KEAR/blob/main/bash/task_train.sh that reproduces the DeBERTa v3 results. Hope that helps!

WQi777 commented 2 years ago

Thanks a lot for your reply!

WQi777 commented 2 years ago

But when I run the code, I get an error:

batch size: 4, total_batch_size: 20
[1528]: world_size = 2, rank = 1, backend=nccl
batch size: 4, total_batch_size: 20
restarting from checkpoint. used_name: last2
restarting from checkpoint. used_name: last2
loading result from dir test/last2
args.fp16 is 0
loading result from dir test/last2
args.fp16 is 0
load_vocab microsoft/deberta-v3-large
load_vocab microsoft/deberta-v3-large
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
load_data data/csqa_ret_3datasets/train_data.json
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
load_data data/csqa_ret_3datasets/train_data.json
data: 9741, world_size: 2
load_data data/csqa_ret_3datasets/dev_data.json
data: 1222, world_size: 2
get dir test/
make dataloader ...
data: 9741, world_size: 2
load_data data/csqa_ret_3datasets/dev_data.json
data: 1222, world_size: 2
get dir test/
make dataloader ...
max len: 968 95 percent len: 490
train_data 9741 total length: 1218
max len: 968 95 percent len: 490
train_data 9741 total length: 1218
max len: 851 95 percent len: 514
devlp_data 1222
init_model test/last2
set config, model_type= debertav2
deepspeed: True
resume_training: True
config_path:test/last2
model_type= debertav2
Traceback (most recent call last):
  File "task.py", line 409, in <module>
    srt.init(Model)
  File "task.py", line 46, in init
    model = ModelClass(lm_config, opt=vars(self.config))
  File "/home/aipf/work/wq/KEAR/model/model.py", line 51, in __init__
    self.deberta = MyDebertaV2Model(config)
NameError: name 'MyDebertaV2Model' is not defined


Looking forward to your reply.

xycforgithub commented 2 years ago

Hi WQi777, we had a typo in our code. Can you try again?