Hi!
I am trying to fine-tune the defect detection task. Currently, the default setup uses the roberta model type with microsoft/codebert as the model path. However, when I change it to a different model type from your model classes, I get a few errors.
First, when I use the distilbert model type, I get a dimension error during the loss calculation. One tensor has size 30522 while the other has size 4. I am not sure where I should change the code.
loss=torch.log(prob[:,0]+1e-10)*labels+torch.log((1-prob)[:,0]+1e-10)*(1-labels)
RuntimeError: The size of tensor a (30522) must match the size of tensor b (4) at non-singleton dimension 1
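My guess at what is going on, as a minimal sketch rather than a confirmed diagnosis: I am assuming the distilbert entry loads a language-modeling head, so the model output has shape (batch, seq_len, vocab) and prob[:,0] ends up as (batch, vocab) instead of (batch,). The helper `broadcast_shape` below is hypothetical (it is not part of this repo or of PyTorch); it only mimics the usual broadcasting rule, and the sizes 4 and 30522 are taken from the error message.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two tensor shapes, or raise ValueError.

    Mimics the usual rule: align shapes from the trailing dimension;
    two sizes are compatible when they are equal or one of them is 1.
    """
    # Pad the shorter shape with leading 1s so both have the same rank.
    rank = max(len(a), len(b))
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)
    result = []
    for dim, (x, y) in enumerate(zip(a, b)):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"size {x} must match size {y} at dimension {dim}")
        result.append(y if x == 1 else x)
    return tuple(result)


batch_size, vocab_size = 4, 30522  # values taken from the error message

# Classification head with a single logit: prob is (batch, 1),
# so prob[:, 0] is (batch,) and multiplies cleanly with labels of shape (batch,).
print(broadcast_shape((batch_size,), (batch_size,)))

# Assumed language-modeling head: prob[:, 0] is (batch, vocab),
# which cannot broadcast against labels of shape (batch,).
try:
    broadcast_shape((batch_size, vocab_size), (batch_size,))
except ValueError as err:
    print("mismatch:", err)
```

If this assumption is right, the loss line itself is fine and the change would need to happen where the model class for distilbert is chosen, but I have not verified that.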
Second, when I use the openai-gpt model type, I get a StopIteration error.
attention_mask = attention_mask.to(dtype=next(self.parameters()).dtype) # fp16 compatibility
StopIteration
It would be very helpful if you could provide instructions on how to fine-tune with the other model types listed in the model classes. Also, I could not find the implementations of the TextCNN and BiLSTM models for which evaluation results are shown. Where can I find them?
Thank you