RuntimeError with SMART

xingjianz commented 2 years ago

I was trying to fine-tune on STS-B using SMART as follows:

python ../train.py --task_def ../experiments/glue/glue_task_def.yml --data_dir ${DATA_DIR} --init_checkpoint ${BERT_PATH} --batch_size ${BATCH_SIZE} --output_dir ${model_dir} --log_file ${log_file} --train_datasets ${train_datasets} --test_datasets ${test_datasets} --adv_train --adv_opt 1

The following error pops up:

Traceback (most recent call last):
  File "../train.py", line 476, in <module>
    main()
  File "../train.py", line 438, in main
    model.update(batch_meta, batch_data)
  File "/content/drive/MyDrive/NLP/smart-roberta/mt-dnn/mt_dnn/model.py", line 259, in update
    adv_loss, emb_val, eff_perturb = self.adv_teacher.forward(*adv_inputs)
  File "/content/drive/MyDrive/NLP/smart-roberta/mt-dnn/mt_dnn/perturbation.py", line 85, in forward
    adv_logits = model(*vat_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/drive/MyDrive/NLP/smart-roberta/mt-dnn/mt_dnn/matcher.py", line 127, in forward
    if fwd_type == 3:

RuntimeError: Boolean value of Tensor with more than one value is ambiguous

I think the problem is that at line 106 of perturbation.py:

for step in range(0, self.K):
    vat_args = [
        input_ids,
        token_type_ids,
        attention_mask,
        premise_mask,
        hyp_mask,
        task_id,
        2,
        embed + noise,
    ]

The arguments do not match the forward function at line 165 in matcher.py:

def forward(
        self,
        input_ids,
        token_type_ids,
        attention_mask,
        premise_mask=None,
        hyp_mask=None,
        task_id=0,
        y_input_ids=None,
        fwd_type=0,
        embed=None,
    ):

I think the 'y_input_ids' is missing in perturbation.py but I am not sure how to fix it. Any solutions or suggestions? Thanks!
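To make the mismatch concrete, here is a minimal sketch of how the positional list binds to the signature quoted above. It uses only the standard library. The stub `forward` and the placeholder strings are stand-ins for illustration, not the real mt-dnn code. With positional passing, `2` lands in `y_input_ids` and `embed + noise` lands in `fwd_type`, so `if fwd_type == 3:` ends up comparing a multi-element tensor, which is what raises the "Boolean value of Tensor" error. Passing the last two values by keyword is one possible fix; that is an assumption on my part, not a confirmed patch from the maintainers.

```python
import inspect


# Stub with the same parameter list as the forward() quoted above.
# The body is omitted; only the signature matters for this illustration.
def forward(self, input_ids, token_type_ids, attention_mask,
            premise_mask=None, hyp_mask=None, task_id=0,
            y_input_ids=None, fwd_type=0, embed=None):
    pass


# Placeholder strings stand in for the real tensors in vat_args.
vat_args = ["input_ids", "token_type_ids", "attention_mask",
            "premise_mask", "hyp_mask", "task_id", 2, "embed + noise"]

sig = inspect.signature(forward)

# Bind positionally, the way model(*vat_args) does ("self" is a placeholder).
positional_binding = dict(sig.bind("self", *vat_args).arguments)
print(positional_binding["y_input_ids"])  # receives the value meant for fwd_type
print(positional_binding["fwd_type"])     # receives the value meant for embed

# Possible fix (assumption): pass fwd_type and embed by keyword instead.
keyword_binding = dict(
    sig.bind("self", *vat_args[:6], fwd_type=2, embed="embed + noise").arguments
)
print(keyword_binding["fwd_type"], keyword_binding["embed"])
```

In perturbation.py the equivalent change would be to call the model with `fwd_type=2, embed=embed + noise` as keyword arguments, or to insert an explicit `None` for `y_input_ids` in the list. I have not verified either against the current repository.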

natuhvnh commented 2 years ago

I have the same issue. Have you fixed it?

natuhvnh commented 2 years ago

@namisan when adv_train is set, I also run into this issue. Do you have a solution?

Traceback (most recent call last):
  File "train.py", line 776, in <module>
    main()
  File "train.py", line 663, in main
    model.update(batch_meta, batch_data)
  File "/home/tunguyen6/Test/text_similarity/mt-dnn/mt_dnn/model.py", line 327, in update
    adv_loss, emb_val, eff_perturb = self.adv_teacher.forward(*adv_inputs)
  File "/home/tunguyen6/Test/text_similarity/mt-dnn/mt_dnn/perturbation.py", line 124, in forward
    (delta_grad,) = torch.autograd.grad(
  File "/home/tunguyen6/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 226, in grad
    return Variable._execution_engine.run_backward(
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

namisan commented 2 years ago

Can you share your Python/PyTorch version? You can also pull my Docker image directly to run the experiment.

natuhvnh commented 2 years ago

Hi @namisan, torch 1.9.1 and Python 3.8.10.

namisan / mt-dnn

RuntimeError with SMART #229