atharsefid / SciBERTSUM

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. #2

Open SJLMax opened 2 years ago

SJLMax commented 2 years ago
Traceback (most recent call last):
  File "train.py", line 108, in <module>
    train_ext(args, device_id)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/train_extractive.py", line 64, in train_ext
    train_single_ext(args, device_id)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/train_extractive.py", line 149, in train_single_ext
    trainer.train(train_iter_fct, args.train_steps)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/trainer_ext.py", line 138, in train
    self._gradient_accumulation( # this is the main function that calculates the loss
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/trainer_ext.py", line 318, in _gradient_accumulation
    sent_scores, mask = self.model(src, sections, token_sections, segs, clss, mask, mask_cls)
  File "/home/shangjl/anaconda3/envs/scibert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/model_builder.py", line 135, in forward
    sent_scores = self.ext_layer(inputs_embeds, sections, attention_mask, extended_attention_mask).squeeze(-1)
  File "/home/shangjl/anaconda3/envs/scibert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/longExtractiveFormer.py", line 148, in forward
    x = self.transformer_inter[i](i, x,
  File "/home/shangjl/anaconda3/envs/scibert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/longExtractiveFormer.py", line 108, in forward
    output = self.self_attn(input_norm,
  File "/home/shangjl/anaconda3/envs/scibert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/longExtractiveFormerAttention.py", line 718, in forward
    self_outputs = self.self(
  File "/home/shangjl/anaconda3/envs/scibert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/longExtractiveFormerAttention.py", line 238, in forward
    attn_output = self._compute_attn_output_with_global_indices(
  File "/home/shangjl/CAIL/SciBERTSUM.git/trunk/src/models/longExtractiveFormerAttention.py", line 545, in _compute_attn_output_with_global_indices
    value_vectors_only_global[is_local_index_global_attn_nonzero] = value_vectors[is_index_global_attn_nonzero].detach().numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

I tried changing it to value_vectors_only_global[is_local_index_global_attn_nonzero] = value_vectors[is_index_global_attn_nonzero].detach().cpu().numpy(), but it still didn't work.
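
For context, the error itself is easy to reproduce in isolation: calling .numpy() on a tensor that lives on the GPU always raises this TypeError, and copying it to host memory first is the documented workaround. A minimal sketch, assuming a CUDA-capable machine:

import torch

t = torch.zeros(2, 2, device="cuda")
# t.numpy()          # raises: TypeError: can't convert cuda:0 device type tensor to numpy...
a = t.cpu().numpy()  # copy the tensor to host memory first, then convert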

harshita-chopra commented 1 year ago

Hi, this change should help.

value_vectors_only_global = torch.Tensor(np.zeros([batch_size, max_num_global_attn_indices, self.num_heads, self.head_dim])).detach().cpu()

value_vectors_only_global[is_local_index_global_attn_nonzero] = value_vectors[is_index_global_attn_nonzero].detach().cpu()
value_vectors_only_global = value_vectors_only_global.numpy()
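
An alternative sketch that avoids the CPU round trip entirely, assuming the code downstream of _compute_attn_output_with_global_indices can keep working with a torch tensor instead of a numpy array (this mirrors the pattern used in the upstream Hugging Face Longformer, if I recall it correctly):

# allocate the buffer with the same device and dtype as value_vectors
value_vectors_only_global = value_vectors.new_zeros(
    batch_size, max_num_global_attn_indices, self.num_heads, self.head_dim
)
# scatter the global value vectors without leaving the GPU
value_vectors_only_global[is_local_index_global_attn_nonzero] = value_vectors[is_index_global_attn_nonzero]
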
oblialbum commented 1 year ago

Maybe you should check your _tensor.py file where the code "return self.numpy()" exists, and change it to "return self.cpu().numpy()".
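
For what it's worth, in recent PyTorch versions that "return self.numpy()" line sits inside Tensor.__array__ in torch/_tensor.py, which NumPy calls whenever a tensor is implicitly converted to an array (for example via np.asarray, or by passing a tensor straight into a NumPy function). Patching the installed PyTorch source does silence the error, but moving the tensor to the CPU before the implicit conversion happens is the less invasive fix. A small sketch of the same failure, assuming a CUDA device:

import numpy as np
import torch

t = torch.ones(3, device="cuda")
# np.asarray(t)            # goes through Tensor.__array__ -> self.numpy() -> same TypeError
ok = np.asarray(t.cpu())   # move to host memory first, then let NumPy convert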