YuanGongND / ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License

Input type (torch.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor #84

Status: Open · michelle-chou25 opened this issue 1 year ago

michelle-chou25 commented 1 year ago

Dear Yuan,

I met this issue when running demo.py. It occurred at line 29 of ast_models.py, self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size), with the following error message: Input type (torch.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor. Would you like to have a look at it? I use:
timm = 0.4.5
torch = 1.10.1+cu102
torchaudio = 0.10.1+cu102
torchvision = 0.11.2+cu102

Thank you Best Regards, Nanjun

YuanGongND commented 1 year ago

Hi Nanjun,

This typically means your input and model are not on the same device (i.e., one on the CPU and the other on the GPU), which can be solved by:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)   # move the model parameters to the GPU (if available)
input = input.to(device)   # move the input tensor to the same device as the model
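
To confirm that both actually ended up on the same device (and with matching dtypes), a minimal check using standard PyTorch attributes, with model and input as above:

print(next(model.parameters()).device, next(model.parameters()).dtype)   # e.g. cuda:0 torch.float32
print(input.device, input.dtype)                                         # should match the line above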

May I ask which demo script you are running? We have a colab demo at https://colab.research.google.com/github/YuanGongND/ast/blob/master/colab/AST_Inference_Demo.ipynb, which should be bug-free.

-Yuan

michelle-chou25 commented 1 year ago

Dear Yuan,

The file I ran was src/demo.py; I also ran the Jupyter notebook demo and didn't have this issue there. I debugged the code: in self.proj(x), x.mlp_head.weight is on cuda, but when self.proj(x) is executed this error occurs.

Best Regards, Nanjun

michelle-chou25 commented 1 year ago

This error is still there after I set both the model and the input to cuda. I'll check it again by changing cudatoolkit to another version.

YuanGongND commented 1 year ago

What if you run the Jupyter notebook with your local environment instead of the Google Colab one? If there is no error, then it's not your environment's problem.

YuanGongND commented 1 year ago

I also think setting the pretrain flags could help:

ast_mdl = ASTModel(label_dim=label_dim, input_tdim=input_tdim, imagenet_pretrain=False, audioset_pretrain=False)
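
For reference, here is a minimal sketch of how these flags fit into the demo flow; the import path and the label_dim/input_tdim values are illustrative (following the pattern in src/demo.py), so adjust them to your setup:

import torch
from ast_models import ASTModel   # ASTModel is defined in src/models/ast_models.py; adjust the import to your layout

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
input_tdim = 100                                             # number of time frames of the input spectrogram
test_input = torch.rand([10, input_tdim, 128]).to(device)    # (batch, time, frequency), as in demo.py
ast_mdl = ASTModel(label_dim=527, input_tdim=input_tdim, imagenet_pretrain=False, audioset_pretrain=False).to(device)
with torch.no_grad():
    prediction = ast_mdl(test_input)                         # expected shape: (10, 527)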

michelle-chou25 commented 1 year ago

I failed to run the Jupyter notebook on my local machine; it said it can't find the path '/content/ast/'. It seems my IDE failed to connect to Colab.

YuanGongND commented 1 year ago

Yes, you need to change the file paths and maybe a few other things to run it on a local machine.
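
For example, a sketch of the kind of change needed (the local path below is a placeholder): point the notebook at your local clone instead of the Colab-specific directory, and skip Colab-only setup cells (e.g. the git clone step):

import os
os.chdir('/path/to/your/local/ast')   # instead of the Colab path '/content/ast'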

michelle-chou25 commented 1 year ago

Thank you, that solves the issue; may I know why? I also tried changing x to x.half(), and a different error message occurred: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper__thnn_conv2d_forward)

YuanGongND commented 1 year ago

I think this again means your input and model are not on the same device. Which specific method solved your issue?

michelle-chou25 commented 1 year ago

Disabling both imagenet_pretrain and audioset_pretrain solved it.

YuanGongND commented 1 year ago

The reason is that it avoids the pretrained weights being loaded onto the CPU. No one has reported this input/model device issue before; maybe not many people have actually run this demo. But since you have a GPU, you could try running the ESC-50 recipe and see if the error is still there. I don't think the CUDA/torch version is the problem.

michelle-chou25 commented 1 year ago

I tried it on another machine; the error was not reproduced.

michelle-chou25 commented 1 year ago

But I changed line 132 in ast_models.py to
self.mlp_head = nn.Sequential(nn.LayerNorm(self.original_embedding_dim), nn.Linear(self.original_embedding_dim, label_dim)).to("cuda")
and line 18 in demo.py to
test_input = torch.rand([10, input_tdim, 128]).to("cuda").half()

michelle-chou25 commented 1 year ago

On the previous machine, the error can still be reproduced even after applying the workaround.
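
As a more general alternative to patching individual lines, a sketch using the ast_mdl/test_input names from the demo: move the whole model and the input to the same device once, and keep the input in float32 (no .half()):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ast_mdl = ast_mdl.to(device)                 # moves every submodule, including the patch-embedding Conv2d and mlp_head
test_input = test_input.to(device).float()   # same device as the model; plain float32 input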

YuanGongND commented 1 year ago

I see, it is a bit weird to me. Thanks for reporting this.

I actually don't think .half() is needed, even though the model was trained with half precision; it should work for any float tensor input. You can do a quick test in the Google Colab environment to see if that is true.
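
For context, a minimal illustration of the dtype half of that error message, using a bare Conv2d as a stand-in for the patch-embedding layer (requires a GPU; shapes are illustrative):

import torch

if torch.cuda.is_available():
    conv = torch.nn.Conv2d(1, 768, kernel_size=16, stride=16).cuda().half()   # half-precision weights on the GPU
    x = torch.rand(1, 1, 128, 100).cuda()   # float32 input on the same device
    # calling conv(x) here fails with the same "Input type ... and weight type ... should be the same" message
    out = conv(x.half())                     # align the input dtype with the weights
    # alternatively, keep the input float32 and convert the weights back: out = conv.float()(x)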

Let's see if anyone else has the same issue.