yangyangxu0 / DeMT

Shape mismatch #4

Open hongvin opened 1 year ago

hongvin commented 1 year ago

Thank you for open-sourcing the code! Great job!

However, when I run the code with the following:

src/main.py --cfg config/t-nyud/swin/baselinemt_tiny_demt.yaml --datamodule.data_dir /lustre/user/hongvin/MTL/NYUD/ --trainer.gpus 0

I get the error:

RuntimeError: shape '[8, 59920, 64]' is invalid for input of size 122716160

Dividing 122716160 by (8 × 59920 × 64) gives exactly 4, so I wonder if there is a missing dimension in between?

EDIT: Looks like

https://github.com/yangyangxu0/DeMT/blob/e5dc5b1debbcfa784812f84acf577813e9a5f0b9/src/model/heads/demt_head.py#L56-L59

gives an output of shape [8, 14980, 256],

while

https://github.com/yangyangxu0/DeMT/blob/e5dc5b1debbcfa784812f84acf577813e9a5f0b9/src/model/heads/demt_head.py#L61

concatenates them, giving [8, 59920, 256].

Therefore,

https://github.com/yangyangxu0/DeMT/blob/e5dc5b1debbcfa784812f84acf577813e9a5f0b9/src/model/heads/demt_head.py#L68

outs[ind] is [8, 14980, 256] while task_cat is [8, 59920, 256].
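For reference, 122716160 = 8 × 59920 × 256, and 59920 = 4 × 14980, which matches four NYUD task feature maps being concatenated. A minimal sketch of that concatenation step, with random tensors standing in for the per-task features (variable names here are illustrative, not the repo's):

```python
import torch

B, HW, C = 8, 14980, 256   # per-task token shape reported above
num_tasks = 4              # NYUD setup: 4 * 14980 = 59920

# one feature tensor per task, each (B, HW, C) = (8, 14980, 256)
task_feats = [torch.randn(B, HW, C) for _ in range(num_tasks)]

task_cat = torch.cat(task_feats, dim=1)
print(task_cat.shape)       # torch.Size([8, 59920, 256])
print(task_cat.numel())     # 122716160 -- the size in the error message
```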

Thank you.

654493176 commented 1 year ago

I have come across the same problem. Have you solved it?

yangyangxu0 commented 1 year ago

inp = outs[ind] + self.smlp2[ind](task_query(outs[ind], task_cat, task_cat)[0])

q = task_query(q, k, v)  # input: q: (8, 14980, 256), k: (8, 59920, 256), v: (8, 59920, 256); returns: q: (8, 14980, 256)

The above process is correct (i.e., query-based attention). The reason you get this error is probably that the Torch version is not correct. We use pytorch-lightning==1.1.8 and torch==1.7.0. Hope this helps you.
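For anyone double-checking those shapes, here is a minimal sketch of that query-based attention with nn.MultiheadAttention. It assumes batch_first=True (only available in torch >= 1.9) and uses small stand-in lengths, since the real 14980/59920 lengths would need far too much memory for a quick test; the thread's real shapes are noted in the comments. embed_dim=256 matches the channel size above, while num_heads=4 is just an illustrative choice.

```python
import torch
import torch.nn as nn

# Stand-in sizes; in the issue these are B=8, L_q=14980, L_kv=59920, C=256.
B, L_q, L_kv, C = 2, 100, 400, 256

# batch_first=True (torch >= 1.9): all tensors are (batch, length, channels).
task_query = nn.MultiheadAttention(embed_dim=C, num_heads=4, batch_first=True)

q = torch.randn(B, L_q, C)    # one task's tokens, like outs[ind]
kv = torch.randn(B, L_kv, C)  # all tasks concatenated, like task_cat

out, _ = task_query(q, kv, kv)  # query-based cross-attention
print(out.shape)                # torch.Size([2, 100, 256]) -- same length as q
```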

654493176 commented 1 year ago

Thanks for your reply, but according to the PyTorch 1.7.0 documentation, the first dimension is the sequence length, while in your code it is the batch size. Is this an error due to my version, or just a writing mistake?

654493176 commented 1 year ago

In addition, if the dimension ordering is right, this means self-attention is computed over 59,920 vectors of 256 dimensions each, which would require hundreds of GB of GPU memory. Did I misunderstand something? Please correct me.
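For what it's worth, even taking it as cross-attention (query length 14980, keys/values 59920) as described above rather than full self-attention over 59,920 tokens, a dense attention map at these sizes is already huge. A rough back-of-the-envelope, assuming float32 and a guessed head count (neither taken from the repo's config):

```python
# Size of the dense attention-weight matrix alone, shape (B * heads, L_q, L_kv).
B, L_q, L_kv = 8, 14980, 59920
num_heads = 4        # illustrative guess, not from the config
bytes_per_el = 4     # float32

attn_bytes = B * num_heads * L_q * L_kv * bytes_per_el
print(f"{attn_bytes / 1024**3:.0f} GiB")   # ~107 GiB
```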

shenxiangkei commented 1 year ago

I'm having the same problem. Have you solved it? Thank you.

Joshuat38 commented 9 months ago

To fix the problem, you have to set batch_first=True wherever nn.MultiheadAttention is used. By default, PyTorch's MultiheadAttention treats the first dimension as the sequence and the second as the batch, so in the example below you should add batch_first=True to the MultiheadAttention constructor (available in recent releases).

https://github.com/yangyangxu0/DeMT/blob/b638e140b0bf4c37e119f2dd8b9945386103a9fc/src/model/heads/demt_head.py#L47
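For concreteness, the change looks roughly like this; the class and attribute names below are placeholders rather than the exact ones at that line of demt_head.py, and batch_first requires torch >= 1.9:

```python
import torch.nn as nn

class TaskAttention(nn.Module):
    """Placeholder standing in for the attention module built in demt_head.py."""
    def __init__(self, dim=256, num_heads=4):
        super().__init__()
        # batch_first=True makes MultiheadAttention read its inputs as
        # (batch, length, channels), which is how the head passes them.
        self.task_query = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, q, kv):
        out, _ = self.task_query(q, kv, kv)
        return out
```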

wangyuankl123 commented 8 months ago

Have you solved this problem? Thank you.