xmindflow / deformableLKA

[WACV 2024] Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation
https://arxiv.org/abs/2309.00121

RuntimeError: The size of tensor a (71680) must match the size of tensor b (32768) at non-singleton dimension 1 #11

Closed infulenceyang closed 5 months ago

infulenceyang commented 5 months ago

Dear author, when I tried to train on the KiPA22 dataset with the Synapse code, I ran into the problem below, and I don't know what is wrong. Is there a problem with the size or format of my data, or do the parameters in the code need to be modified? Thanks for your help; maybe this is an easy problem, but I don't know how to deal with it. I preprocessed the data with nnUNet's code and then copied the processed data into this project's folder.

    Traceback (most recent call last):
      File "d_lka_former/run/run_training.py", line 211, in <module>
        main()
      File "d_lka_former/run/run_training.py", line 195, in main
        trainer.run_training()
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/training/network_training/d_lka_former_trainer_synapse.py", line 490, in run_training
        ret = super().run_training()
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/training/network_training/Trainer_synapse.py", line 321, in run_training
        super(Trainer_synapse, self).run_training()
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/training/network_training/network_trainer_synapse.py", line 485, in run_training
        l = self.run_iteration(self.tr_gen, True)
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/training/network_training/d_lka_former_trainer_synapse.py", line 295, in run_iteration
        output = self.network(data)
      File "/home/javier/conda3/envs/d_lka_net_3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/network_architecture/synapse/d_lka_former_synapse.py", line 153, in forward
        x_output, hidden_states = self.d_lka_former_encoder(x_in)
      File "/home/javier/conda3/envs/d_lka_net_3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/network_architecture/synapse/model_components.py", line 69, in forward
        x, hidden_states = self.forward_features(x)
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/network_architecture/synapse/model_components.py", line 56, in forward_features
        x = self.stages[0](x)
      File "/home/javier/conda3/envs/d_lka_net_3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/javier/conda3/envs/d_lka_net_3d/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
        input = module(input)
      File "/home/javier/conda3/envs/d_lka_net_3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/javier/Desktop/deformableLKA-main/3D/d_lka_former/network_architecture/synapse/transformerblock.py", line 60, in forward
        x = x + self.pos_embed
    RuntimeError: The size of tensor a (71680) must match the size of tensor b (32768) at non-singleton dimension 1

Leonngm commented 5 months ago

Hi, thanks for testing our code. It seems that your position embedding size does not match your feature size. Could you please tell me the input size you are using and also the downsample factor? E.g., we used an input size of 64x128x128 and a downsample factor of (2,4,4) to get embedding features of size 32x32x32.
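
As a rough sanity check (a minimal sketch, not code from this repository), the token count that has to match the position embedding is just the product of the downsampled crop dimensions:

    # Sketch: token count seen by the first transformer stage.
    crop_size = (64, 128, 128)   # chunk fed to the network
    downsample = (2, 4, 4)       # embedding patch size / first-stage downsample factor

    feat = [c // d for c, d in zip(crop_size, downsample)]
    n_tokens = feat[0] * feat[1] * feat[2]
    print(feat, n_tokens)        # [32, 32, 32] 32768 -> the size of tensor b in the error

    # For illustration only: a crop such as 140x128x128 with the same factor would give
    # 70 * 32 * 32 = 71680 tokens, which is the size of tensor a in the error message.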

infulenceyang commented 5 months ago

Thank you for your response! I didn't handle some of the data normalization well; the 3D dataset I used does not have a uniform size. I will do some processing and test your suggestion. I'm embarrassed to say that, during the reproduction, I didn't change much of the code (including the data loading and the model arguments). This is the first time I have processed a 3D dataset. The following are the input sizes I copied from the JSON file generated during the nnUNet preprocessing procedure: (216, 138, 138), (216, 138, 138), (182, 130, 130), (155, 116, 116), (198, 171, 171), (198, 174, 174), (194, 152, 152), (185, 147, 147), (209, 161, 161), (204, 176, 176), (170, 144, 144), (182, 130, 130), (231, 145, 145), (201, 176, 176), (202, 142, 142), (179, 161, 161), (230, 169, 169), (192, 133, 133), (205, 155, 155), (212, 162, 162), (175, 167, 167), (198, 171, 171), (155, 116, 116), (217, 173, 173), (198, 159, 159), (180, 142, 142), (198, 156, 156), (231, 150, 150), (186, 160, 160), (192, 153, 153), (184, 152, 152), (205, 160, 160), (201, 167, 167), (198, 171, 171), (231, 169, 169), (220, 175, 175). Thank you for your help. I will try my best to get the code running following your hints, and once the problem is solved I will close the issue.

Leonngm commented 5 months ago

The sizes you copied are the full volume sizes of your data. For large 3D data, the whole volume is not handed to the network; instead, a smaller chunk is sampled. Could you please tell me the chunk size that you specified? By default it is 64x128x128.

The chunk size / img size is specified here, and the embedding size in the variable embedding_patch_size.
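
The reason the sizes have to agree is that the position embedding inside each transformer block is a learnable parameter with a fixed number of tokens, roughly as in this simplified sketch (not the actual repository code, assuming a UNETR++-style block):

    import torch
    import torch.nn as nn

    class BlockSketch(nn.Module):
        # simplified stand-in for the repository's TransformerBlock
        def __init__(self, input_size: int, hidden_size: int):
            super().__init__()
            # learnable position embedding with a fixed token count: (1, n_tokens, hidden_size)
            self.pos_embed = nn.Parameter(torch.zeros(1, input_size, hidden_size))

        def forward(self, x):
            # x has shape (B, n_tokens, hidden_size); n_tokens must equal input_size
            return x + self.pos_embed

    blk = BlockSketch(input_size=32 * 32 * 32, hidden_size=192)
    blk(torch.zeros(1, 32768, 192))    # works
    # blk(torch.zeros(1, 71680, 192))  # RuntimeError: size of tensor a (71680) must match ... (32768)

So the crop size, the embedding patch size, and the input_size values passed to the network all have to be consistent with each other.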

I hope this is helpful.

infulenceyang commented 5 months ago

Here are some of the arguments.

For the chunk size:

    self.crop_size = [64, 128, 128]
    self.input_channels = self.plans['num_modalities']
    self.num_classes = self.plans['num_classes'] + 1
    self.conv_op = nn.Conv3d
    self.embedding_dim = 192
    self.depths = depths #[2, 2, 2, 2]
    self.num_heads = [6, 12, 24, 48]
    self.embedding_patch_size = [2, 4, 4]
    self.window_size = [4, 4, 8, 4]
    self.deep_supervision = True
    self.trans_block = trans_block
    self.skip_connections = skip_connections

For the model:

    def create_model(name='dlka_former'):
        # Network definition
        if name == 'dlka_former':
            net = D_LKA_Net(in_channels=1,
                            out_channels=num_classes,
                            img_size=[96, 96, 96],
                            patch_size=(2, 2, 2),
                            input_size=[48*48*48, 24*24*24, 12*12*12, 6*6*6],
                            trans_block=TransformerBlock_3D_single_deform_LKA,
                            do_ds=False)
            model = net.cuda()
        return model
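
For reference, a rough sketch (not repository code) of how these per-stage input_size values follow from the crop size and the embedding patch size, assuming each later stage halves every spatial dimension as in the 96x96x96 pancreas config above (48 -> 24 -> 12 -> 6):

    # Sketch: per-stage token counts implied by crop size and embedding patch size,
    # assuming every later stage halves each spatial dimension (48 -> 24 -> 12 -> 6 above).
    import numpy as np

    def stage_input_sizes(crop_size, embedding_patch_size, num_stages=4):
        feat = np.array(crop_size) // np.array(embedding_patch_size)
        sizes = []
        for _ in range(num_stages):
            sizes.append(int(np.prod(feat)))
            feat = feat // 2
        return sizes

    print(stage_input_sizes((96, 96, 96), (2, 2, 2)))    # [110592, 13824, 1728, 216] = [48^3, 24^3, 12^3, 6^3]
    print(stage_input_sizes((64, 128, 128), (2, 4, 4)))  # [32768, 4096, 512, 64]     = [32^3, 16^3, 8^3, 4^3]

The first list matches the pancreas call above; the second is what a 64x128x128 crop with embedding_patch_size (2, 4, 4) would imply, and 32768 is exactly the size of tensor b in the error.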

It seems that I didn't modify any arguments. One reason I came to you for help is that I couldn't find the data loading code /(ㄒoㄒ)/~~. Your link to the chunk size is the first time I learned where that is in the code. Honestly, I know this kind of hand-holding for rookies is laborious and shouldn't be necessary. Thank you again from the bottom of my heart. If you don't mind, add me on WeChat so I can send you a cup of milk tea; it would make me feel better about all your help.

infulenceyang commented 5 months ago

No no no, I just copied the wrong arguments (the pancreas model). I'm sorry for that; wait a moment.

infulenceyang commented 5 months ago

Oh! Maybe I've found what I should do. I should change the patch_size in the pkl file, right? Change the patch_size to [64, 128, 128] as well? Let me try!
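
If editing the plans file is the route taken, here is a minimal sketch of what that could look like (the path and task name below are placeholders, and the plans_per_stage / patch_size keys follow the usual nnU-Net v1 layout, so double-check them against your own file):

    # Minimal sketch of editing the nnU-Net plans pickle; the path is a placeholder.
    import pickle
    import numpy as np

    plans_path = "nnUNet_preprocessed/TaskXXX_KiPA22/nnUNetPlansv2.1_plans_3D.pkl"

    with open(plans_path, "rb") as f:
        plans = pickle.load(f)

    stage = max(plans["plans_per_stage"].keys())          # the full-resolution stage
    print(plans["plans_per_stage"][stage]["patch_size"])  # current patch size

    # force the patch size the trainer / position embeddings expect
    plans["plans_per_stage"][stage]["patch_size"] = np.array([64, 128, 128])

    with open(plans_path, "wb") as f:
        pickle.dump(plans, f)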

infulenceyang commented 5 months ago

Thank you!!! I finished the reproduction! The code is running. Thank you very much! Maybe in the future I will cite this paper! And now, let me close this issue. Really, thank you very much.