linnabraham closed this issue 3 months ago
Hi @linnabraham, this looks like a shape mismatch issue. Did you check your input data shape before sending it to the model?
@KumoLiu I did now, and it seems the Decathlon data shape is not compatible with the V-Net. I was not expecting that, since I had earlier used the same data with the TensorFlow implementation of V-Net (https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/VNet).
But I found an issue with the V-Net implementation: the out_channels seems to be hard-coded as 16, which implies that the in_channels could only be 16 or 1. I have re-opened a bug report that was closed without a proper fix: https://github.com/Project-MONAI/MONAI/issues/4896
The in_channels can be any value that divides 16, since the input is repeated to fill 16 channels:
https://github.com/Project-MONAI/MONAI/blob/64ea76d83a92b7cf7f13c8f93498d50037c3324c/monai/networks/nets/vnet.py#L211
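To illustrate that constraint, here is a rough NumPy sketch of the input transition in the linked vnet.py, assuming the input is tiled along the channel axis to match the 16-channel convolution output (the function name and shapes are illustrative, not MONAI's API):

```python
import numpy as np

def input_transition(x, out_ch=16):
    # Sketch of V-Net's first block: the input is tiled along the channel
    # axis so it can be added (residually) to the out_ch-channel conv output.
    # This only works when in_ch divides out_ch evenly.
    in_ch = x.shape[0]
    if out_ch % in_ch != 0:
        raise ValueError(f"out_ch ({out_ch}) must be divisible by in_ch ({in_ch})")
    return np.tile(x, (out_ch // in_ch, 1, 1))

x16 = input_transition(np.zeros((1, 128, 128)))
print(x16.shape)  # (16, 128, 128): a 1-channel image tiled to 16 channels
```

So a 1-channel or 16-channel input works, while something like 3 channels raises an error.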
In TensorFlow, they also set out_channels as 16:
https://github.com/NVIDIA/DeepLearningExamples/blob/729963dd47e7c8bd462ad10bfac7a7b0b604e6dd/TensorFlow/Segmentation/VNet/model/vnet.py#L34
Thanks for pointing out the TensorFlow code, but I am still confused. My input has shape (64, 128, 128). I edited the source code to remove the hard-coded 16, but no matter what I pass as out_channels (1, 16, 64, 128), I get a shape mismatch error. What do I do?
If your shape is (64, 128, 128), then your spatial_dims should be 2, since 64 is the channel dim.
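As a sanity check, here is a plain NumPy sketch of the two possible readings of that shape (the variable names are illustrative):

```python
import numpy as np

img = np.zeros((64, 128, 128))

# Reading 1 (channel-first 2D): 64 channels of 128x128 slices
# -> spatial_dims=2, in_channels=64.
channels, h, w = img.shape

# Reading 2 (single-channel 3D volume): a channel axis must be added
# explicitly, giving (1, 64, 128, 128) -> spatial_dims=3, in_channels=1.
vol = img[np.newaxis]

print(channels, vol.shape)  # 64 (1, 64, 128, 128)
```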
Thanks for pointing that out. I set it to 2. I could not use 16 as out_channels, so I tried 64. Now I get this error:

RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [32, 64, 128, 128, 1]
Hi @linnabraham, can you try removing EnsureChannelFirstd(keys=["image", "label"]), so that transform is given by

transform = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        ScaleIntensityd(keys="image"),
        ToTensord(keys=["image", "label"]),
    ]
)
because it looks like EnsureChannelFirstd adds an extra singleton dimension to the image, making it 3D instead of 2D.
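A minimal NumPy sketch of what seems to be happening (the position of the extra axis is an assumption based on the error above):

```python
import numpy as np

img = np.zeros((64, 128, 128))       # (channels, H, W) for a 2D network

# If a singleton channel axis is appended to each sample, batching 32 samples
# produces a 5D tensor, which conv2d rejects (it expects 4D batched input).
with_extra = img[..., np.newaxis]    # (64, 128, 128, 1)
batch = np.stack([with_extra] * 32)  # (32, 64, 128, 128, 1) -> the error above
print(batch.shape)

# Without the extra axis, the batch has the 4D shape conv2d expects.
ok_batch = np.stack([img] * 32)      # (32, 64, 128, 128)
print(ok_batch.shape)
```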
Describe the bug
PyTorch complains of a size mismatch when using V-Net with Medical Decathlon data.

To Reproduce

Expected behavior
Training happens.

Screenshots

Complete Traceback

Environment
Ensuring you use the relevant python executable, please paste the output of: