RuntimeError: The size of tensor a (128) must match the size of tensor b (4) at non-singleton dimension 2

lukaszbinden commented 1 year ago

Hi Zhitong

Thanks a lot for releasing the code. I am trying to train the model on LIDC. After start, an error occurs immediately:

/storage/homefs/lz20w714/anaconda3/envs/mose/lib/python3.8/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
Traceback (most recent call last):
  File "main.py", line 50, in <module>
    model.train(data)
  File "/storage/homefs/lz20w714/git/mose-auseg/engine.py", line 52, in train
    self.validate(data)
  File "/storage/homefs/lz20w714/git/mose-auseg/engine.py", line 112, in validate
    metrics,prediction,prob = self.net.forward(patch_arrangement, masks_arrangement, prob_gt, val = True)
  File "/storage/homefs/lz20w714/anaconda3/envs/mose/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/storage/homefs/lz20w714/anaconda3/envs/mose/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/storage/homefs/lz20w714/git/mose-auseg/models/MoSE.py", line 146, in forward
    metric = metrics.cal_metrics_batch((pred.argmax(2)).long(), (label).long(), sample_probs, prob_gt,
  File "/storage/homefs/lz20w714/git/mose-auseg/utils/metrics.py", line 102, in cal_metrics_batch
    d_sy = get_cost_matrix(sample_arr, gt_arr, M, N, d_sy, label_range=label_range)
  File "/storage/homefs/lz20w714/git/mose-auseg/utils/metrics.py", line 46, in get_cost_matrix
    cij = (dist_fct(sample_arr[:, i, ...], gt_arr[:, j, ...], label_range=label_range))
  File "/storage/homefs/lz20w714/git/mose-auseg/utils/metrics.py", line 18, in iou_dist
    intersection = torch.sum(m1 * m2, dim=[-1, -2]) # keep batch and class dimension
RuntimeError: The size of tensor a (128) must match the size of tensor b (4) at non-singleton dimension 2

Thanks in advance for your help.

lukaszbinden commented 1 year ago

found a fix: I commented out line 47 in data/lidc_dataset.py:

# y = y.transpose(2,0,1)

now it's training.
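
A possible explanation for why removing the transpose helps, sketched with plain shape tuples. This is only a guess at the cause: the mask layouts, the batch size of 8, and the 2 classes below are assumptions for illustration, not values taken from the repository, and the helper functions are hypothetical. The idea is that `transpose(2,0,1)` is correct when the four annotator masks are stored last, but scrambles the axes when they are already stored first.

```python
# Shape-only sketch: plain tuples stand in for arrays, so no NumPy/PyTorch is needed.

def transposed_shape(shape, axes):
    """Shape after ndarray.transpose(*axes): new axis k takes old axis axes[k]."""
    return tuple(shape[a] for a in axes)

def first_broadcast_mismatch(shape_a, shape_b):
    """Return the first dimension where two equal-rank shapes cannot broadcast
    (sizes differ and neither is 1), or None if they are compatible."""
    for dim, (a, b) in enumerate(zip(shape_a, shape_b)):
        if a != b and a != 1 and b != 1:
            return dim
    return None

# Assumed mask layouts (hypothetical, not confirmed in the thread).
annotators_last = (128, 128, 4)    # H, W, annotators
annotators_first = (4, 128, 128)   # annotators, H, W

# Annotators-last input: the transpose produces the annotators-first layout.
fixed = transposed_shape(annotators_last, (2, 0, 1))

# Annotators-first input: the same transpose moves the annotator axis to the middle.
scrambled = transposed_shape(annotators_first, (2, 0, 1))

# Assumed batch and class dimensions (8 and 2 are made-up values).
pred = (8, 2, 128, 128)            # one predicted sample: B, C, H, W
gt = (8, 2) + scrambled[1:]        # one "annotation" sliced from the scrambled mask

print(fixed, scrambled, first_broadcast_mismatch(pred, gt))
```

Under these assumptions the scrambled ground truth has size 4 where the prediction has size 128, which is the kind of mismatch the error message reports. Printing `y.shape` just before line 47 of data/lidc_dataset.py would show which layout the data actually uses.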

lukaszbinden commented 1 year ago

for reference, I was able to train and test the model on LIDC, and got these results:

2023-05-17 09:19:01,785 INFO ****Running Experiment: MoSE_run****
2023-05-17 09:19:07,087 INFO model size: 41.60 / MB
2023-05-17 09:19:07,608 INFO Loading model ./logs/lidc/MoSE_run/MoSE_run_best_ged.pth
2023-05-17 09:20:27,032 INFO - Mean GED: 0.210
2023-05-17 09:20:27,033 INFO - Mean M-IoU: 0.621
2023-05-17 09:20:27,033 INFO - Mean ECE: 0.083%

gaozhitong commented 1 year ago

Hi, Lukas. Thank you very much for your feedback on the bug; I will fix it later. I apologize for the late response, as I was really busy last week preparing my NeurIPS submission. Feel free to contact me if there are any other problems.

gaozhitong / MoSE-AUSeg
