taozh2017 / HiNet

Code for TMI 2020 "Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis"

IndexError: too many indices for array #7

Closed liyiersan closed 3 years ago

liyiersan commented 4 years ago

Sir, sorry to bother you again. When I tried to run the code with the example data in the data directory, dataset.py failed at line 28, "img_t1 = img_t1[40:200, 20:200, :]". I printed the shape of img_t1, which is (240, 240). That line indexes a 3-D ndarray, but img_t1 is a 2-D ndarray. The HiNet paper says "For a 2D slice (240 × 240), we crop out an image of size 160 × 180 from the center region." What should I do to avoid this error: change the code or the data format? Sorry for my poor English and programming ability, and thanks again for your patient guidance.

liyiersan commented 4 years ago

Two more problems in HiNet_SynthModel.py during training:

Traceback (most recent call last):
  File "/home/alex/Desktop/medical_image/model/HiNet/main.py", line 35, in <module>
    fire.Fire()
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/fire/core.py", line 468, in _Fire
    target=component.__name__)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/fire/core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/alex/Desktop/medical_image/model/HiNet/main.py", line 22, in train
    SynModel.train()
  File "/home/alex/Desktop/medical_image/model/HiNet/HiNet_SynthModel.py", line 130, in train
    loss_GAN = criterion_GAN(self.discrimator(x_fake), valid) 
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 445, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/torch/nn/functional.py", line 2648, in mse_loss
    ret = torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I debugged and found that fake (line 105) and valid (line 106) are both on 'cpu', so I changed lines 105 and 106 as follows:

       fake  = torch.zeros([inputs[0].shape[1]*inputs[0].shape[0],1,6,6], requires_grad=False).cuda()
       valid = torch.ones([inputs[0].shape[1]*inputs[0].shape[0],1,6,6], requires_grad=False).cuda()

After that, the error went away.
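As a hedged alternative to calling .cuda() on hard-coded shapes, the targets can be built from the discriminator output itself, so that shape, dtype and device match automatically. The sketch below assumes a stand-in `disc` layer and random inputs; neither is taken from the HiNet code.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the discriminator; the real HiNet network differs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
disc = nn.Conv2d(1, 1, kernel_size=3).to(device)

# Hypothetical batch of synthesized images, created on the same device.
x_fake = torch.randn(4, 1, 8, 8, device=device)
pred = disc(x_fake)  # shape (4, 1, 6, 6)

# Build the targets from the prediction: same shape, dtype and device,
# and requires_grad is False by default.
valid = torch.ones_like(pred)
fake = torch.zeros_like(pred)

criterion_GAN = nn.MSELoss()
loss_GAN = criterion_GAN(pred, valid)
```

Written this way, the same lines run on CPU-only machines and on GPUs without edits.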

When I ran the code again, an error occurred while computing the gradient of loss_D at line 151. The traceback is:

Traceback (most recent call last):
  File "/home/alex/Desktop/medical_image/model/HiNet/main.py", line 33, in <module>
    fire.Fire()
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/fire/core.py", line 468, in _Fire
    target=component.__name__)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/fire/core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/alex/Desktop/medical_image/model/HiNet/main.py", line 20, in train
    SynModel.train()
  File "/home/alex/Desktop/medical_image/model/HiNet/HiNet_SynthModel.py", line 151, in train
    loss_D.backward(retain_graph=True)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/alex/anaconda3/envs/pytorch_gpu/lib/python3.6/site-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 32, 3, 3]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I solved this by changing line 147 as follows:

loss_fake = criterion_GAN(self.discrimator(x_fake.detach()), fake)
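A likely explanation, offered as an assumption rather than a confirmed diagnosis: the generator optimizer updates weights in place before loss_D.backward() runs, and without detach() that backward pass still walks through the generator graph. The sketch below uses hypothetical one-layer networks `G` and `D` (not the HiNet models) to show the usual ordering, and records the discriminator weights to check that it still trains when x_fake is detached.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Conv2d(1, 1, kernel_size=3, padding=1)  # hypothetical generator
D = nn.Conv2d(1, 1, kernel_size=3)             # hypothetical discriminator
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
criterion_GAN = nn.MSELoss()

x_in = torch.randn(2, 1, 8, 8)    # hypothetical source-modality batch
x_real = torch.randn(2, 1, 8, 8)  # hypothetical target-modality batch

# Generator step: gradients flow through D into G.
opt_G.zero_grad()
x_fake = G(x_in)
pred_fake = D(x_fake)
valid = torch.ones_like(pred_fake)
fake = torch.zeros_like(pred_fake)
loss_G = criterion_GAN(pred_fake, valid)
loss_G.backward()
opt_G.step()  # modifies G's weights in place

# Discriminator step: detach x_fake so backward stops at D's input.
d_before = D.weight.detach().clone()
opt_D.zero_grad()
loss_real = criterion_GAN(D(x_real), valid)
loss_fake = criterion_GAN(D(x_fake.detach()), fake)
loss_D = 0.5 * (loss_real + loss_fake)
loss_D.backward()
opt_D.step()
```

detach() only cuts the path from loss_D back into the generator. The discriminator's own parameters still receive gradients, which matches common GAN training loops.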

My environment is:

pytorch_gpu: 1.6.0
cuda: 10.2
cudnn: 7605
torchvision: 0.7.0
fire: 0.3.1
ubuntu: 16.04.7 LTS

What confuses me is why I ran into these two problems. Did I fix them correctly?

WenLi-o00o commented 4 years ago

I ran into these problems too.

taozh2017 commented 4 years ago

The original volume has a size of 240X240X155 (155 slices). Applying img_t1[40:200, 20:200, :] gives 160X180X155, and we then use each slice, which has a size of 160X180.
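The crop described here can be checked with a few lines of NumPy. This is a minimal sketch: `center_crop` is a hypothetical helper, not a function from the repository, and it uses an ellipsis so that the same indexing accepts both a 3-D volume and a 2-D slice such as the (240, 240) example data.

```python
import numpy as np

def center_crop(img):
    """Crop rows 40:200 and columns 20:200 of a 2-D slice or a 3-D volume."""
    # The trailing ellipsis covers zero or more remaining axes.
    return img[40:200, 20:200, ...]

volume = np.zeros((240, 240, 155), dtype=np.float32)  # one volume of 155 slices
slice_2d = np.zeros((240, 240), dtype=np.float32)     # one 2-D slice

cropped_volume = center_crop(volume)    # shape (160, 180, 155)
cropped_slice = center_crop(slice_2d)   # shape (160, 180)

# Split the cropped volume into its individual 160x180 slices.
slices = [cropped_volume[:, :, k] for k in range(cropped_volume.shape[2])]
```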

taozh2017 commented 4 years ago

I did not run into this issue when I ran the code previously. I will check it and re-run the code on my computer. Thanks.

liyiersan commented 4 years ago

Sorry, I do not understand. If the shape of img_t1 is 160X180X155, then in dataset.py, line 66, the code is:

     img_t1_patches = generate_all_2D_patches(img_t1)

and in utils.py, in the function generate_2D_patches(in_data), line 42, the code is:

     patches[count,:,:] = in_data[xx:xx+out_size[0],yy:yy+out_size[1]]

The shape of in_data (i.e. img_t1) is 160X180X155, so the shape of in_data[xx:xx+out_size[0],yy:yy+out_size[1]] is 128X128X155, but the shape of patches[count,:,:] is 128X128. This causes the error "could not broadcast input array from shape (128,128,155) into shape (128,128)". Should I change patches into a 4-D array?
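One way around the shape mismatch, sketched under assumptions, is to keep the patch function strictly 2-D and call it once per slice. `corner_patches` below is a hypothetical helper, and the four-corner layout is an assumption; the repository's own patch positions may differ.

```python
import numpy as np

def corner_patches(slice_2d, size=128):
    """Hypothetical helper: four overlapping corner patches of one 2-D slice."""
    h, w = slice_2d.shape
    starts = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size)]
    patches = np.zeros((len(starts), size, size), dtype=slice_2d.dtype)
    for count, (xx, yy) in enumerate(starts):
        # Both sides are (size, size), so the assignment broadcasts cleanly.
        patches[count, :, :] = slice_2d[xx:xx + size, yy:yy + size]
    return patches

volume = np.random.rand(160, 180, 155).astype(np.float32)  # cropped volume

# Call the 2-D function once per slice instead of passing the 3-D volume.
all_patches = np.stack(
    [corner_patches(volume[:, :, k]) for k in range(volume.shape[2])]
)  # shape (155, 4, 128, 128)
```

Keeping the patch array 3-D per slice avoids changing the downstream code, which expects 128X128 inputs.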

WenLi-o00o commented 4 years ago

I changed the code to work on 2D slices instead of 3D volumes, and then it runs, but I also hit this problem: "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 32, 3, 3]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True)."

I solved it with your method: loss_fake = criterion_GAN(self.discrimator(x_fake.detach()), fake)

Although the code now runs, it seems the discriminator does not work.

tantantan007 commented 5 months ago

Has this problem been solved? I ran into it too.

liyiersan commented 5 months ago

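@tantantan007 It has been a long time, so I do not remember the details. The rough idea of the data preprocessing is:
1. Convert the 3D data into 2D slices. The original BraTS data is 240X240X155, so after this step you get 155 2D slices per volume.
2. Crop each slice with [40:200, 20:200], so that the slice shape becomes 160X180.
3. Turn each cropped slice into four 128X128 patches. When loading data during training, read the cropped slice first and then convert it into four patches.
4. Data normalization is done over the whole volume, so this step should come before converting the volume into slices.

At inference time the processing is roughly the same, with a few points to note:
1. For visualization, first merge the four 128X128 patches back into the 160X180 cropped slice. The repository has a very nice implementation: accumulate the patches, keep a counter, and finally divide the sum by the counter.
2. The visualized result needs to be de-normalized. You need to set the min and max values in advance; different values affect the visualization considerably.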
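The steps above can be sketched end to end as follows. This is an illustration under assumptions, not the repository's implementation: the helpers `patch_starts`, `split` and `merge` are hypothetical, the four-corner patch layout is assumed, and min-max scaling stands in for whatever normalization the original code uses.

```python
import numpy as np

SIZE = 128  # patch edge length mentioned in the thread

def patch_starts(shape, size=SIZE):
    """Top-left corners of four overlapping corner patches (assumed layout)."""
    h, w = shape
    return [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size)]

def split(slice_2d):
    """Cut one cropped slice into four SIZE x SIZE patches."""
    return np.stack([slice_2d[x:x + SIZE, y:y + SIZE]
                     for x, y in patch_starts(slice_2d.shape)])

def merge(patches, shape):
    """Accumulate patches, count the overlaps, then divide sum by counter."""
    total = np.zeros(shape, dtype=np.float64)
    counter = np.zeros(shape, dtype=np.float64)
    for patch, (x, y) in zip(patches, patch_starts(shape)):
        total[x:x + SIZE, y:y + SIZE] += patch
        counter[x:x + SIZE, y:y + SIZE] += 1
    return total / counter

rng = np.random.default_rng(0)
volume = rng.uniform(0.0, 1000.0, size=(240, 240, 155))  # synthetic volume

# Normalize over the whole volume first (min-max scaling is an assumption).
v_min, v_max = volume.min(), volume.max()
normed = (volume - v_min) / (v_max - v_min)

cropped = normed[40:200, 20:200, :]        # 160 x 180 x 155
slice_2d = cropped[:, :, 77]               # one 160 x 180 slice
patches = split(slice_2d)                  # four 128 x 128 patches

restored = merge(patches, slice_2d.shape)  # back to 160 x 180
denormed = restored * (v_max - v_min) + v_min  # undo the normalization
```

Because the four corner patches cover every pixel of a 160X180 slice at least once, the counter is never zero and the division is safe.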

tantantan007 commented 5 months ago

Thank you very much!