matteo-ronchetti / torch-radon

Computational Tomography in PyTorch
https://torch-radon.readthedocs.io
GNU General Public License v3.0

CUDA error at cudaMemcpy3D(&myparms) #47

Closed Masaaki-75 closed 2 weeks ago

Masaaki-75 commented 1 year ago

I was using the torch-radon v2 branch (installed with wget -qO- https://raw.githubusercontent.com/matteo-ronchetti/torch-radon/v2/auto_install.py | python -, as provided in Issue #23).

But when I tried to run some test code, I got the following error:

CUDA error at cudaMemcpy3D(&myparms) (src/texture.cu:167) error code: 1, error string: invalid argument

The GPU was not occupied (so it shouldn't be an OOM problem?), and my test code is as follows:

import torch
import numpy as np
from torch_radon.radon import FanBeam

if __name__ == '__main__':
    import os
    img = np.random.randn(256, 256)

    os.environ['CUDA_VISIBLE_DEVICES'] = '4'
    print(torch.cuda.device_count())
    print(torch.cuda.is_available())
    angles = np.linspace(0, 2 * np.pi, 360)
    img = torch.from_numpy(img).to("cuda").float().reshape(1, 1, 256, 256)
    print(img.shape, img.min(), img.max())

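    # Note (as figured out later in this thread): det_count=768 does not match the
    # 256x256 image and no volume is specified, which is what triggers the error below.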
    tool = FanBeam(det_count=768, angles=angles, src_dist=1075)
    print('FanBeam tool initialized')
    print(tool.forward(img, angles))

The full error messages are:

1
True
torch.Size([1, 1, 256, 256]) tensor(-4.5295, device='cuda:0') tensor(4.8204, device='cuda:0')
FanBeam tool initialized
CUDA error at cudaMemcpy3D(&myparms) (src/texture.cu:167) error code: 1, error string: invalid argument

Any advice on solving this issue? Thanks!

Masaaki-75 commented 1 year ago

By the way, here's some info on my environment:

OS: Linux version 3.10.0-957.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) )

CPU: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
GPU: Tesla V100 SXM2 32GB
Python: 3.7.16
PyTorch: 1.7.1
nvcc -V: Cuda compilation tools, release 9.0, V9.0.176
CUDA: 11.6
Driver version: 510.47.03

Masaaki-75 commented 1 year ago

Update:

I solved this problem!

It seems that in the v2 branch, det_count should match img_size, which is different from the v1 version, where they are two separate arguments.

The following code raises no error:

import torch
import numpy as np
from torch_radon.radon import FanBeam

if __name__ == '__main__':
    import os

    img = np.random.randn(256, 256)
    os.environ['CUDA_VISIBLE_DEVICES'] = '4'
    device = torch.device('cuda')
    print(torch.cuda.device_count())
    print(torch.cuda.is_available())
    print(device)
    angles = np.linspace(0, 2 * np.pi, 360)
    img = torch.from_numpy(img).to("cuda").float().reshape(1, 1, 256, 256)
    print(img.shape, img.min(), img.max())

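    # det_count now matches the 256x256 image size, so the forward projection
    # runs without the CUDA error above.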
    radon = FanBeam(det_count=256, angles=angles, src_dist=1075)
    print('FanBeam tool initialized')
    with torch.no_grad():
        sinogram = radon.forward(img)
        filtered_sinogram = radon.filter_sinogram(sinogram)
        fbp = radon.backprojection(filtered_sinogram)
    print(sinogram.shape, filtered_sinogram.shape, fbp.shape)

The stdout message looks like:

1
True
cuda
torch.Size([1, 1, 256, 256]) tensor(-4.0462, device='cuda:0') tensor(4.5243, device='cuda:0')
FanBeam tool initialized
torch.Size([1, 1, 360, 256]) torch.Size([1, 1, 360, 256]) torch.Size([1, 1, 256, 256])

Masaaki-75 commented 1 year ago

Update 2:

The v2 branch also supports different img_size and det_count, but the way to specify them is different. In the v2 branch, we specify the img_size using the Volume2D class, which is better because it supports non-square input images!

For those who, like me, need this (I think this is often the case, since there will be a rounded (circular) artifact around the backprojected image if det_count <= img_size), here's an example:

import os
import numpy as np
import torch
import matplotlib.pyplot as plt
from torch_radon import RadonFanbeam, FanBeam, Volume2D
# Importing RadonFanbeam in the v2 branch raises a deprecation warning

img = np.load('phantom.npy')
img_shape = img.shape[-2:]  # (512, 512)
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
device = torch.device('cuda')
print(torch.cuda.device_count())
print(torch.cuda.is_available())
print(device)
angles = np.linspace(0, 2 * np.pi, 360)
img = torch.from_numpy(img).to("cuda").float().reshape(1, 1, *img_shape)
print(img.shape, img.min(), img.max())

# Here's how we do it in old version
radon = RadonFanbeam(img_shape[0], angles=angles, source_distance=1075, det_count=768)
with torch.no_grad():
    sinogram = radon.forward(img)
    filtered_sinogram = radon.filter_sinogram(sinogram)
    fbp = radon.backprojection(filtered_sinogram)
print('fan beam v1 det_count>img_size: ', sinogram.shape, filtered_sinogram.shape, fbp.shape)
plt.imshow(fbp.squeeze().detach().cpu(), cmap='gray')
plt.savefig('fan_v1_large.png')

# Here's how we do it in new version
volume = Volume2D(height=img_shape[0], width=img_shape[1])
print(volume)
radon = FanBeam(768, angles=angles, src_dist=1075, volume=volume)
with torch.no_grad():
    sinogram = radon.forward(img)
    filtered_sinogram = radon.filter_sinogram(sinogram)
    fbp = radon.backprojection(filtered_sinogram)
print('fan beam v2 det_count>img_size: ', sinogram.shape, filtered_sinogram.shape, fbp.shape)
plt.imshow(fbp.squeeze().detach().cpu(), cmap='gray')
plt.savefig('fan_v2_large_volume.png')

# The following will produce an image with artifact
radon = FanBeam(img_shape[0], angles=angles, src_dist=1075)
with torch.no_grad():
    sinogram = radon.forward(img)
    filtered_sinogram = radon.filter_sinogram(sinogram)
    fbp = radon.backprojection(filtered_sinogram)
print('fan beam v2: ', sinogram.shape, filtered_sinogram.shape, fbp.shape)
plt.imshow(fbp.squeeze().detach().cpu(), cmap='gray')
plt.savefig('fan_v2.png')

# The following will raise CUDA error:
#radon = FanBeam(768, angles=angles, src_dist=1075)
#with torch.no_grad():
#    sinogram = radon.forward(img)
#    filtered_sinogram = radon.filter_sinogram(sinogram)
#    fbp = radon.backprojection(filtered_sinogram)

fan_v1_large.png (det_count=768, img_size=512): [image]

fan_v2_large_volume.png (det_count=768, img_size=512): [image]

fan_v2.png (det_count=512, img_size=512): [image]

henrytanbo commented 2 months ago

Hello, I have also been installing torch-radon recently. I managed to install the v2 branch with Python 3.9 and torch 1.13, and my GPU is an RTX 4070 Ti SUPER 16G. When calling FanBeam I got the error below; is it similar to the error you described above? [error screenshot]

liyifan2002 commented 4 weeks ago

Hello, I have also been installing torch-radon recently. I managed to install the v2 branch with Python 3.9 and torch 1.13, and my GPU is an RTX 4070 Ti SUPER 16G. When calling FanBeam I got the error below; is it similar to the error you described above? [error screenshot]

Same question here. Have you solved it?

12345678901234567800001882277777 commented 2 weeks ago

Hello, I have also been installing torch-radon recently. I managed to install the v2 branch with Python 3.9 and torch 1.13, and my GPU is an RTX 4070 Ti SUPER 16G. When calling FanBeam I got the error below; is it similar to the error you described above? [error screenshot]

Same question here. Have you solved it?

Same question here, have you solved it? I am also facing this problem and would like to seek some advice.

Masaaki-75 commented 2 weeks ago

Hi guys. From my experience, this can be attributed to various reasons:

Since matteo-ronchetti is no longer working on this project, I recommend using the more stable v2 version maintained by @carterbox (he also opened an issue at https://github.com/matteo-ronchetti/torch-radon/issues/57). You can find a simple installation walkthrough at https://github.com/Masaaki-75/proct/blob/main/inst_tr.sh#L38

And here's how I use this new torch-radon package to explore different features: https://github.com/Masaaki-75/proct/blob/main/wrappers/basic_wrapper_v2.py
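
In case that link changes, the gist of the wrapper is roughly the following. This is a simplified sketch of my own (the class name FanBeamWrapper and the fp/bp/fbp method names are illustrative, not the actual code in the file above), built only from the FanBeam / Volume2D calls already shown earlier in this thread:

import numpy as np
from torch_radon import FanBeam, Volume2D

class FanBeamWrapper:
    """Thin convenience wrapper around the v2-style fan-beam API."""

    def __init__(self, img_size=(256, 256), det_count=768, num_angles=360, src_dist=1075):
        angles = np.linspace(0, 2 * np.pi, num_angles)
        # In the v2 branch the image size is given through Volume2D (non-square images are fine).
        volume = Volume2D(height=img_size[0], width=img_size[1])
        self.radon = FanBeam(det_count=det_count, angles=angles,
                             src_dist=src_dist, volume=volume)

    def fp(self, img):
        # Forward projection: (B, 1, H, W) image -> (B, 1, num_angles, det_count) sinogram.
        return self.radon.forward(img)

    def bp(self, sino):
        # Plain (unfiltered) backprojection.
        return self.radon.backprojection(sino)

    def fbp(self, sino):
        # Filtered backprojection: ramp-filter the sinogram, then backproject.
        return self.radon.backprojection(self.radon.filter_sinogram(sino))

Typical usage would then be tool = FanBeamWrapper(img_size=(256, 256)); sino = tool.fp(img); rec = tool.fbp(sino), with img a float32 CUDA tensor of shape (1, 1, 256, 256).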

12345678901234567800001882277777 commented 2 weeks ago

Thank you for your answer. When I installed the torch-radon package, since my server is a 4090, I downloaded it on another 3090 and then copied it to my 4090 for use. Could this have any impact? My test code is as follows: [screenshot] The relevant parameters and errors are as follows: [screenshot] I've been troubled by this issue for a long time, and I'm still a beginner, so I appreciate your patience.

Masaaki-75 commented 2 weeks ago

I downloaded it on another 3090 and then copied it to my 4090 for use

This is a bit confusing. How did you set up the virtual environment? Did you download the package and re-compile it on the 4090 machine? To my knowledge, simply copying the whole virtual environment in Anaconda would not work.

As for your error, could you share more information on how you implement your fp function? Maybe we need to check whether the image size and the detector arguments are properly set.
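
For example, a quick self-contained check like the one below usually reveals a geometry/image-size mismatch right away. The parameter values here are only illustrative; substitute the det_count, src_dist, image size, and angles that your fp function actually uses:

import numpy as np
import torch
from torch_radon import FanBeam, Volume2D

# Illustrative values; replace them with the ones your fp function actually uses.
H, W, det_count = 256, 256, 768
angles = np.linspace(0, 2 * np.pi, 360)

# In the v2 branch the image size goes through Volume2D; det_count alone does not define it.
radon = FanBeam(det_count=det_count, angles=angles, src_dist=1075,
                volume=Volume2D(height=H, width=W))

img = torch.randn(1, 1, H, W, device='cuda')
print(img.shape, img.dtype, img.device)  # expect (B, 1, H, W), torch.float32, a cuda device

with torch.no_grad():
    sino = radon.forward(img)
print(sino.shape)  # expect (1, 1, 360, det_count)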

12345678901234567800001882277777 commented 2 weeks ago

This is a bit confusing. How did you set up the virtual environment? Did you download the package and re-compile it on the 4090 machine? To my knowledge, simply copying the whole virtual environment in Anaconda would not work.

No, I just copied the installed torch-radon into my virtual environment and tested it with the following code: [screenshots]

As for your error, could you share more information on how you implement your fp function? Maybe we need to check whether the image size and the detector arguments are properly set.

fp in my code refers to my forward projection function: [screenshot]

Masaaki-75 commented 2 weeks ago

  1. I am still not sure how you installed torch-radon. Let me be specific: did you download the repo from matteo to your workspace and then build it from source (that is, cd into the repo and run python setup.py install)? A direct copy may not work, and a simple import may not reveal the underlying error.

I suggest testing with more detailed code (that is, not only testing the import but also running some forward/backward demo; see the minimal smoke test sketched at the end of this comment). For example, see: here or here. You may need some modifications to run them, since the links above point to another stable v2 version (which is also the version I am using) that differs from matteo's source code.

  2. By "implement" I mean the arguments related to the forward/backward projection process (det_count, img_size, etc., defined in radon). It would be better if you could print the radon object out, as well as the actual image shape (resize1(Xgt).shape), since the current code you show does not provide much information.
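
The smoke test I have in mind is something like this. It is only a sketch assuming the v2-style FanBeam API used earlier in this thread, with det_count equal to the image size so that no volume argument is needed; adjust the geometry to your own case:

import numpy as np
import torch
from torch_radon import FanBeam

angles = np.linspace(0, 2 * np.pi, 360)
radon = FanBeam(det_count=256, angles=angles, src_dist=1075)

img = torch.randn(1, 1, 256, 256, device='cuda')
with torch.no_grad():
    sino = radon.forward(img)
    rec = radon.backprojection(radon.filter_sinogram(sino))

print('forward ok:', sino.shape)  # expect (1, 1, 360, 256)
print('fbp ok:', rec.shape)       # expect (1, 1, 256, 256)

If the import works but this fails with a CUDA error on the 4090, rebuilding the package from source on that machine (rather than copying it over from the 3090) is the first thing I would try.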

12345678901234567800001882277777 commented 2 weeks ago

  1. I am still not sure how you installed torch-radon. Let me be specific: did you download the repo from matteo to your workspace and then build it from source (that is, cd into the repo and run python setup.py install)? A direct copy may not work, and a simple import may not reveal the underlying error.

I suggest testing with more detailed code (that is, not only testing the import but also running some forward/backward demo; see the minimal smoke test sketched at the end of this comment). For example, see: here or here. You may need some modifications to run them, since the links above point to another stable v2 version (which is also the version I am using) that differs from matteo's source code.

  2. By "implement" I mean the arguments related to the forward/backward projection process (det_count, img_size, etc., defined in radon). It would be better if you could print the radon object out, as well as the actual image shape (resize1(Xgt).shape), since the current code you show does not provide much information.

Okay, thank you very much for your guidance. Additionally, I would like to ask whether the v2 version supports the RTX 4090.

Masaaki-75 commented 2 weeks ago

For carterbox's reimplementation, the answer is yes. I successfully installed it by building from source on both an RTX 3090 and an RTX 4090 (CUDA >= 11.5). You can also use conda to install the precompiled release if your server supports a higher CUDA version.

For matteo's original version, I haven't explored it much due to unclear and tricky errors.

12345678901234567800001882277777 commented 2 weeks ago

For carterbox's reimplementation, the answer is yes. I successfully installed it by building from source on both an RTX 3090 and an RTX 4090 (CUDA >= 11.5). You can also use conda to install the precompiled release if your server supports a higher CUDA version.

For matteo's original version, I haven't explored it much due to unclear and tricky errors.

Thank you very much. I was able to run the v2 version of torch-radon successfully; thanks again for your patience.