matteo-ronchetti / torch-radon

Computational Tomography in PyTorch
https://torch-radon.readthedocs.io
GNU General Public License v3.0

CUDA error at cudaMemcpy3D(&myparms) #47

Open Masaaki-75 opened 1 year ago

Masaaki-75 commented 1 year ago

I was using the v2 branch of torch-radon, installed with the command given in Issue #23: wget -qO- https://raw.githubusercontent.com/matteo-ronchetti/torch-radon/v2/auto_install.py | python -

But when I ran some test code, I got the following error:

CUDA error at cudaMemcpy3D(&myparms) (src/texture.cu:167) error code: 1, error string: invalid argument

The GPU was idle, so this should not be an out-of-memory problem. My test code is:

import os

import numpy as np
import torch
from torch_radon.radon import FanBeam

if __name__ == '__main__':
    # Select the GPU before the first CUDA call so the setting takes effect.
    os.environ['CUDA_VISIBLE_DEVICES'] = '4'
    print(torch.cuda.device_count())
    print(torch.cuda.is_available())

    # Random 256x256 test image, moved to the GPU as a (1, 1, 256, 256) float tensor.
    img = np.random.randn(256, 256)
    angles = np.linspace(0, 2 * np.pi, 360)
    img = torch.from_numpy(img).to("cuda").float().reshape(1, 1, 256, 256)
    print(img.shape, img.min(), img.max())

    # Note: det_count (768) differs from the image size (256).
    tool = FanBeam(det_count=768, angles=angles, src_dist=1075)
    print('FanBeam tool initialized')
    print(tool.forward(img, angles))

The full output is:

1
True
torch.Size([1, 1, 256, 256]) tensor(-4.5295, device='cuda:0') tensor(4.8204, device='cuda:0')
FanBeam tool initialized
CUDA error at cudaMemcpy3D(&myparms) (src/texture.cu:167) error code: 1, error string: invalid argument

Any advice on solving this issue? Thanks!

Masaaki-75 commented 1 year ago

btw here's info on my environment:

OS: Linux version 3.10.0-957.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) )

CPU: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
GPU: Tesla V100 SXM2 32GB
Python: 3.7.16
Pytorch: 1.7.1
nvcc-V: Cuda compilation tools, release 9.0, V9.0.176
CUDA: 11.6
Driver version: 510.47.03

Masaaki-75 commented 1 year ago

Update:

I solved this problem!

It seems that in the v2 branch det_count has to match the image size, unlike v1, where the two are separate arguments.

The following code raises no error:

import os

import numpy as np
import torch
from torch_radon.radon import FanBeam

if __name__ == '__main__':
    # Select the GPU before the first CUDA call so the setting takes effect.
    os.environ['CUDA_VISIBLE_DEVICES'] = '4'
    device = torch.device('cuda')
    print(torch.cuda.device_count())
    print(torch.cuda.is_available())
    print(device)

    # Random 256x256 test image, moved to the GPU as a (1, 1, 256, 256) float tensor.
    img = np.random.randn(256, 256)
    angles = np.linspace(0, 2 * np.pi, 360)
    img = torch.from_numpy(img).to("cuda").float().reshape(1, 1, 256, 256)
    print(img.shape, img.min(), img.max())

    # det_count now equals the image size (256).
    radon = FanBeam(det_count=256, angles=angles, src_dist=1075)
    print('FanBeam tool initialized')
    with torch.no_grad():
        sinogram = radon.forward(img)
        filtered_sinogram = radon.filter_sinogram(sinogram)
        fbp = radon.backprojection(filtered_sinogram)
    print(sinogram.shape, filtered_sinogram.shape, fbp.shape)

The output is:

1
True
cuda
torch.Size([1, 1, 256, 256]) tensor(-4.0462, device='cuda:0') tensor(4.5243, device='cuda:0')
FanBeam tool initialized
torch.Size([1, 1, 360, 256]) torch.Size([1, 1, 360, 256]) torch.Size([1, 1, 256, 256])
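
The shapes printed above suggest a simple layout, sketched below in plain Python: the sinogram appears to be (batch, channel, number of angles, det_count), and the backprojection returns to the input image size. This is inferred only from the single output above, not from the torch-radon documentation, and the helper name expected_shapes is hypothetical.

```python
def expected_shapes(img_shape, n_angles, det_count):
    """Guess the sinogram and reconstruction shapes for a 4D image tensor.

    Assumption (inferred from the output in this thread only):
    sinogram       -> (batch, channel, n_angles, det_count)
    reconstruction -> same shape as the input image
    """
    batch, channels, height, width = img_shape
    sinogram_shape = (batch, channels, n_angles, det_count)
    reconstruction_shape = (batch, channels, height, width)
    return sinogram_shape, reconstruction_shape


sino, recon = expected_shapes((1, 1, 256, 256), n_angles=360, det_count=256)
print(sino, recon)
```

Comparing these tuples with the real tensor shapes is a cheap sanity check before plotting or saving results.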

Masaaki-75 commented 1 year ago

Update 2:

The v2 branch also supports an img_size that differs from det_count, but they are specified differently. In v2, the image size is set through the Volume2D class, which is better because it also supports non-square input images!

For those who need this, as I did (I think this is common, since the backprojected image shows a circular artifact when det_count <= img_size), here is an example:

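import os

import matplotlib.pyplot as plt
import numpy as np
import torch
from torch_radon import RadonFanbeam, FanBeam
# Importing RadonFanbeam in the v2 branch raises a deprecation warning
from torch_radon.volumes import Volume2D  # assumed location of Volume2D in the v2 branch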

img = np.load('phantom.npy')
img_shape = img.shape[-2:]  # (512, 512)
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
device = torch.device('cuda')
print(torch.cuda.device_count())
print(torch.cuda.is_available())
print(device)
angles = np.linspace(0, 2 * np.pi, 360)
img = torch.from_numpy(img).to("cuda").float().reshape(1, 1, *img_shape)
print(img.shape, img.min(), img.max())

# Here's how we do it in old version
radon = RadonFanbeam(img_shape[0], angles=angles, source_distance=1075, det_count=768)
with torch.no_grad():
    sinogram = radon.forward(img)
    filtered_sinogram = radon.filter_sinogram(sinogram)
    fbp = radon.backprojection(filtered_sinogram)
print('fan beam v1 det_count>img_size: ', sinogram.shape, filtered_sinogram.shape, fbp.shape)
plt.imshow(fbp.squeeze().detach().cpu(), cmap='gray')
plt.savefig('fan_v1_large.png')

# Here's how we do it in new version
volume = Volume2D(height=img_shape[0], width=img_shape[1])
print(volume)
radon = FanBeam(768, angles=angles, src_dist=1075, volume=volume)
with torch.no_grad():
    sinogram = radon.forward(img)
    filtered_sinogram = radon.filter_sinogram(sinogram)
    fbp = radon.backprojection(filtered_sinogram)
print('fan beam v2 det_count>img_size: ', sinogram.shape, filtered_sinogram.shape, fbp.shape)
plt.imshow(fbp.squeeze().detach().cpu(), cmap='gray')
plt.savefig('fan_v2_large_volume.png')

# The following will produce an image with artifact
radon = FanBeam(img_shape[0], angles=angles, src_dist=1075)
with torch.no_grad():
    sinogram = radon.forward(img)
    filtered_sinogram = radon.filter_sinogram(sinogram)
    fbp = radon.backprojection(filtered_sinogram)
print('fan beam v2: ', sinogram.shape, filtered_sinogram.shape, fbp.shape)
plt.imshow(fbp.squeeze().detach().cpu(), cmap='gray')
plt.savefig('fan_v2.png')

# The following will raise CUDA error:
#radon = FanBeam(768, angles=angles, src_dist=1075)
#with torch.no_grad():
#    sinogram = radon.forward(img)
#    filtered_sinogram = radon.filter_sinogram(sinogram)
#    fbp = radon.backprojection(filtered_sinogram)
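
The four cases above can be condensed into a pre-flight check that fails with a readable message instead of a CUDA error. This is a minimal sketch based only on the behaviour reported in this thread; the function name check_fanbeam_geometry is hypothetical, and the rules it encodes (a square image with det_count equal to its side when no volume is given, and a volume matching the image shape otherwise) are assumptions, not documented torch-radon constraints.

```python
def check_fanbeam_geometry(img_shape, det_count, volume_shape=None):
    """Raise ValueError for combinations reported to fail in this thread.

    img_shape    -- (height, width) of the input image
    det_count    -- number of detector pixels passed to FanBeam
    volume_shape -- (height, width) passed to Volume2D, or None if no volume is given
    """
    height, width = img_shape
    if volume_shape is not None:
        # Assumption: an explicit volume should match the image being projected.
        if tuple(volume_shape) != (height, width):
            raise ValueError(
                f"volume {tuple(volume_shape)} does not match image {(height, width)}"
            )
        return
    # Assumption: without a volume, only det_count == image side (square image) works.
    if height != width or det_count != height:
        raise ValueError(
            f"det_count={det_count} with image {(height, width)} and no volume; "
            "pass a Volume2D with the image size instead"
        )


check_fanbeam_geometry((512, 512), 512)                            # reported to work
check_fanbeam_geometry((512, 512), 768, volume_shape=(512, 512))   # reported to work
```

Calling it with img_shape=(512, 512) and det_count=768 and no volume raises ValueError, which mirrors the commented-out case above.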

fan_v1_large.png (det_count=768, img_size=512): [image: fan_v1_large]

fan_v2_large_volume.png (det_count=768, img_size=512): [image: fan_v2_large_volume]

fan_v2.png (det_count=512, img_size=512): [image: fan_v2]
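
One plausible reading of the circular artifact is that a detector only as wide as the image cannot see the image corners at every angle. A rough lower bound on det_count is the image diagonal in pixels. The sketch below assumes unit detector spacing and ignores fan-beam magnification; both are simplifications, since the thread does not state torch-radon's defaults, and the name min_det_count is hypothetical.

```python
import math


def min_det_count(height, width):
    """Smallest detector count that spans the image diagonal.

    Assumes unit detector spacing and parallel-beam-like coverage,
    so treat the result as a rough guide for fan-beam setups.
    """
    return math.ceil(math.hypot(height, width))


print(min_det_count(512, 512))  # 725
```

For the 512 x 512 phantom this gives 725, so det_count=768 clears the bound while det_count=512 does not, which is consistent with the three images above.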

henrytanbo commented 1 month ago

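Hello, I have also been installing torch-radon recently. I installed the v2 version successfully with Python 3.9 and torch 1.13. My GPU is an RTX 4070 Ti SUPER 16G. When calling FanBeam I get the error below. Is it similar to the error you had earlier? [image: error]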

liyifan2002 commented 1 week ago

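Hello, I have also been installing torch-radon recently. I installed the v2 version successfully with Python 3.9 and torch 1.13. My GPU is an RTX 4070 Ti SUPER 16G. When calling FanBeam I get the error below. Is it similar to the error you had earlier? [image: error]

I have the same question. Have you solved it?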