py4dstem / py4DSTEM

GNU General Public License v3.0

Cupy or CUDA error #628

Closed ZOUCHEN158 closed 5 months ago

ZOUCHEN158 commented 5 months ago

Describe the bug: A clear and concise description of what the bug is.

To Reproduce: Steps to reproduce the behavior, please be as general as possible, and ideally recreate a minimal reproducible example.

Expected behavior: A clear and concise description of what you expected to happen.

py4DSTEM version: 0.14.9 (it can be accessed by running py4DSTEM.__version__)

Python version: 3.12

Operating system: Windows

GPU: If GPU related please provide:

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\py4DSTEM\process\phase\phase_base_class.py:1882, in PtychographicReconstruction._sum_overlapping_patches_bincounts_base(self, patches, positions_px)
   1878 flat_weights = patches.ravel()
   1879 indices = ((y0[:, None, None] + y_ind[None, None, :]) % object_shape[1]) + (
   1880     (x0[:, None, None] + x_ind[None, :, None]) % object_shape[0]
   1881 ) * object_shape[1]
-> 1882 counts = xp.bincount(
   1883     indices.ravel(), weights=flat_weights, minlength=np.prod(object_shape)
   1884 )
   1885 counts = xp.reshape(counts, object_shape).astype(xp.float32)
   1886 return counts

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\cupy\_statistics\histogram.py:507, in bincount(x, weights, minlength)
    505 if x.dtype.kind == 'f':
    506     raise TypeError('x must be int array')
--> 507 if (x < 0).any():  # synchronize!
    508     raise ValueError('The first argument of bincount must be non-negative')
    509 if weights is not None and x.shape != weights.shape:

File cupy\_core\core.pyx:1173, in cupy._core.core._ndarray_base.any()

File cupy\_core\core.pyx:1175, in cupy._core.core._ndarray_base.any()

File cupy\_core\_routines_logic.pyx:12, in cupy._core._routines_logic._ndarray_any()

File cupy\_core\_reduction.pyx:618, in cupy._core._reduction._SimpleReductionKernel.__call__()

File cupy\_core\_reduction.pyx:370, in cupy._core._reduction._AbstractReductionKernel._call()

File cupy\_core\_cub_reduction.pyx:689, in cupy._core._cub_reduction._try_to_call_cub_reduction()

File cupy\_core\_cub_reduction.pyx:526, in cupy._core._cub_reduction._launch_cub()

File cupy\_core\_cub_reduction.pyx:461, in cupy._core._cub_reduction._cub_two_pass_launch()

File cupy\_util.pyx:64, in cupy._util.memoize.decorator.ret()

File cupy\_core\_cub_reduction.pyx:240, in cupy._core._cub_reduction._SimpleCubReductionKernel_get_cached_function()

File cupy\_core\_cub_reduction.pyx:223, in cupy._core._cub_reduction._create_cub_reduction_function()

File cupy\_core\core.pyx:2254, in cupy._core.core.compile_with_cache()

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\cupy\cuda\compiler.py:484, in _compile_module_with_cache(source, options, arch, cache_dir, extra_source, backend, enable_cooperative_groups, name_expressions, log_stream, jitify)
    480     return _compile_with_cache_hip(
    481         source, options, arch, cache_dir, extra_source, backend,
    482         name_expressions, log_stream, cache_in_memory)
    483 else:
--> 484     return _compile_with_cache_cuda(
    485         source, options, arch, cache_dir, extra_source, backend,
    486         enable_cooperative_groups, name_expressions, log_stream,
    487         cache_in_memory, jitify)

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\cupy\cuda\compiler.py:562, in _compile_with_cache_cuda(source, options, arch, cache_dir, extra_source, backend, enable_cooperative_groups, name_expressions, log_stream, cache_in_memory, jitify)
    560 if backend == 'nvrtc':
    561     cu_name = '' if cache_in_memory else name + '.cu'
--> 562     ptx, mapping = compile_using_nvrtc(
    563         source, options, arch, cu_name, name_expressions,
    564         log_stream, cache_in_memory, jitify)
    565     if _is_cudadevrt_needed(options):
    566         # for separate compilation
    567         ls = function.LinkState()

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\cupy\cuda\compiler.py:319, in compile_using_nvrtc(source, options, arch, filename, name_expressions, log_stream, cache_in_memory, jitify)
    316     with open(cu_path, 'w') as cu_file:
    317         cu_file.write(source)
--> 319     return _compile(source, options, cu_path,
    320                     name_expressions, log_stream, jitify)
    321 else:
    322     cu_path = '' if not jitify else filename

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\cupy\cuda\compiler.py:284, in compile_using_nvrtc.<locals>._compile(source, options, cu_path, name_expressions, log_stream, jitify)
    280 def _compile(
    281         source, options, cu_path, name_expressions, log_stream, jitify):
    283     if jitify:
--> 284         options, headers, include_names = _jitify_prep(
    285             source, options, cu_path)
    286     else:
    287         headers = include_names = ()

File C:\ProgramData\anaconda3\envs\py4dstem1\Lib\site-packages\cupy\cuda\compiler.py:233, in _jitify_prep(source, options, cu_path)
    231 if not _jitify_header_source_map_populated:
    232     from cupy._core import core
--> 233     jitify._init_module()
    234     jitify._add_sources(core._get_header_source_map())
    235     _jitify_header_source_map_populated = True

File cupy\cuda\jitify.pyx:212, in cupy.cuda.jitify._init_module()

File cupy\cuda\jitify.pyx:233, in cupy.cuda.jitify._init_module()

File cupy\cuda\jitify.pyx:209, in cupy.cuda.jitify._init_cupy_headers()

File cupy\cuda\jitify.pyx:192, in cupy.cuda.jitify._init_cupy_headers_from_scratch()

File cupy\cuda\jitify.pyx:264, in cupy.cuda.jitify.jitify()

RuntimeError: Runtime compilation failed here.
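For readers following the traceback: the first frame sums overlapping patches into a flat object array with bincount. The snippet below is a rough NumPy-only sketch of that indexing pattern, not the py4DSTEM implementation. The function name `sum_overlapping_patches` and the use of `np.arange` for `x_ind` and `y_ind` are assumptions made for illustration; only the index arithmetic mirrors the lines shown in the traceback.

```python
import numpy as np


def sum_overlapping_patches(patches, x0, y0, object_shape):
    """Accumulate (n, roi_x, roi_y) patches into an array of object_shape.

    Hypothetical helper: x0 and y0 are the integer corner positions of each
    patch, and indices wrap around the object edges (periodic boundaries).
    """
    _, roi_x, roi_y = patches.shape
    x_ind = np.arange(roi_x)  # assumed row offsets inside a patch
    y_ind = np.arange(roi_y)  # assumed column offsets inside a patch

    # Same arithmetic as in the traceback: wrapped row and column indices
    # are combined into a single flat (row-major) index per patch pixel.
    rows = (x0[:, None, None] + x_ind[None, :, None]) % object_shape[0]
    cols = (y0[:, None, None] + y_ind[None, None, :]) % object_shape[1]
    indices = rows * object_shape[1] + cols

    # bincount adds up every weight that lands on the same flat index.
    counts = np.bincount(
        indices.ravel(),
        weights=patches.ravel(),
        minlength=int(np.prod(object_shape)),
    )
    return counts.reshape(object_shape).astype(np.float32)


# Two 2x2 patches of ones that overlap at object pixel (1, 1).
patches = np.ones((2, 2, 2), dtype=np.float32)
x0 = np.array([0, 1])
y0 = np.array([0, 1])
result = sum_overlapping_patches(patches, x0, y0, (3, 3))
print(result)
```

If this runs on the CPU but the cupy version of the same call fails, the problem is more likely in the GPU toolchain than in the indexing logic.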

gvarnavi commented 5 months ago

Thanks for the bug report @ZOUCHEN158. Odd, this seems to be the offending line:

if (x < 0).any(): # synchronize!

but notably it seems to crash during the evaluation of the if-statement 🤔
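Reading the rest of the traceback, the `.any()` call goes on to compile a reduction kernel and ends inside cupy's runtime compilation step, so a quick check that does not involve py4DSTEM at all might help narrow things down. The snippet below is only a sketch: `reduction_smoke_test` is a made-up helper name, and it falls back to NumPy when cupy cannot be imported or fails, printing the error so it is not hidden.

```python
import numpy as np


def reduction_smoke_test(xp):
    """Run the two operations seen in the traceback on a tiny array."""
    x = xp.arange(10)
    # On cupy, this comparison plus reduction needs a compiled kernel.
    has_negative = bool((x < 0).any())
    # Values 0..9 modulo 3 fall into three bins.
    counts = xp.bincount(x % 3).tolist()
    return has_negative, counts


try:
    import cupy as cp

    result = reduction_smoke_test(cp)
    backend = "cupy"
except Exception as exc:  # cupy missing, no GPU, or kernel compilation failed
    print(f"cupy path failed: {type(exc).__name__}: {exc}")
    result = reduction_smoke_test(np)
    backend = "numpy"

print(backend, result)
```

If the cupy path fails here with the same RuntimeError, the issue would seem to sit in the cupy/CUDA installation rather than in the ptychography code.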

I suspect you might be our first user to try ptycho with cupy 13.0.0, so bear with us here while we track this down. The 13.0.0 release notes do mention changes in blocking/asynchronous behavior, so I wonder if that's the origin.

Two quick questions for you while I find time to check this:

sezelt commented 5 months ago

Another thing to check is what version of the NVIDIA driver you have, and whether or not (a) the driver, (b) the CUDA toolkit, and (c) the cupy version are all compatible with one another. Especially if you installed cupy with pip, you can end up with incompatible versions that cause odd errors like this.
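One way to gather those versions in a single place is sketched below. The helper names `format_cuda_version` and `collect_gpu_environment` are made up for this example; `cupy.cuda.runtime.runtimeGetVersion()` and `cupy.cuda.runtime.driverGetVersion()` are existing cupy calls, and the decoding assumes the usual CUDA integer encoding (for example 12020 for 12.2), which may not hold on non-CUDA builds.

```python
import platform


def format_cuda_version(version):
    """Decode CUDA's integer version (1000 * major + 10 * minor)."""
    major, rest = divmod(int(version), 1000)
    return f"{major}.{rest // 10}"


def collect_gpu_environment():
    """Collect Python, cupy, and CUDA version info without raising."""
    info = {
        "python": platform.python_version(),
        "cupy": None,
        "cuda_runtime": None,
        "cuda_driver": None,
    }
    try:
        import cupy

        info["cupy"] = cupy.__version__
        # CUDA runtime version that cupy is using.
        info["cuda_runtime"] = format_cuda_version(
            cupy.cuda.runtime.runtimeGetVersion()
        )
        # Highest CUDA version supported by the installed NVIDIA driver.
        info["cuda_driver"] = format_cuda_version(
            cupy.cuda.runtime.driverGetVersion()
        )
    except Exception as exc:  # cupy not installed or CUDA libraries not found
        info["error"] = f"{type(exc).__name__}: {exc}"
    return info


report = collect_gpu_environment()
for key, value in report.items():
    print(f"{key}: {value}")
```

As a rule of thumb, the driver-supported CUDA version should be at least as new as the runtime version cupy reports; `cupy.show_config()` prints a fuller summary if more detail is needed.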

ZOUCHEN158 commented 5 months ago

I'm sorry, this turned out to be an awkward question. I tried to set up GPU acceleration with py4DSTEM 0.14.9, but it was based on Python 3.10, which caused some problems when using py4DSTEM. I then went to adjust cupy and CUDA, which led to the errors I reported above. After I lowered the py4DSTEM version, these problems were resolved. Thanks for your patience.