JuliaGPU / CUDA.jl

CUDA programming in Julia.
https://juliagpu.org/cuda/

`copyto!` between a PermutedDimsArray view and a CuArray doesn't work #1697

Closed albertomercurio closed 1 year ago

albertomercurio commented 1 year ago

To reproduce

The following code works:

```julia
using LinearAlgebra
using CUDA
using CUDA.CUSPARSE
CUDA.allowscalar(false)

A = permutedims(CUDA.rand(ComplexF64, 100, 100), [2, 1])
B = 0 .* similar(A, 80, 80)
copyto!(B, view(A, 1:80, 1:80))
```

But this does not:

```julia
A = PermutedDimsArray(CUDA.rand(ComplexF64, 100, 100), [2, 1])
B = 0 .* similar(A, 80, 80)
copyto!(B, view(A, 1:80, 1:80))
```

Version info

Details on Julia:

Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 6 on 12 virtual cores
Environment:
  LD_LIBRARY_PATH = /opt/intel/oneapi/vpl/2022.2.0/lib:/opt/intel/oneapi/tbb/2021.7.0/env/../lib/intel64/gcc4.8:/opt/intel/oneapi/mpi/2021.7.0//libfabric/lib:/opt/intel/oneapi/mpi/2021.7.0//lib/release:/opt/intel/oneapi/mpi/2021.7.0//lib:/opt/intel/oneapi/mkl/2022.2.0/lib/intel64:/opt/intel/oneapi/ipp/2021.6.1/lib/intel64:/opt/intel/oneapi/ippcp/2021.6.1/lib/intel64:/opt/intel/oneapi/ipp/2021.6.1/lib/intel64:/opt/intel/oneapi/dnnl/2022.2.0/cpu_dpcpp_gpu_dpcpp/lib:/opt/intel/oneapi/debugger/2021.7.0/gdb/intel64/lib:/opt/intel/oneapi/debugger/2021.7.0/libipt/intel64/lib:/opt/intel/oneapi/debugger/2021.7.0/dep/lib:/opt/intel/oneapi/dal/2021.7.0/lib/intel64:/opt/intel/oneapi/compiler/2022.2.0/linux/lib:/opt/intel/oneapi/compiler/2022.2.0/linux/lib/x64:/opt/intel/oneapi/compiler/2022.2.0/linux/lib/oclfpga/host/linux64/lib:/opt/intel/oneapi/compiler/2022.2.0/linux/compiler/lib/intel64_lin:/opt/intel/oneapi/ccl/2021.7.0/lib/cpu_gpu_dpcpp
  JULIA_NUM_THREADS = 6

Details on CUDA:

CUDA runtime 11.8, artifact installation
CUDA driver 12.0
NVIDIA driver 526.98.0

Libraries: 
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 11.0.0+525.60.2

Toolchain:
- Julia: 1.8.2
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce GTX 1650 Ti (sm_75, 3.850 GiB / 4.000 GiB available)

maleadt commented 1 year ago

`copyto!` is intentionally not implemented for (most) function wrappers. It's very hard to do so without ambiguities (because `copyto!` is also used to copy from/to CPU memory), and generally it's only intended for low-level copies that can be executed using memcpys. For higher-level copy functionality, just use broadcast.
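
A minimal sketch of the broadcast approach suggested above, reusing the failing example from the report. It assumes a working CUDA-capable GPU. The names `P`, `A`, and `B` are illustrative, and `CUDA.zeros` stands in for the original `0 .* similar(A, 80, 80)`. The in-place broadcast assignment `.=` should be compiled to a GPU kernel instead of a memcpy, which is why it can handle the lazily permuted view:

```julia
using CUDA
CUDA.allowscalar(false)

# Same setup as the failing example: a lazy permuted wrapper around a CuArray.
P = CUDA.rand(ComplexF64, 100, 100)
A = PermutedDimsArray(P, (2, 1))

# Plain CuArray destination (assumption: zero-initialized via CUDA.zeros).
B = CUDA.zeros(ComplexF64, 80, 80)

# In-place broadcast assignment instead of copyto!.
B .= view(A, 1:80, 1:80)
```

If the data is needed on the CPU afterwards, `Array(B)` copies the dense result back with an ordinary memcpy, since `B` itself is a plain `CuArray`.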