isl-org / Open3D

Open3D: A Modern Library for 3D Data Processing
http://www.open3d.org

CUDA runtime error - open3D v0.14.1 #4679

Closed BBO-repo closed 2 years ago

BBO-repo commented 2 years ago

Describe the issue

I get a runtime error when running DenseSlam.cpp with "--device CUDA:0", but everything works fine with "--device CPU:0".

The error is the following:

[Open3D INFO] Using device: CUDA:0
terminate called after throwing an instance of 'std::runtime_error'
  what():  [Open3D Error] (void open3d::core::__OPEN3D_CUDA_CHECK(cudaError_t, const char*, int)) /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/CUDAUtils.cpp:301: /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp:43 CUDA runtime error: operation not supported

I do not understand what the issue is. To test my CUDA install I ran the deviceQuery CUDA sample, which outputs the following:

bin/x86_64/linux/release/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Quadro M1000M"
CUDA Driver Version / Runtime Version 11.6 / 11.6
....
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1
Result = PASS

Could you please provide any support to solve my issue?

I am also attaching the CMake file I use to build Open3D:

# Option 1: Use ExternalProject_Add, as shown in this CMake example.
# Option 2: Install Open3D first and use find_package, see
#           http://www.open3d.org/docs/release/cpp_project.html for details.
include(ExternalProject)
ExternalProject_Add(
    external_open3d
    PREFIX open3d
    GIT_REPOSITORY https://github.com/intel-isl/Open3D.git
    GIT_TAG v0.14.1
    GIT_SHALLOW ON
    UPDATE_COMMAND ""
    # Check out https://github.com/intel-isl/Open3D/blob/master/CMakeLists.txt
    # For the full list of available options.
    CMAKE_ARGS
        -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>
        -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
        -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}
        -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}
        -DGLIBCXX_USE_CXX11_ABI=${GLIBCXX_USE_CXX11_ABI}
        -DSTATIC_WINDOWS_RUNTIME=${STATIC_WINDOWS_RUNTIME}
        -DBUILD_SHARED_LIBS=ON
        -DBUILD_PYTHON_MODULE=OFF
        -DBUILD_EXAMPLES=OFF
        -DBUILD_WEBRTC=OFF
        -DBUILD_CUDA_MODULE=ON
)

Steps to reproduce the bug

1. Use an Ubuntu 18.04.6 Linux distribution on a machine with a CUDA-capable GPU
2. Install CUDA 11.6
3. Build Open3D as an external project with CMake's ExternalProject_Add (see the CMake file above)
4. Run the DenseSLAM example with the flag "--device CUDA:0"

Error message

[Open3D INFO] Using device: CUDA:0
terminate called after throwing an instance of 'std::runtime_error'
  what():  [Open3D Error] (void open3d::core::__OPEN3D_CUDA_CHECK(cudaError_t, const char*, int)) /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/CUDAUtils.cpp:301: /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp:43 CUDA runtime error: operation not supported

Expected behavior

DenseSLAM should run without crashing, as it does with the flag "--device CPU:0".

Open3D, Python and System information

- Operating system: Ubuntu 18.04.6
- Open3D version: 0.14.1
- System type: 64 bit machine
- Is this remote workstation?: no
- How did you install Open3D?: build from source
- Compiler version (if built from source): gcc 7.5


theNded commented 2 years ago

Quadro M1000M is an old card and I suspect cudaMallocAsync is not supported on these machines, see https://github.com/JuliaGPU/CUDA.jl/issues/637 (@yxlao we may want to add this check in addition to the CUDART version macro).

One potential fix is to replace all functions that have the Async suffix with their non-async versions.

BBO-repo commented 2 years ago

Hi @theNded, thank you for your fast answer. You were right, the Async calls were causing the issue. In the file open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp, I commented out lines 41 to 44 to disable cudaMallocAsync and lines 58 to 62 to disable cudaFreeAsync.

DenseSLAM now runs up to a point where I hit another error, this time an out of memory error:

[Open3D INFO] Processing 925/2407...
[Open3D INFO] Processing 926/2407...
terminate called after throwing an instance of 'std::runtime_error'
  what():  [Open3D Error] (void open3d::core::__OPEN3D_CUDA_CHECK(cudaError_t, const char*, int)) /home/ubuntu/Work/Projects/handheld-scanning-prototype/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/CUDAUtils.cpp:301: /home/ubuntu/Work/Projects/Open3DBox/build/open3d/src/external_open3d/cpp/open3d/core/MemoryManagerCUDA.cpp:45 CUDA runtime error: out of memory

It always fails at the 926th RGB-D image. Do you have any idea how to solve this?

theNded commented 2 years ago

Quadro M1000M only has 4 GB of GPU memory, so there is not much we can do with it. One potential change is to increase the voxel size by a factor of, say, 2, but that will sacrifice tracking and reconstruction quality.
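
To get a rough feel for why a larger voxel size helps, here is a back-of-the-envelope sketch in plain C++. It assumes a dense cubic volume, which is not how DenseSLAM actually allocates (as far as I understand, it allocates voxel blocks sparsely, near observed surfaces), so treat it as an illustration of the scaling only. The function names and the bytes_per_voxel parameter are hypothetical and not part of Open3D.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: number of voxels needed to fill a dense cube with
// side length extent_m (meters) at resolution voxel_size_m (meters).
std::uint64_t DenseVoxelCount(double extent_m, double voxel_size_m) {
    assert(extent_m > 0.0 && voxel_size_m > 0.0);
    const auto per_axis =
            static_cast<std::uint64_t>(std::ceil(extent_m / voxel_size_m));
    return per_axis * per_axis * per_axis;
}

// Hypothetical helper: memory estimate in bytes. bytes_per_voxel is an
// assumption supplied by the caller; it depends on the stored attributes.
std::uint64_t DenseVoxelBytes(double extent_m,
                              double voxel_size_m,
                              std::uint64_t bytes_per_voxel) {
    return DenseVoxelCount(extent_m, voxel_size_m) * bytes_per_voxel;
}
```

In this dense model, doubling the voxel size divides the voxel count by 8: DenseVoxelCount(4.0, 0.5) is 512, while DenseVoxelCount(4.0, 1.0) is 64. For a sparse, surface-only allocation the saving is probably closer to a factor of 4, since surface area scales with the square of the resolution.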

BBO-repo commented 2 years ago

Ok then it is all solved! Thank you.

ao2 commented 1 year ago

Hi,

@theNded I would like to re-open this issue as there are new findings about it.

To recap: the "CUDA runtime error: operation not supported" error is caused by the cudaMallocAsync() function not working on some GPUs, namely Quadro M1000M and Quadro M3000M (the one I have).

Digging into the CUDA documentation, we find that the Stream Ordered Memory Allocator is not available on all NVIDIA GPUs and that support should be verified at runtime, see https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__POOLS.html

In practical terms this would mean that the compile-time check on the CUDART version performed in https://github.com/isl-org/Open3D/blob/59792c226f358131eedb51658ae01de24dc50c5a/cpp/open3d/core/MemoryManagerCUDA.cpp#L22-L27 should be replaced with something like:

    if (device.cudaSupportsMemoryPools()) {
        OPEN3D_CUDA_CHECK(cudaMallocAsync(static_cast<void**>(&ptr), byte_size,
                                          cuda::GetStream()));
    } else {
        OPEN3D_CUDA_CHECK(cudaMalloc(static_cast<void**>(&ptr), byte_size));
    }

and the implementation of device.cudaSupportsMemoryPools() could be something like this:

    int driverVersion = 0;
    int deviceSupportsMemoryPools = 0;
    OPEN3D_CUDA_CHECK(cudaDriverGetVersion(&driverVersion));
    if (driverVersion >= 11020) { // avoid invalid value error in cudaDeviceGetAttribute
        OPEN3D_CUDA_CHECK(cudaDeviceGetAttribute(&deviceSupportsMemoryPools, cudaDevAttrMemoryPoolsSupported, device));
    }

    return !!deviceSupportsMemoryPools;
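
Since the allocator runs on every allocation, it may be worth querying the attribute only once per device. Below is a minimal sketch of such a cache in plain C++. MemoryPoolSupportCache is a hypothetical helper, not part of Open3D or CUDA, and the CUDA call is injected as a callable (shown only in a comment) so the block compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical helper (not part of Open3D): remembers, per device index,
// whether stream-ordered allocation (memory pools) is available.
class MemoryPoolSupportCache {
public:
    // The query returns nonzero when the device supports memory pools.
    // In a CUDA build it could wrap something like:
    //   int v = 0;
    //   cudaDeviceGetAttribute(&v, cudaDevAttrMemoryPoolsSupported, device);
    //   return v;
    using Query = std::function<int(int device)>;

    explicit MemoryPoolSupportCache(Query query) : query_(std::move(query)) {
        assert(static_cast<bool>(query_));  // a query callable is required
    }

    // Returns the cached answer, running the query only on first use.
    bool Supports(int device) {
        auto it = cache_.find(device);
        if (it == cache_.end()) {
            it = cache_.emplace(device, query_(device) != 0).first;
        }
        return it->second;
    }

private:
    Query query_;
    std::unordered_map<int, bool> cache_;
};
```

The allocation path would then branch on cache.Supports(device) instead of querying the driver each time. Note that this sketch is not thread-safe; if allocations can happen from several threads, access to the map would need a mutex or similar.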

I'll try to propose a patch for this, but if someone more familiar with the Open3D codebase wants to beat me to it, please go ahead.

Thank you, Antonio

ao2 commented 1 year ago

Pushed a tentative fix to https://github.com/isl-org/Open3D/pull/6440