Cannot build Open3D on Jetson AGX Orin with CUDA and GUI support

enddrone commented 9 months ago

Checklist

[X] I have searched for similar issues.
[X] For Python issues, I have tested with the latest development wheel.
[X] I have checked the release documentation and the latest documentation (for main branch).

Steps to reproduce the issue

I first cloned Open3D:

git clone https://github.com/isl-org/Open3D.git
cd Open3D

I then tried to build locally on a Jetson Orin with the NVIDIA Tegra platform (R36, REVISION: 2.0); more system and build specs are given below.

I followed the documentation for building with CUDA and GUI support: https://www.open3d.org/docs/release/arm.html#:~:text=steps%20as%20above.-,Building%20Open3D%20directly,-%23

I made sure the CMake version is 3.22.1. It is the same in a conda env, in a Python venv, and for the default user.

mkdir build
cd build
cmake -DBUILD_CUDA_MODULE=ON -DBUILD_GUI=ON ..
make -j$(nproc)

Error message

The first error is shown below. I got past it with a hacky fix: an #undef of the conflicting global macro.

/build/filament/src/ext_filament/libs/image/src/ImageSampler.cpp:41:17: error: expected unqualified-id
1556.7 constexpr float M_PIf = float(filament::math::F_PI);
1556.7                 ^
1556.7 /usr/include/math.h:1168:17: note: expanded from macro 'M_PIf'
1556.7 # define M_PIf 3.14159265358979323846f /* pi */
1556.7                ^
1556.7 1 error generated.

Now, when the build proceeds, I run into another issue where CUDA throws a shared memory dynamic initialization error:

/home/orion/Open3D/cpp/open3d/core/nns/kernel/L2Select.cuh(57): error #20054-D: dynamic initialization is not supported for a function-scope static __shared__ variable within a __device__/__global__ function
                Pair<T, int> blockMin[kRowsPerBlock * (kBlockSize / kWarpSize)];
                             ^
          detected during:
            instantiation of "void open3d::core::nns::l2SelectMin1<T,TIndex,kRowsPerBlock,kBlockSize>(T *, T *, T *, TIndex *, int, int) [with T=float, TIndex=int32_t, kRowsPerBlock=8, kBlockSize=256]" at line 223
            instantiation of "void open3d::core::nns::runL2SelectMin<T,TIndex>(cudaStream_t, open3d::core::Tensor &, open3d::core::Tensor &, open3d::core::Tensor &, open3d::core::Tensor &, int, int, int) [with T=float, TIndex=int32_t]" at line 191 of /home/orion/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu
            instantiation of "void open3d::core::nns::KnnSearchCUDAOptimized<T,TIndex,OUTPUT_ALLOCATOR>(const open3d::core::Tensor &, const open3d::core::Tensor &, int, OUTPUT_ALLOCATOR &, open3d::core::Tensor &) [with T=float, TIndex=int32_t, OUTPUT_ALLOCATOR=open3d::core::nns::NeighborSearchAllocator<float, int32_t>]" at line 260 of /home/orion/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu
            instantiation of "void open3d::core::nns::KnnSearchCUDA<T,TIndex>(const open3d::core::Tensor &, const open3d::core::Tensor &, const open3d::core::Tensor &, const open3d::core::Tensor &, int, open3d::core::Tensor &, open3d::core::Tensor &, open3d::core::Tensor &) [with T=float, TIndex=int32_t]" at line 318 of /home/orion/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu

I'm not familiar with how CUDA manages this memory. Can someone help?


Open3D, Python and System information

```markdown
- Operating system: Ubuntu 22.04.3 LTS 
- Python version: Python 3.10.12 
- Open3D version: 0.18.0
- System architecture: jetson
- Is this a remote workstation?: no
- How did you install Open3D?: build from source
- Compiler version (if built from source): gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 / Ubuntu clang version 14.0.0-1ubuntu1.1
```

Additional information

No response

jjdengjj commented 8 months ago

The first issue comes from ./build/filament/src/ext_filament/libs/image/src/ImageSampler.cpp: it declares a constant named M_PIf, but the system math library already defines M_PIf as a macro. A quick fix is to rename every M_PIf in ImageSampler.cpp to M_PI_f. Newer versions of filament have this fixed (I only checked the code), but Open3D still pulls the old version.
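To make the name collision concrete, here is a minimal C++ sketch of the two workarounds mentioned in this thread (rename, or #undef). It is not the filament source: the fallback #define and the circle_area helper are assumptions added only so the sketch compiles the same way on systems whose math.h does not define M_PIf.

```cpp
#include <cmath>

// Newer glibc headers define M_PIf as a macro (see the math.h note in the
// error log). Assumption of this sketch: define it ourselves if it is
// missing, so the collision is reproduced on every platform.
#ifndef M_PIf
#define M_PIf 3.14159265358979323846f
#endif

// Shape of the failing line: the preprocessor replaces M_PIf with a float
// literal, so the compiler sees "constexpr float 3.14...f = ...".
// constexpr float M_PIf = 3.14159265f;   // error: expected unqualified-id

// Workaround A (rename): use an identifier that no system macro claims.
constexpr float M_PI_f = 3.14159265358979323846f;

// Workaround B (#undef): remove the macro, then the original name is free.
#undef M_PIf
constexpr float M_PIf = 3.14159265358979323846f;

// Hypothetical helper, only here to show the renamed constant in use.
constexpr float circle_area(float r) { return M_PI_f * r * r; }

static_assert(M_PI_f == M_PIf, "both workarounds give the same constant");
```

The rename is the less invasive of the two, since the #undef also hides the system macro from any code later in the same translation unit.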

However, I also hit the same issue that @enddrone mentioned above, where CUDA throws a shared memory dynamic initialization error. PS: this was on a Jetson Orin Nano device.

/home/jjdengjj/Open3D/cpp/open3d/core/nns/kernel/L2Select.cuh(57): error #20054-D: dynamic initialization is not supported for a function-scope static __shared__ variable within a __device__/__global__ function
                Pair<T, int> blockMin[kRowsPerBlock * (kBlockSize / kWarpSize)];
                             ^
          detected during:
            instantiation of "void open3d::core::nns::l2SelectMin1<T,TIndex,kRowsPerBlock,kBlockSize>(T *, T *, T *, TIndex *, int, int) [with T=float, TIndex=int32_t, kRowsPerBlock=8, kBlockSize=256]" at line 223
            instantiation of "void open3d::core::nns::runL2SelectMin<T,TIndex>(cudaStream_t, open3d::core::Tensor &, open3d::core::Tensor &, open3d::core::Tensor &, open3d::core::Tensor &, int, int, int) [with T=float, TIndex=int32_t]" at line 191 of /home/jjdengjj/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu
            instantiation of "void open3d::core::nns::KnnSearchCUDAOptimized<T,TIndex,OUTPUT_ALLOCATOR>(const open3d::core::Tensor &, const open3d::core::Tensor &, int, OUTPUT_ALLOCATOR &, open3d::core::Tensor &) [with T=float, TIndex=int32_t, OUTPUT_ALLOCATOR=open3d::core::nns::NeighborSearchAllocator<float, int32_t>]" at line 260 of /home/jjdengjj/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu
            instantiation of "void open3d::core::nns::KnnSearchCUDA<T,TIndex>(const open3d::core::Tensor &, const open3d::core::Tensor &, const open3d::core::Tensor &, const open3d::core::Tensor &, int, open3d::core::Tensor &, open3d::core::Tensor &, open3d::core::Tensor &) [with T=float, TIndex=int32_t]" at line 318 of /home/jjdengjj/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

1 error detected in the compilation of "/home/jjdengjj/Open3D/cpp/open3d/core/nns/KnnSearchOps.cu".
make[2]: *** [cpp/open3d/core/CMakeFiles/core.dir/build.make:1280: cpp/open3d/core/CMakeFiles/core.dir/nns/KnnSearchOps.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2278: cpp/open3d/core/CMakeFiles/core.dir/all] Error 2
make: *** [Makefile:156: all] Error 2
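For background on what "dynamic initialization" means in this message: a __shared__ array cannot have constructors run for its elements, so the element type must not need one. The host-only C++ sketch below shows the language-level distinction with two hypothetical pair types. It assumes, without checking the Open3D source, that Pair<T, int> has a user-provided default constructor. It is not presented as the actual fix, and it does not explain why this particular toolchain rejects the code.

```cpp
#include <type_traits>

// Hypothetical stand-in: the default constructor is user-provided. Even with
// an empty body, the type is not trivially default constructible, so creating
// an array of these formally calls a constructor per element.
template <typename K, typename V>
struct PairUserCtor {
    PairUserCtor() {}
    PairUserCtor(K key, V value) : k(key), v(value) {}
    K k;
    V v;
};

// Same members, but the default constructor is defaulted on its first
// declaration. That makes it trivial: no code runs when an array is created.
template <typename K, typename V>
struct PairTrivialCtor {
    PairTrivialCtor() = default;
    PairTrivialCtor(K key, V value) : k(key), v(value) {}
    K k;
    V v;
};

static_assert(!std::is_trivially_default_constructible<PairUserCtor<float, int>>::value,
              "user-provided constructor is never trivial");
static_assert(std::is_trivially_default_constructible<PairTrivialCtor<float, int>>::value,
              "defaulted constructor is trivial for these members");
static_assert(sizeof(PairUserCtor<float, int>) == sizeof(PairTrivialCtor<float, int>),
              "the change does not alter the layout");
```

A type like PairTrivialCtor leaves its members uninitialized after default construction, which matches how shared memory is normally used: each thread writes its slot before anyone reads it.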

batzor commented 7 months ago

Same issue here with AGX Orin. In case anyone is wondering, the filament issue was fixed in this PR: https://github.com/google/filament/pull/5774/files. You can work around it by manually applying that change to the filament source in the build directory.

kshitijgoel007 commented 6 months ago

I followed this commit from the cudf repository https://github.com/rapidsai/cudf/pull/14108/commits/3676839700905c13c7a837e51b5ad7bfafc4b225

And made this change on my fork of Open3D: https://github.com/rislab/Open3D/commit/8f000802ebf527947ab53d8bb25d30a5cd3d1c52

Tested on Orin Nano, Orin NX, and Orin AGX. All good so far.

Interestingly, I did not run into the filament issue referenced above on any of the Orin AGX, NX, or Nano.

Ubuntu 20.04/CUDA Toolkit 12.2/GCC 9.4.0/CMake 3.92.2
