oneapi-src / SYCLomatic

fatal error: 'fmt/core.h' file not found #2047

Open ssheorey opened 2 months ago

ssheorey commented 2 months ago

Describe the bug

c2s reports errors when I try to convert CUDA code from Open3D to SYCL. This happens for any CUDA file. Most of the errors are about unknown compiler options (including warning-control options that both clang and g++ recognize). There is also a fatal error about a header file whose directory is passed to the compiler with -isystem. [See details below]

Please let me know if this needs specific configuration to work.

To reproduce

Clone Open3D and create a compilation database for c2s:

git clone --depth 1 https://github.com/isl-org/Open3D.git
cd Open3D
mkdir build && cd build
export CUDACXX=<path/to/nvcc>
cmake -DBUILD_CUDA_MODULE=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..

This creates the compile_commands.json compilation database. Next, try to use it to convert any CUDA file to SYCL:

cd ..   # in Open3D folder
c2s --compilation-database=build --analysis-mode  --comments --in-root $PWD  cpp/open3d/core/kernel/BinaryEWCUDA.cu

Errors:

warning: unknown warning option '-Wall,-Wextra,-Werror,-Wno-unused-parameter' [-Wunknown-warning-option]
error: unknown argument: '-forward-unknown-to-host-compiler'
error: unknown argument: '--generate-code=arch=compute_52,code=[compute_52,sm_52]'
error: unknown argument: '-Xcompiler=-fPIC'
error: unknown argument: '--Werror'
error: unknown argument: '--Werror'
error: unknown argument: '--Werror'
error: unknown argument: '--expt-relaxed-constexpr'
error: unknown argument: '--diag-suppress'
error: unknown argument: '--Werror'
error: unknown argument: '--expt-extended-lambda'
error: no such file or directory: 'cross-execution-space-call,deprecated-declarations'
error: no such file or directory: 'all-warnings'
error: no such file or directory: 'ext-lambda-captures-this'
error: no such file or directory: '2809'
error: no such file or directory: 'reorder'
Parsing: /home/ssheorey/Documents/Open3D/cpp/open3d/core/kernel/BinaryEWCUDA.cu
In file included from /home/ssheorey/Documents/Open3D/cpp/open3d/core/kernel/BinaryEWCUDA.cu:8:
In file included from /home/ssheorey/Documents/Open3D/cpp/open3d/core/CUDAUtils.h:17:
/home/ssheorey/Documents/Open3D/cpp/open3d/utility/Logging.h:21:10: fatal error: 'fmt/core.h' file not found
   21 | #include <fmt/core.h>
      |          ^~~~~~~~~~~~
Analyzing: /home/ssheorey/Documents/Open3D/cpp/open3d/core/kernel/BinaryEWCUDA.cu
Migrating: /home/ssheorey/Documents/Open3D/cpp/open3d/core/kernel/BinaryEWCUDA.cu
Total Project:
  +  0 lines of code (  0%) will be automatically migrated.
    -  0 APIs/Types - No manual effort.
    -  0 APIs/Types - Low manual effort for checking and code fixing.
    -  0 APIs/Types - Medium manual effort for code fixing.
  +  0 lines of code (  0%) will not be automatically migrated.
    -  0 APIs/Types - High manual effort for code fixing.
See https://www.intel.com/content/www/us/en/docs/dpcpp-compatibility-tool/developer-guide-reference/current/overview.html for more details.
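
For the fatal 'fmt/core.h' error, one thing that may be worth checking is whether every include directory named in the compilation database actually exists on disk. The sketch below is only a diagnostic aid under that assumption; the thread does not confirm that a missing directory is the cause. The function name missing_include_dirs and the build/compile_commands.json path are illustrative choices, not part of c2s.

```python
import json
import os
import shlex


def missing_include_dirs(entries):
    """Return include directories named in the entries that do not exist on disk."""
    missing = set()
    for entry in entries:
        # An entry stores either an "arguments" list or a single "command" string.
        args = entry.get("arguments") or shlex.split(entry.get("command", ""))
        base = entry.get("directory", ".")
        for i, arg in enumerate(args):
            path = None
            for flag in ("-isystem", "-I"):
                if arg == flag:
                    # Separate form: "-isystem <dir>" or "-I <dir>"
                    if i + 1 < len(args):
                        path = args[i + 1]
                    break
                if arg.startswith(flag):
                    # Attached form: "-I<dir>" or "-isystem=<dir>"
                    path = arg[len(flag):].lstrip("=")
                    break
            if path:
                # Relative paths are resolved against the entry's working directory.
                full = os.path.normpath(os.path.join(base, path))
                if not os.path.isdir(full):
                    missing.add(full)
    return sorted(missing)


if __name__ == "__main__":
    db_path = "build/compile_commands.json"  # assumed location, as in the steps above
    if os.path.exists(db_path):
        with open(db_path) as f:
            for d in missing_include_dirs(json.load(f)):
                print("missing include dir:", d)
```

If the script prints a directory that should contain fmt, the headers have probably not been placed there yet, for example because the project has not been built.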

Environment

OS: Ubuntu 22.04

c2s --version

Intel(R) DPC++ Compatibility Tool version 2024.1.0. Codebase:(378716e1845db56c204462764cd5fa166928ba78)

clang --version

Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

c++ --version

c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

sycl-ls --verbose


[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [23.52.28202.52]
[opencl:gpu:3] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [23.52.28202.52]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.28202]
[ext_oneapi_level_zero:gpu:1] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.28202]

Platforms: 4
Platform [#1]:
    Version  : OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
    Name     : Intel(R) FPGA Emulation Platform for OpenCL(TM)
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type       : acc
        Version    : OpenCL 1.2 
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2024.17.3.0.08_160000
        Aspects    : accelerator fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_atomic_host_allocations usm_atomic_shared_allocations ext_oneapi_srgb ext_oneapi_non_uniform_groups
        info::device::sub_group_sizes: 4 8 16 32 64
Platform [#2]:
    Version  : OpenCL 3.0 LINUX
    Name     : Intel(R) OpenCL
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#1]:
        Type       : cpu
        Version    : OpenCL 3.0 (Build 0)
        Name       : Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
        Vendor     : Intel(R) Corporation
        Driver     : 2024.17.3.0.08_160000
        Aspects    : cpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_system_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_oneapi_srgb ext_oneapi_native_assert ext_intel_legacy_image ext_oneapi_non_uniform_groups
        info::device::sub_group_sizes: 4 8 16 32 64
Platform [#3]:
    Version  : OpenCL 3.0 
    Name     : Intel(R) OpenCL Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 2
        Device [#2]:
        Type       : gpu
        Version    : OpenCL 3.0 NEO 
        Name       : Intel(R) Arc(TM) A770 Graphics
        Vendor     : Intel(R) Corporation
        Driver     : 23.52.28202.52
        Aspects    : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_non_uniform_groups
        info::device::sub_group_sizes: 8 16 32
        Device [#3]:
        Type       : gpu
        Version    : OpenCL 3.0 NEO 
        Name       : Intel(R) Arc(TM) A770 Graphics
        Vendor     : Intel(R) Corporation
        Driver     : 23.52.28202.52
        Aspects    : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_non_uniform_groups
        info::device::sub_group_sizes: 8 16 32
Platform [#4]:
    Version  : 1.3
    Name     : Intel(R) Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 2
        Device [#0]:
        Type       : gpu
        Version    : 1.3
        Name       : Intel(R) Arc(TM) A770 Graphics
        Vendor     : Intel(R) Corporation
        Driver     : 1.3.28202
        Aspects    : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_intel_esimd ext_oneapi_non_uniform_groups
        info::device::sub_group_sizes: 8 16 32
        Device [#1]:
        Type       : gpu
        Version    : 1.3
        Name       : Intel(R) Arc(TM) A770 Graphics
        Vendor     : Intel(R) Corporation
        Driver     : 1.3.28202
        Aspects    : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_intel_esimd ext_oneapi_non_uniform_groups
        info::device::sub_group_sizes: 8 16 32
default_selector()      : gpu, Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.28202]
accelerator_selector()  : acc, Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.17.3.0.08_160000]
cpu_selector()          : cpu, Intel(R) OpenCL, Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
gpu_selector()          : gpu, Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.28202]
custom_selector(gpu)    : gpu, Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.28202]
custom_selector(cpu)    : cpu, Intel(R) OpenCL, Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
custom_selector(acc)    : acc, Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.17.3.0.08_160000]

Additional context

_No response_

tomflinda commented 2 months ago

@ssheorey, you should generate the compilation database "compile_commands.json" with the intercept-build tool instead of using the one produced by the cmake configure step. intercept-build filters the nvcc-specific options out of "compile_commands.json", whereas the cmake-generated file keeps them. Here are the recommended migration steps:

git clone --depth 1 https://github.com/isl-org/Open3D.git
cd Open3D
mkdir build && cd build
export CUDACXX=<path/to/nvcc>
cmake -DBUILD_CUDA_MODULE=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
make  # the build must complete successfully first
rm compile_commands.json
intercept-build make -B  # to generate compile_commands.json
dpct --in-root=../ --out-root=sycl_out -p ./ --use-experimental-features=all
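
Before running dpct, it can help to confirm that the regenerated database no longer contains the nvcc options that produced the "unknown argument" errors. Below is a minimal sketch of such a check. The option list is taken only from the error log in this issue and is not a complete list of nvcc-only flags. The name nvcc_only_options is hypothetical, and this is not how intercept-build itself does its filtering.

```python
import json
import os
import shlex

# Options copied from the error log in this issue; NOT an exhaustive nvcc list.
NVCC_ONLY_PREFIXES = (
    "-forward-unknown-to-host-compiler",
    "--generate-code",
    "-Xcompiler",
    "--Werror",
    "--expt-relaxed-constexpr",
    "--expt-extended-lambda",
    "--diag-suppress",
)


def nvcc_only_options(entries):
    """Map each source file to the nvcc-specific options still present for it."""
    found = {}
    for entry in entries:
        # An entry stores either an "arguments" list or a single "command" string.
        args = entry.get("arguments") or shlex.split(entry.get("command", ""))
        hits = sorted({a for a in args if a.startswith(NVCC_ONLY_PREFIXES)})
        if hits:
            found[entry["file"]] = hits
    return found


if __name__ == "__main__":
    db_path = "compile_commands.json"  # assumed: run from the build directory
    if os.path.exists(db_path):
        with open(db_path) as f:
            for src, opts in nvcc_only_options(json.load(f)).items():
                print(src, "->", " ".join(opts))
```

An empty result for the regenerated file suggests that the listed options were filtered out. A non-empty result means those entries still carry nvcc flags that the tool may reject.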