NVIDIA / thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
Other
4.91k stars 758 forks source link

Support adding Thrust to CMake projects with `add_subdirectory` #976

Closed Char-Aznable closed 4 years ago

Char-Aznable commented 5 years ago

I have been using the github thrust for a while before the recent merge with cuda one. I manage dependency using cmake fetchcontent. After merging with the recent update in this repo, which includes a CMakeLists.txt, I found that it breaks command like add_subdirectory() because of the following error:

 MSVC_COMPILER_FLAGS:                                                                                                                                           
 | WARN_ALL : '/Wall'                                                                                                                                          
 | WARNINGS_AS_ERRORS : '/Wx'                                                                                                                                  
 | RELEASE : '/Ox'                                                                                                                                             
 | DEBUG : '/Zi;-D_DEBUG;/MTd'                                                                                                                                 
 | EXCEPTION_HANDLING : '/EHsc'                                                                                                                                
 | CPP : ''                                                                                                                                                    
 | OMP : '/openmp'                                                                                                                                             
 | TBB : ''                                                                                                                                                    
 | CUDA : ''                                                                                                                                                   
 | CUDA_BULK : ''                                                                                                                                              
 | WORKAROUNDS : '/DNOMINMAX;/wd4503'                                                                                                                          
 | C++03 : ''
 | C++11 : '-std=c++11'
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Found CUDA: /net/software/modules-sw/cuda/10.1/Linux/RHEL6/x86_64 (found version "10.1")                                                           
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found 49 examples
-- Found 5 examples/cuda
-- Found 4 examples/cpp_integration
-- Found 152 tests in testing
-- Found 59 tests in backend
CMake Error at build_cuda/_deps/thrust-src/testing/CMakeLists.txt:48 (add_custom_target):                                                                     
  add_custom_target cannot create target "check" because another target with
  the same name already exists.  The existing target is a custom target
  created in source directory
  "/home/aznb/mycodes/SCgenome_scbmc/build_cuda/_deps/kokkos-src".                                                                         
  See documentation for policy CMP0002 for more details.

Aside from having name conflicts via add_custom_target, I was expecting using thrust as a header-only library and I don't expect cmake to config building any of the test targets unless I want it so.

Char-Aznable commented 5 years ago

cross referencing https://github.com/thrust/thrust/issues/981

griwes commented 5 years ago

The so far released versions contain a build system that is currently unsupported; the next Toolkit release should contain the new build system that is already available in master (and the maintenance branches here on Github).

Why are you doing add_subdirectory on Thrust? As you said, Thrust is a header only library, so you should be able to just point your compiler at the headers and be done with it. So instead of something like

add_subdirectory(${THRUST_DIRECTORY})

you should be doing

include_directories(SYSTEM ${THRUST_DIRECTORY})
Char-Aznable commented 5 years ago

@griwes Because add_subdirectory is the more systematic way of handling dependency (and it doesn't matter whether the dependee is header-only or not) and it's what cmake does if the user is adding thrust to the build tree through cmake's fetchcontent module. It would be too painful if the user has a list of libraries in cmake where they have to manually set the thrust path and do what you suggest.

Also, what you suggest breaks the transitivity of dependency, e.g., thrust-->A-->B-->... where the user has to manually do include_directories for A and B and anything downstream, which is unmanageable for large project.

griwes commented 5 years ago

I don't follow.

  1. How is it more systematic?
  2. Scanning through the FetchContent module documentation, I could find no mentions of it doing add_subdirectory itself; as far as I can tell from reading it, all the examples and the prose talk about the user first calling one of its functions, and then doing add_subdirectory manually. If I missed something there, can you point me to where it says it does what you said it does?

The point about transitivity does seem fair to me, but add_subdirectory still reads as the wrong primitive to use here for me. Something like find_package seems much more reasonable.

henryiii commented 5 years ago

This is a very common method, and works well in several cases, such as if you use git submodules to manage dependencies. Supporting it is not too hard, though it does take some care in setup. The problem with something like find_package is that it requires "installs", and it tends to touch things in a user's home directory.

Since thrust is header-only, you can currently add this without using add_subdirectory (that's what I currently do) in a few lines, but it's much nicer and more consistent to do this will all dependencies. Many libraries now support that, like PyBind11.

I would second making tests only build if this is the main project.

griwes commented 5 years ago

@henryiii just to make sure I understand PyBind11's CMakeLists - the include directories attached to the interface library will be propagated to any library that "links" it? If that is so, I think we could do that. (The rest of this comment hinges on whether my understanding there is correct, regarding include paths, other compiler flags, and linker flags being attachable to that target.)

(NB: this doesn't change my opinion that this feels like the wrong primitive for this purpose. The CMake community may disagree here, but if that is indeed the mechanism at work here, I'm rather concerned about the direction the CMake community is taking with the project and the recommended practices for it.)

Nevertheless, I think that in such a situation I'd make tests and examples not optional, but not present at all, because I'd prefer to not have to think about name clashes with targets et al. Essentially, I'd insert a conditional add_library(Thrust INTERFACE) and the setup of that, followed by a return(), somewhere around line 65 (again, looking at the CMakeLists.txt on the master branch!); effectively making sure that:

  1. The interface library has the include directories attached.
  2. The interface library has the defines to select the backend attached.
  3. The Thrust version string is set, now probably with PARENT_SCOPE.
  4. (This would need to go up:) TBB or OMP, if their backend is selected, are find_package'd, and the relevant flags are added to the interface library.
  5. No other options or compiler flags are set, CUDA support is not enabled, the selection of CUDA compute capabilities to compile for is not run, the test and example targets and all that jazz is not included at all.

Does this sound sensible to everyone?

henryiii commented 5 years ago

@griwes, yes, that is correct.

The general idea is that if you are careful, you can provide the exact same interface whether you install, just build, or use add_subdirectory. The first requires creating a Config file, the second works because CMake registers built packages in a .cmake directory in your home directory, and the third allows a package to include the source as a submodule or using one of the download methods (or just bundling everything together). The third option has become very popular with header only and quickly compiling libraries; since it takes the burden of installing and managing packages off of the end-user.

To do this, you should provide "Thrust::Thrust" or something similar as the external target. The namespace can be added by the target export; and for the add_subdirectory method to match, you can set up an alias - a user should not do anything different for find_package or add_subdirectory. For multiple backends, you could have explicit targets, like "Thrust::OpenMP". The tests, examples, etc. should be protected so they are not created if this is a subdirectory.

An example of a library that does this already is VexCL: https://github.com/ddemidov/vexcl/blob/a0fe836ee70280b697bbabcaee7edd2fba8faac0/CMakeLists.txt#L189 .

Note that this is using the old find_package method for the CUDA target. To see an example of what this looks like in practice in one of my projects:

https://github.com/GooFit/GooFit/blob/e52fb3956c208fa4a6eaccd5802f4df7fb4db322/CMakeLists.txt#L488-L506

Note that fmtlib has added the more canonical fmt::fmt target now, but I apparently never fixed my code, but you get the general idea.

So, in answer to your suggestion:

  1. Yes
  2. I think you'd have a "default" or "best", a "no-defines" version, and then targets like Thrust::TBB for each discovered backend. Same for Config version.
  3. Haven't thought much about version strings, since the user usually knows what version if he's adding it as a directory, but probably. Maybe with global scope.
  4. See 2. Yes.
  5. Why not CUDA? CUDA now has first-class language support in CMake. I woudn't do anything too fancy, like setting arch flags and things, that should be done in the master project, but a CUDA target should be available if CUDA is available.
griwes commented 5 years ago

Not CUDA, because we do a lot of Thrust specific CUDA configuration, and I don't want the "proper" build system to be entangled with interface targets that we'd export. For instance, I imagine that not everyone wants to force the CUDA host compiler to be the exact same compiler as the C++ compiler ;) I guess I could put enable_language(CUDA) in the section that gets executed when importing the target, but I'd be hesitant to do anything more there, and even just that makes it slightly harder to keep the different portions of the CMakeLists separate (since we explicitly set CMAKE_CUDA_HOST_COMPILER before enabling CUDA, as is necessary; see https://github.com/thrust/thrust/blob/master/CMakeLists.txt#L72-L80).

I would not export separate libraries for different targets. This is for at least three reasons:

  1. People often have a misconception that they can link translation units that link Thrust that uses different backends; this is not the case, and any such program, if it works, works purely by accident. Having separate libraries would, in my opinion, encourage such misuse of the library (or at least make it easier, potentially by accident?). Having to specify the host and device backends as a CMake variable before including Thrust in the build system would then force this selection of backends on the user under the single Thrust library target, which should reduce this particular problem in users who would otherwise encounter it.
  2. TBB and OMP would do find_package to find, respectively, TBB and OMP; I don't want to run through that if the user doesn't request those backends. It's harder to provide a good failure mode there. I imagine I could just not provide Thrust::TBB or Thrust::OMP if the find_package fails, but... that seems bad, because the error message the user gets will be suboptimal. I'd prefer for configuration to fail when it fails to find the package when a given backend is requested.
  3. In theory (this is not actively tested at the moment, but we are planing to change that) it should be possible to select any of {CPP,TBB,OMP} as the host system, and any of {CPP,TBB,OMP,CUDA} as the device system. This gets us a total of 12 different targets that would need to be provided (admittedly some rather pointless), and while this is indeed something that could just be a couple of nested loops, this doesn't seem like a good interface to me (especially given (1)).

I realize that this runs afoul of potential use-cases where people could have different parts of their project, built by a single build system, correctly use Thrust with different backends (i.e. never linking them together, or making sure that all Thrust symbols are hidden, etc.) - but since this sounds like a more expert thing to do to me, I think the added difficulty with shooting yourself in the foot by silently violating ODR is not entirely unwelcome.

I'm not sure why I'd want to specify "no defines". It'd essentially be equivalent to "with defines, for host = CPP and device = CUDA", which would be the default anyway.

I'm not sure I fully understand the Config file part yet; I'll try to get something along the lines of what I described above going within the next few of weeks (time permitting); hopefully then we can figure out all the details? @henryiii, since you seem to have a good grasp on this, my thinking is I'd create a PR here and ask you to review; does that sound good?

Char-Aznable commented 5 years ago

@henryiii provides a very thorough explanation here. @griwes I only want to point out what you asked before about fetchcontent and add_subdirectory -- it's described here: https://github.com/Kitware/CMake/blob/e2d0aea2c734c8c5028f3573082e75bd157dbe72/Modules/FetchContent.cmake#L43-L58

henryiii commented 5 years ago

@griwes, yes, I could do that.

henryiii commented 5 years ago

@griwes, here is a possible use that would be great to support (completely making up API here)

set(THRUST_BACKEND ALL) # Enables search for all backends

include_subdirectory(thrust)
# Could also be find_package(thrust CONFIG) just as easily

# The following could even be wrapped into a function thrust_add_executables
add_executable(my_prog_omp myprog.cu)
set_source_files_properties(myprog.cu PROPERTIES LANGUAGE CXX)
target_link_libraries(my_prog_cuda PRIVATE Thrust::OMP)

if(TARGET Thrust::CUDA)
  add_executable(my_prog_cuda myprog.cu)
  target_link_libraries(my_prog_cuda PRIVATE Thrust::CUDA)
endif()

Now multiple backends are built, allowing easy switching/testing/comparison.

This is similar to how VecCL works, and was implemented in a proof of concept form for Hydra (which contains Thrust): https://github.com/CLIUtils/cmake/blob/master/AddHydra.cmake.

henryiii commented 5 years ago

I assume you could have Thrust::device_OMP, Thrust::host_OMP, ect. I would make "Thrust::Thrust" be just the headers and no defines (it's okay to mix targets that have the same defines, and all the other targets would link to this one anyway) (this is close to how Boost::Boost works, for example), and Thrust::default would be the backend selected if a single backend was set with set(THRUST_BACKEND OMP).

Char-Aznable commented 4 years ago

Is there any update on this?

alliepiper commented 4 years ago

I'll be addressing the issue in the title as part of the fix for #1159.

From the discussion here, it seems like you folks may be interested in the new find_package interface for Thrust. I can add some hooks to just do an appropriate find_package(Thrust) call when included with add_subdirectory, and then you will be able to use the thrust_create_target API to configure your Thrust interfaces.

I think that should fulfill the usecases described in the above discussion. Take a look at the new find_package info I linked above and let me know if there's anything missing that you'd need for your projects.

Char-Aznable commented 4 years ago

@allisonvacanti find_package doesn't solve the problem describe here because it's finding package installed in a specified path in the system. The use case here is concerned with downloading thrust via FetchContent and include it into the build tree by FetchContent_MakeAvailable (or its underlying command add_subdirectory).

I don't think I understand how #1159 is related to this because it seems to be about how to config the build for testing. A simple boolean switch is enough to exclude the tests for a build tree.

alliepiper commented 4 years ago

find_package can be used to pull in configs from a specific directory when correctly configured. We use this feature internally in Thrust's own build system to configure targets for our tests, etc:

https://github.com/thrust/thrust/blob/master/CMakeLists.txt#L88-L94

# Use our find_package config to assemble the Thrust library components we need:
find_package(Thrust REQUIRED CONFIG
  NO_DEFAULT_PATH # Only check the explicit HINTS below:
  HINTS
    "${Thrust_SOURCE_DIR}"
)
thrust_create_target(Thrust [options...])

What I propose is: when a checkout of thrust is included via add_subdirectory, our root level CMakeLists.txt will detect this, do find_package on itself, and return() -- thus exposing our configuration targets and API to the parent project.

As a concrete example of how this would work I'll modify @henryiii's example:

set(THRUST_COMPONENTS CPP OMP)
set(THRUST_OPTIONAL_COMPONENTS CUDA)

add_subdirectory(thrust)

add_executable(my_prog_omp myprog.cu)
set_source_files_properties(myprog.cu PROPERTIES LANGUAGE CXX)
thrust_create_target(ThrustOMP HOST CPP DEVICE OMP)
target_link_libraries(my_prog_cuda PRIVATE ThrustOMP)

thrust_update_system_found_flags() # required due to CMake scoping rules
if(THRUST_CUDA_FOUND)
  add_executable(my_prog_cuda myprog.cu)
  thrust_create_target(ThrustCUDA HOST CPP DEVICE CUDA)
  target_link_libraries(my_prog_cuda PRIVATE ThrustCUDA)
endif()

1159 is not directly related, but it's going to involve a significant refactoring of our build system that includes adding switches for tests, examples, and header-tests. This was just a heads up that we'll be adding the option through other work soon.

Char-Aznable commented 4 years ago

Thank you @allisonvacanti for explaining! This makes more sense now. When should we expect this to be implemented? I am looking forward to trying it out.

alliepiper commented 4 years ago

Should be ready in the next couple of weeks. I'm working on the multiconfig stuff now, once that's in good shape I'll just need to add a few bits to get add_subdirectory working.

alliepiper commented 4 years ago

1184 makes all build targets toggle-able via CMake variables:

I've updated the title of this issue to reflect the work that remains.

alliepiper commented 4 years ago

1248 adds the add_subdirectory(thrust) functionality described here with some slight changes to variable names.

See the new cmake/add_subdir example for an example of this works, and cmake/README.md for a description of how the thrust_create_target function is used.

Feel free to try it out and let me know if there are any issues.

alliepiper commented 4 years ago

Should be ready in the next couple of weeks.

Apparently a "couple of weeks" is roughly 3 months in quarantime :-)

Char-Aznable commented 4 years ago

@allisonvacanti Thanks for the great work! Both add_subdirectory and thrust_create_target work for me. What's the plan of merging this into the release? I am currently using the cuda-11.0 branch

alliepiper commented 4 years ago

Great, glad it worked for you! It will be merged into the main branch soon and should be included in Thrust 1.10.0.

alliepiper commented 4 years ago

This has been addressed by #1248.