Closed CreRecombinase closed 3 years ago
Thanks for the suggestion and your interest in shrinkwrap, but I'm quite happy with cget and find it to be more versatile than FetchContent.
One of the benefits of cget is that it doesn't actually force external users to use it. If you prefer using FetchContent and want to build an executable that uses shrinkwrap, then you can fetch shrinkwrap and its dependencies with FetchContent in your executable's CMakeLists.txt file. Alternatively, you could use apt, yum, conda, or any other package manager to install the dependencies. As long as the lib and include paths are setup correctly, shrinkwrap doesn't care about where or how its dependencies are installed.
Thanks for the prompt response. I can certainly appreciate the flexibility offered by cget. I'll confess that my ultimate goal with this issue was actually to eventually get savvy to vendor shrinkwrap using FetchContent, with shrinkwrap pulling in zstd etc. This would mean a small increase in compile time in exchange for:
cmake -S . -B build && cmake --build build
, just like every cmake-based package. I think it's hard to overstate the importance making installation easy when trying to drive adoption (I would love for savvy to displace bcftools). You can save folks a lot of frustration by removing (seemingly) small hurdles (especially when it comes to bioinformatics software). I would argue that even if cget doesn't force external users to use it, it does force them to do use something to resolve missing dependencies. It's not always straightforward to work out what's gone wrong/what you need to install when cmake can't find a library/package.
I hope I don't come across as stubborn on this. I did consider today the possibility of supporting FetchContent as an optional alternative to cget, but I'm reluctant to maintain support for two dependency management systems simultaneously and FetchContent simply doesn't work the way I prefer. With that said, I don't want my preference on the matter to hinder your preference to use FetchContent with savvy. I put together a test case and provided a suggestion at the bottom of this (very long) response that I believe should satisfy your needs. So feel free to jump to the end if you don't feel like reading my rant :).
I believe the biggest drawback/hurdle to using cget is that it requires python and pip to be installed. Though this is a legitimate drawback, the FetchContent approach has a similar (and more prevalent) hurdle of requiring CMake v3.14.5 to be installed. Users would likely need to compile CMake from source to get a version >= 3.14.5 unless their OS version is very recent (e.g., Ubuntu 20.04 https://packages.ubuntu.com/search?keywords=cmake). Building CMake is fairly straightforward but takes about 15 minutes to build all of its components and requires users to find/download the source code. It only takes a few seconds to install cget with pip3 install --user cget
.
I agree that it is especially important to simplify the install process for bioinformatics tools, and I've tried to make it as simple as possible for novice users to install the SAV CLI. Many of these novice users have never used CMake before and are unfamiliar with common conventions. So whether they run cget or run CMake directly, there is still going to be a learning curve for them. SAV CLI install options for novice users include a one-liner to install from source (cget install statgen/savvy
) and multiple options for installing statically-linked, prebuilt binaries (sudo apt install savvy-2.0.1-Linux-cli.deb
, sudo yum install savvy-2.0.1-Linux-cli.rpm
, or bash savvy-2.0.1-Linux-cli.sh --prefix=<install_prefix>
).
For power users like you, cget does offer control over how dependencies are built. I don't know the exact options for LTO, but it would look something like cget init -DINTERPROCEDURAL_OPTIMIZATION=TRUE && cget install statgen/savvy
or cget init -DCMAKE_CXX_FLAGS="-flto" -DCMAKE_EXE_LINKER_FLAGS="-flto" && cget install statgen/savvy
. You could also add -D
options for each dependency in a requirements.txt file.
Users who want to develop tools using savvy can simply configure their projects with -DCMAKE_TOOLCHAIN_FILE=../cget/cget/cget.cmake
and use find_package(savvy REQUIRED)
and target_link_libraries(<your_exe> savvy)
to link savvy and all of its dependencies. Side note: the CLion IDE can also be set up to configure CMAKE_TOOLCHAIN_FILE
.
My main issue with FetchContent is that it uses the superbuild pattern via add_subdirectory()
, which results in the global namespace being contaminated with the CMake settings of external dependencies. Running make install
when using FetchContent results in installing not only your current project's targets but also the targets of all of your dependencies. I don't want zlib (for example) to be installed when a user runs make install
on an executable project that uses savvy. Instead I would want only the statically-linked executable to be installed. A similar annoyance is that make clean
will clean all of the dependencies. When developing a tool that uses savvy, it would be aggravating to wait for all of your third-party libraries to rebuild every time you clean your project.
I do recognize the allure of a CMake-only solution and wish CMake provided an alternative to FetchContent/ExternalProject that utilized find_package. I once considered reimplementing cget as a CMake module (similar to CPM), but I don't have much time for side projects.
My recommendation is to put the following snippet in a file called FetchSavvyDependency.cmake and put include(FetchSavvyDependency.cmake)
in CMakeLists.txt.
if (CMAKE_VERSION VERSION_GREATER 3.14.4 AND <PROJ_NAME>_FETCH_DEPENDENCIES)
# See https://gitlab.kitware.com/cmake/cmake/-/issues/17735 for discussion on why ignore_package is necessary
function(ignore_package NAME)
file(WRITE ${CMAKE_BINARY_DIR}/${NAME}/${NAME}Config.cmake "")
set(${NAME}_DIR ${CMAKE_BINARY_DIR}/${NAME} CACHE PATH "")
endfunction()
ignore_package(shrinkwrap)
set(shrinkwrap_DIR "${CMAKE_BINARY_DIR}/shrinkwrap")
include(FetchContent)
set(CMAKE_POLICY_DEFAULT_CMP0048 NEW)
set(CMAKE_POLICY_DEFAULT_CMP0076 OLD)
set(ZSTD_BUILD_PROGRAMS OFF)
set(ZSTD_BUILD_SHARED OFF)
FetchContent_Declare(
zstd_proj
GIT_REPOSITORY https://github.com/facebook/zstd.git
GIT_TAG v1.4.8
SOURCE_SUBDIR "build/cmake")
FetchContent_Declare(
zlib_proj
URL http://zlib.net/zlib-1.2.11.tar.gz)
FetchContent_Declare(
shrinkwrap_proj
GIT_REPOSITORY https://github.com/jonathonl/shrinkwrap
GIT_TAG 420eb397e31dedc17cae9eecdf33e8fb10e05497)
FetchContent_Declare(
savvy_proj
GIT_REPOSITORY https://github.com/statgen/savvy
GIT_TAG 685748d3feafa33cbc9ee3c771b53ea30991e546)
FetchContent_MakeAvailable(zlib_proj zstd_proj shrinkwrap_proj savvy_proj)
set_target_properties(minigzip minigzip64 example example64
PROPERTIES EXCLUDE_FROM_ALL TRUE)
endif()
Thanks for the thorough response. You've clearly thought this through and you make very compelling arguments. I'll close the issue (and try out your proposed solution) but before that I wanted to mention a few things:
EXCLUDE_FROM_ALL
took care of the make install
issue.
Hi, this is a really great library. I wonder how amenable you'd be to switching from cget to FetchContent as a way of obtaining the various compression libraries. It would require bumping the minimum CMake version a little bit, but I think it would make this library a lot easier to use.
I'd be happy to make a pull request if this is appealing