rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0
8.24k stars 884 forks source link

[FEA] Make cudftestutil a header-only package #16658

Open vyasr opened 2 weeks ago

vyasr commented 2 weeks ago

Is your feature request related to a problem? Please describe. Currently cudf publishes a cudftestutil library containing utility functions used in libcudf testing. This library was primarily intended for libcudf's own usage, but over time its usage has proliferated to many of libcudf's consumers including cuspatial and the Spark JNI plugin. Unfortunately, there are numerous issues with packaging up these utilities as a library due to the desire for consumers to be able to install this library in the same environment as different versions of gtest, or to avoid needing gtest altogether. We attempted to resolve the second issue across all of RAPIDS by statically linking gtest (see https://github.com/rapidsai/build-planning/issues/32), but there were additional complexities for cudf due to the existence of cudftestutil. Upon further experimentation, we are discovering that having a working test utility and allowing users the flexibility to install different versions of gtest are incompatible requirements due to the symbol visibility issues described here. Therefore, a new solution is needed.

Describe the solution you'd like I propose that we drop the cudftestutil compiled library altogether and convert it into a header-only library that can be used by all consumers. This would allow all necessary symbols to be compiled privately and uniquely into each test executable such that there is no concern with conflicting symbols while also allowing users to bring their own preferred version of gtest. As long as the utilities are API-compatible with the desired versions of gtest, ABI-compatibility concerns would no longer be in play, nor would symbol visibility since there would not be any DSO (cudftestutil or gtest) to speak of. Note that this assumes that test executables produced by RAPIDS libraries like also continue to statically link gtest so that the resulting test packages could be safely installed into user environments with a different version of gtest, but we have no reason to change that behavior.

Describe alternatives you've considered We've tried a number of alternatives, such as changing cudftestutil from a static library to a shared one so that we could compile in the necessary components of a static gtest, but none of these worked. A header-only library seems like the best remaining option aside from removing the utils altogether.

robertmaynard commented 2 weeks ago

for cudf itself we made cudftestutil a library to help compile speeds. We can always create a cudftestutil_internal target that is an object library that each test uses to not impact compile times.