Open h-vetinari opened 9 months ago
CC @dvyukov as the author of https://github.com/abseil/abseil-cpp/commit/f3760b4d3b2773d1cb8e9ddda29dc9bce386d540.
Relevant context is also that we're using shared builds of abseil.
I ended up building a version of abseil (published in a separate channel) with the following patch:
diff --git a/absl/synchronization/mutex.cc b/absl/synchronization/mutex.cc
index cb3c7e74..76d561df 100644
--- a/absl/synchronization/mutex.cc
+++ b/absl/synchronization/mutex.cc
@@ -731,12 +731,10 @@ static unsigned TsanFlags(Mutex::MuHow how) {
}
#endif
-#if defined(__APPLE__) || defined(ABSL_BUILD_DLL)
// When building a dll symbol export lists may reference the destructor
// and want it to be an exported symbol rather than an inline function.
// Some apple builds also do dynamic library build but don't say it explicitly.
Mutex::~Mutex() { Dtor(); }
-#endif
#if !defined(NDEBUG) || defined(ABSL_HAVE_THREAD_SANITIZER)
void Mutex::Dtor() {
diff --git a/absl/synchronization/mutex.h b/absl/synchronization/mutex.h
index d53a22bb..2cd5a1e4 100644
--- a/absl/synchronization/mutex.h
+++ b/absl/synchronization/mutex.h
@@ -1064,11 +1064,6 @@ inline Mutex::Mutex() : mu_(0) {
inline constexpr Mutex::Mutex(absl::ConstInitType) : mu_(0) {}
-#if !defined(__APPLE__) && !defined(ABSL_BUILD_DLL)
-ABSL_ATTRIBUTE_ALWAYS_INLINE
-inline Mutex::~Mutex() { Dtor(); }
-#endif
-
#if defined(NDEBUG) && !defined(ABSL_HAVE_THREAD_SANITIZER)
// Use default (empty) destructor in release build for performance reasons.
// We need to mark both Dtor and ~Mutex as always inline for inconsistent
With this, the grpc example compiles again.
Could this simply be a case of the shared library case on linux not being covered correctly?
@derekmauro @dvyukov, could you comment about the (un-)inlining of the Mutex destructor, or alternatively, chime in on the linked grpc issue if their usage of the Mutex is somehow not supported?
Do you use bazel or cmake build?
I am not an expert on all open-source absl build modes. Is it possible to add a test for this?
I see this code does something similar: https://github.com/abseil/abseil-cpp/blob/14b8a4eac3e5a7b97ba4cc7b7dadf2a85aae8215/absl/base/internal/thread_identity.h#L249-L252 and it also checks ABSL_CONSUME_DLL. Should we check ABSL_CONSUME_DLL here as well? Though, not sure how this is related to consuming dll's.
I see that this: https://github.com/abseil/abseil-cpp/blob/14b8a4eac3e5a7b97ba4cc7b7dadf2a85aae8215/absl/copts/AbseilConfigureCopts.cmake#L6-L9 exports ABSL_BUILD_DLL only for Windows (MSVC). Is there a macro for BUILD_SHARED_LIBS mode? Perhaps we should check that macro?
Hey @dvyukov, thanks for the response. We're using CMake to build, which correctly sets ABSL_BUILD_DLL
on windows (and ABSL_CONSUME_DLL
gets correspondingly set when building against shared abseil).
My understanding is that this mechanism originally was intended (only?) for __declspec(dllexport)
(resp. dllimport
), which is necessary to make shared builds work on windows at all.
Now the problem only appears on Linux, where usually, ABSL_BUILD_DLL
should not be a thing, according to my understanding. But if that's the preferred solution, I'm happy to set that symbol also on Linux while building.
It would be functionally equivalent to the patch I posted above, which I already tested successfully.
It would be functionally equivalent to the patch I posted above, which I already tested successfully.
If I read your first patch correctly, it effectively undoes the optimization, so it's not good.
I am not expect on absl build modes and macros, especially used in open-source. So I will defer this to absl maintainers. @derekmauro do you know who can say what's the right fix for this?
I'll take a look when I can. Please be patient.
It would be functionally equivalent to the patch I posted above, which I already tested successfully.
If I read your first patch correctly, it effectively undoes the optimization, so it's not good.
My intention was not to suggest that the patch should be merged, what I meant was that for our specific scenario (builds against a shared abseil) the removal of the #if
had the same effect as defining ABSL_BUILD_DLL
.
I'll take a look when I can. Please be patient.
I hope that a first ping after a week is not considered excessive. 😅
It is blocking a lot of work in our distribution though, so I was mainly looking for overall direction, rather than a fix. For now, I don't even know if this is an issue in abseil or in grpc, but I'm assuming that lots of places implicitly use the Mutex, and since all our builds are shared, I expect this to be too much to fix everywhere on the side of abseil-consumers (though perhaps I'm wrong though and it's grpc-specific after all!).
The upshot is that I'm looking towards just uninlining the Dtor (in our distribution) for 20240116.x, which should at least unblock us for now, and we can then reconsider for the summer release. If you think I should hold off on that for ${reason}
, please let me know.
I haven't tried this, but I ran the build "in my head" (which tends to be unreliable). I think it is likely caused by mixing an installed absl library built with -DNDEBUG
with a gRPC library that includes absl headers that is not being built with -DNDEBUG
. If this is the case, it is something I've been begging people not to do.
This is one of those build modes I would like to say is not supported, as I regularly have to tell people when they only instrument half of their builds with sanitizers, but the fact of the matter is that with -DNDEBUG
, people get away with doing this 99.9% of the time.
One option which I really don't like is to patch this for open-source users, but keep @dvyukov's optimization for internal Google users only. This would be an annoying maintainability problem. I'll have to test it to see how bad it actually looks.
Thank you for the response!
This is an interesting case, because we do set -DNDEBUG
for our release builds, but the failure occurred while naïvely building examples/cpp/helloworld
from grpc, which doesn't set that symbol.
It's surprising to me that the logic here is https://github.com/abseil/abseil-cpp/blob/2f9e432cce407ce0ae50676696666f33a77d42ac/absl/base/macros.h#L91-L100 rather than, say
#ifdef(DEBUG)
<debug-case>
#else
<nodebug-case>
#fi
The latter would not open up that footgun of a vanilla build (no options) having a different ABI than one with -NDEBUG
.
If this is the case, it is something I've been begging people not to do.
I understand, though the realities of a distribution are that we have to use pre-built artefacts. If that means that we have to put NDEBUG
in the library interface, we can do that
This is an interesting case, because we do set
-DNDEBUG
for our release builds, but the failure occurred while naïvely buildingexamples/cpp/helloworld
from grpc, which doesn't set that symbol.
OK, I misdiagnosed that (probably trying to fit your analysis subconsciously) -- the compiler setup is the same in all cases, so we do set -DNDEBUG
even for that grpc example. I've double-checked this in https://github.com/conda-forge/grpc-cpp-feedstock/pull/357.
OK, I misdiagnosed that [...]
Yet another update: while our compiler setup is the same (CPPFLAGS
contains -DNDEBUG
, CXXFLAGS
does not), it seems that abseil and grpc differ in how they pick this up - abseil consequently does build with -DNDEBUG
, while grpc (or at least the example) doesn't. Adding -DNDEBUG
to the CXXFLAGS
for the grpc example does make it work again, validating your analysis.
I have also an issue with the inlined desctructor, but on windows when building static libs of abseil. When I build ortools v9.9 which uses abseil 202401 version of abseil, I get multiply defined symbols for Mutex::Dtor(). Ortools is quite big and it uses abseil, but also Protobuf as dependencies. Up to now I did not test it with the (rollback-)patch, but this seems likely a fix for the issue.
Does this build include inconsistent NDEBUG defines (as @derekmauro mentioned here https://github.com/abseil/abseil-cpp/issues/1624#issuecomment-1967655080)?
Not that I am aware of. The ortools cmake machinery is quite complex so I cannot easily verify that, maybe I find the time to dig into that more. Another possibility could be a wrongly set or logically incorrect usage of the flag BUILD_ABSL_DLL.
I'd like to add, for anyone coming across similar problems, that not defining the NDEBUG macro in your Release build will cause very cryptic errors at the linker stage; in my case, a dependency on grpc (and therefore protobuf and abseil) managed via vcpkg in an MSbuild project on Windows produced ODR Violation linker errors exactly in Mutex::Dtor. It wasn't until I randomly stumbled across this exact thread after hours of increasingly desperate internet searches that I noticed the missing NDEBUG definition on my command line. Obvious error in retrospect, but tracing that down backwards from the link stage to the preprocessor was pretty nightmarish. It feels like this should be mentioned in documentation somewhere, so other developers don't have to go through that ordeal. Maybe this comment will help someone directly or serve as a starting point for this information to be more visible to library users.
my 2 cents for @EddyXorb and @dvyukov
In or-tools we always set /DNDEBUG
when detecting MSVC in CMake, shame on me...
ref: https://github.com/google/or-tools/blob/cb82f52e52dcc58978c1b69e8c0c5422f381986e/cmake/cpp.cmake#L85
I guess here we should use a generator expression to set it only in Release* config or simply not define it ourself aka see next point...
CMake already define the /DNDEBUG
for MSVC for RELEASE
/RELWITHDEBINFO
/MINSIZEREL
mode on Windows:
ref: https://github.com/Kitware/CMake/blob/0f88f57389fd636d76b7f726ac94933476d6c195/Modules/Platform/Windows-MSVC.cmake#L489-L491
note: There is also a Modules/Compiler/MSVC.cmake
to help us to not found it.
note: there is also a -DNDEBUG
for clang on windows...
Not sure if this is the same issue, but I'm also seeing something when building certain targets with CMake:
abseil-cpp-src/absl/synchronization/mutex.h:541:8: error: expected unqualified-id before ‘)’ token
541 | void Dtor();
| ^~~~
abseil-cpp-src/absl/synchronization/mutex.h:1070:26: error: expected primary-expression before ‘)’ token
1070 | inline Mutex::~Mutex() { Dtor(); }
Unfortunately the workaround of not inlining Dtor didn't seem to work for me. Interesting this only appears to happen with a few targets, namely our Catch2 unit tests.
Describe the issue
While https://github.com/conda-forge/grpc-cpp-feedstock/pull/348 grpc with
-DgRPC_ABSL_PROVIDER="package"
in conda-forge against abseil 20240116.0, the sanity check of compiling an example against the library fails:Previously I had opened #1614 and https://github.com/grpc/grpc/issues/35854, but it now looks like this might be related to https://github.com/abseil/abseil-cpp/commit/f3760b4d3b2773d1cb8e9ddda29dc9bce386d540. In particular, the following looks like a likely culprit https://github.com/abseil/abseil-cpp/blob/119e0d3f74733aff2999d39cb8be99ddcb081c66/absl/synchronization/mutex.cc#L734-L739
I'm not sure what grpc does that (apparently) invalidates the assumptions explained in the commit message of that change.
Steps to reproduce the problem
Build https://github.com/grpc/grpc with
-DgRPC_ABSL_PROVIDER="package"
against abseil 20240116.0 and then compile the grpc helloworld example. It would also be possible to replay the recipe we have in conda-forge.What version of Abseil are you using?
20240116.0
What operating system and version are you using?
Linux
What compiler and version are you using?
GCC 12
What build system are you using?
CMake
Additional context
No response