mesonbuild / meson

The Meson Build System
http://mesonbuild.com
Apache License 2.0
5.3k stars 1.52k forks source link

doc/Threads: avoid `dependency('threads')` with MinGW + Win32 threading #13124

Open bgilbert opened 3 weeks ago

bgilbert commented 3 weeks ago

On MinGW, dependency('threads') assumes the caller wants POSIX thread primitives instead of Win32 primitives. This makes sense because the latter doesn't require any additional action, while the former needs a winpthreads dependency. However, some libraries use Win32 threading on Windows but add dependency('threads') anyway, which is undesirable in a build that wants to avoid shipping winpthreads. Recommend skipping dependency('threads') on Windows if POSIX threading is not needed.

bgilbert commented 3 weeks ago

This is a somewhat obscure footgun. MinGW GCC can be compiled with either a posix or win32 threading model (x86_64-w64-mingw32-gcc -v 2>&1 | grep 'Thread model'), which affects the implementation of compiler intrinsics like __thread. Many MinGW toolchains use posix because win32 doesn't fully support C++'s std::thread, and in the posix case, any use of thread intrinsics will pull in winpthreads anyway. So, in many cases, the overlinking caused by a stray dependency('threads') may not be noticed.

It seems worth writing down though. I've ended up PRing three different dependencies to fix this issue for my use case, and it'd be nice to have some canonical upstream docs.

dcbaker commented 3 weeks ago

I would much rather just fix 'threads' do the the right thing, including being able to say that you want win32 threads instead of pthreads. I had a patch at one point to be able to query whether the threads you got was pthreads compatible or win32 threads compatible, but it got bikeshedded to death

jpakkane commented 3 weeks ago

But is it even knowable? If you can have both win threads and posix ones in the compiler toolchain then you need to be able to say something like dependency('threads', use_winthreads: true)?

dcbaker commented 3 weeks ago

yeah, or dependency('threads', api : 'posix' | 'win32' | None) where None means, I like that a little better, but I could be persuaded)

The only places i can think of where you can run into issues is with winpthreads, either as a standalone library or as part of msys? In those cases you'd have to have either a keyword or use some sort of introspection method like dependency('threads').get_variable('model') to introspect whether it found a win32 or posix thread implementation.

I suspect @lazka either has a better idea of what kind of pain we'd be taking on by trying to fix this, or know someone who does.

bgilbert commented 3 weeks ago

To back up a bit, a program will acquire a winpthreads dependency for one/both of two reasons, which are otherwise orthogonal:

  1. If MinGW uses the posix compiler thread model, it will add the winpthreads dep if the program uses thread intrinsics.
  2. The program may explicitly call pthread APIs, in which case it needs winpthreads.

For case 1, Meson could provide a way to detect the thread model (but can't know whether the program actually uses thread intrinsics), but that seems out of scope: while the person invoking Meson might want to avoid the winpthreads dep, the program itself is unlikely to care in any way.

For case 2, the program knows which threading API it wants, and should specify it. Querying afterward doesn't make sense: "Meson, please pick an arbitrary thread API, and then I'll decide whether I like it" isn't the usual UX. Rather than adding special syntax as @dcbaker proposes, maybe we should define two new threads-posix and threads-win32 dependencies, and have threads throw a deprecation warning?

dcbaker commented 3 weeks ago

I'd like to avoid having to do things like:

if host_machine.system() in ['linux', 'freebsd', 'openbsd', 'netbsd', ...]
  threads = dependency('threads-posix')
else
  threads = dependency('threads-win32')
endif

In c++ especially it's common to not care, you're going to use <threads> anyway, so it doesn't matter what the backing implementation is.

I know of at least one other case where the project provides it's own level of abstraction, and does it through #define tricks that have to be special cased in weird ways for mingw

bgilbert commented 3 weeks ago

I was thinking more like

thread_dep = dependency('threads-win32', required: false)
if not thread_dep.found()
  thread_dep = dependency('threads-posix')
endif

and of course, only for programs that support both thread APIs.

In c++ especially it's common to not care, you're going to use <threads> anyway, so it doesn't matter what the backing implementation is.

Ah, good point. I don't think it's possible to implement that generically though, without knowing something about the underlying implementation. For example, if the MinGW win32 model ever gains support for <threads>, we'd need to detect the compiler's threading model to know whether to add -pthread when linking. Maybe this case is better handled with an explicit threads-c++ dependency type? The contract of the existing threads dependency is pretty ambiguous.

I know of at least one other case where the project provides it's own level of abstraction, and does it through #define tricks that have to be special cased in weird ways for mingw

Yup. The projects that support both Win32 and pthreads are the ones I'm especially trying to improve here.

dcbaker commented 3 weeks ago

Yup. The projects that support both Win32 and pthreads are the ones I'm especially trying to improve here.

And I want to do this without degrading the current situation where dependency('threads') nicely handles what you want, by requiring users to copy and paste a pile of code. The reason that we added threads in the first place was to have a portable, abstracted, way to ask for threading support instead of cc.find_library('pthreads'), it feels like going in reverse to do away with the abstraction and require in the common cases for people to check this manually. I'd much prefer to keep the common cases working, and provide specific tools for projects that need it to make more fine tuned decisions.

bgilbert commented 3 weeks ago

And I want to do this without degrading the current situation where dependency('threads') nicely handles what you want, by requiring users to copy and paste a pile of code.

That's kinda the situation now, though. I copypasted dependency('threads') into my multithreaded projects because the docs implied that I should, but it's completely unclear what that actually does.

For projects sophisticated enough to support multiple threading APIs, I think four legible lines of dependency checks are not unreasonable. The other projects will use whatever threading API they use, and those would just have dependency('threads-posix') or dependency('threads-c++') or whatever.

If there's strong opposition to deprecating dependency('threads'), we should at least document what it actually does. I think its semantics are pretty hard to justify when actually written out, and it doesn't seem possible to fix them in-place without breaking compatibility.

The reason that we added threads in the first place was to have a portable, abstracted, way to ask for threading support

As I laid out above, this isn't actually possible without knowing what threading API the application intends to use.

mstorsjo commented 3 weeks ago

This is a somewhat obscure footgun. MinGW GCC can be compiled with either a posix or win32 threading model (x86_64-w64-mingw32-gcc -v 2>&1 | grep 'Thread model'), which affects the implementation of compiler intrinsics like __thread.

Can you elaborate on this? Thread local variables aren't really implemented with either of them, as far as I know; with GCC they're implemented with "emulated TLS".

I tested the output of

extern __thread int var;
int get(void) {
    return var;
}

And it's identical between x86_64-w64-mingw32-gcc-posix and x86_64-w64-mingw32-gcc-win32 in e.g. Ubuntu 22.04.

As far as I know, the choice between the thread models does not affect code generation in any way, it just changes which backend implementation is used in libgcc and libstdc++.

Many MinGW toolchains use posix because win32 doesn't fully support C++'s std::thread,

There's a recent change on this front, though - since GCC 13, std::thread should work on top of plain win32 threads now as well - since https://github.com/gcc-mirror/gcc/commit/9149a5b7e0a66b7b94d5b7db3194a975d18dea2f, as long as it is built with _WIN32_WINNT >= 0x0600.

and in the posix case, any use of thread intrinsics will pull in winpthreads anyway.

As noted, I don't think __thread or other thread local stuff should play a role here. However, when configured with posix threads, GCC always automatically tries to link against winpthreads in any case (although e.g. the DLL only gets linked in if it actually is referenced).

Also - for mingw toolchains with Clang, there's no similar big global switch for "thread model". clang --version does print Thread model: posix, but that's not connected to anything (I think it always just prints that value). When using clang with libstdc++, one has to link with whatever libstdc++ was built with (and enable emulated TLS; e.g. the msys2 clang distributions that use libstdc++ have these defaults hardcoded correctly). When using clang with libc++, libc++ usually defaults to plain win32 API for threads (with full support for std::thread etc), but it can also optionally be built on top of winpthreads (although I think that's kinda rare to do).

lhmouse commented 3 weeks ago

I think that dependency('threads') should be no-op on Windows either way, as no library appears to be necessary for threading.

GCC with posix thread model links pthread by default unless -no-pthread is specified. ^1

bgilbert commented 3 weeks ago

@mstorsjo

Can you elaborate on this? Thread local variables aren't really implemented with either of them, as far as I know; with GCC they're implemented with "emulated TLS". [...] As far as I know, the choice between the thread models does not affect code generation in any way, it just changes which backend implementation is used in libgcc and libstdc++.

Ah, yeah, good catch. So under posix, the libgcc dependency could pull in winpthreads even if intrinsics are not used.

since GCC 13, std::thread should work on top of plain win32 threads now as well - since gcc-mirror/gcc@9149a5b, as long as it is built with _WIN32_WINNT >= 0x0600.

Nice, that's really good news!

The additional info about thread models is helpful, thanks. For clarity, I had mentioned the posix thread model mainly as an additional reason a build might depend on winpthreads, thus masking the ambiguity of dependency('threads') in a way that the win32 model does not. But, AFAICT, neither thread model directly imposes requirements on a revised design for dependency('threads').


@lhmouse

I think that dependency('threads') should be no-op on Windows either way, as no library appears to be necessary for threading.

If the application was developed for Linux, dependency('threads') probably means "we want to use pthreads". If the compiler uses the win32 thread model, pthreads will not be linked in unless Meson does so explicitly.

lhmouse commented 3 weeks ago

@bgilbert

If the application was developed for Linux, dependency('threads') probably means "we want to use pthreads". If the compiler uses the win32 thread model, pthreads will not be linked in unless Meson does so explicitly.

Our conclusion (with @mstorsjo) is probably that dependency('threads') should be asking for a dependency for whatever sufficient for the standard library on a given platform. There is no such dependency on Windows, not even with GCC with the posix thread model. If a user would like to use pthread APIs, they will have to specify dependency('pthread') instead.

bgilbert commented 3 weeks ago

Our conclusion (with @mstorsjo) is probably that dependency('threads') should be asking for a dependency for whatever sufficient for the standard library on a given platform. There is no such dependency on Windows, not even with GCC with the posix thread model. If a user would like to use pthread APIs, they will have to specify dependency('pthread') instead.

To be clear, dependency('pthread') currently does not exist, and would need to be added to Meson. But why is it useful to ask for "the standard library on a given platform"? That would mean the application is declaring support for every default threading API that exists on any current or future platform.

lhmouse commented 3 weeks ago

@bgilbert

To be clear, dependency('pthread') currently does not exist, and would need to be added to Meson.

Yes, it's only a suggestion. I think it should not be confused with pthread.

But why is it useful to ask for "the standard library on a given platform"? That would mean the application is declaring support for every default threading API that exists on any current or future platform.

Notably, std::thread, std::mutex etc. require only a subset of pthread. Requesting threading support does not involve the other facilities, such as pthread_kill(), pthread_exit(), pthread_cancel(), and so on.

dcbaker commented 3 weeks ago

I'm not opposed to having a way to specify that you want additional support beyond the basics of C++ Threads or C11 threads, like full pthreads. My preference would be to extend the threads dependency with a keyword rather than adding additional dependency types, since right now all that threads does under the hood is add -pthread on *Nix and nothing on Win32.

I'm not sure how to proceed with actual implementation changes those.

TheOneric commented 3 weeks ago

I was lead here while trying to figure out which threading API(s) dependecy('threads') looks for since current docs are silent on this. More clarity about this as added by this PR would be welcome and imho makes sense right now regardless of whether threads should be extended in the future.


But on the risk of derailing this documentation PR further into off-topic territory: The discussion about ideal dependecy('thread') behaviour so far was mostly focussed on POSIX and Win32 threading APIs, with some mention of ISO C11 threads. I’ll add there’s also illumos/Solaris threading (sched.h and thread.h).

Having just one dep to enable whatever threading API without knowing which doesn't make much sense to me as implementation code will differ depending on the chosen API. EIther the returned dep object needs to indicate what API was selected, or for consistency with older version, keep only looking for pthread if no flavour arg was passed. At the same time a program may support more than one threding API and i think keywords would allow a cleaner check for this like:

has_threads = false
foreach thread_flavour : ['iso', 'win32', 'pthread', 'solaris']
  tdep = dependency('thread', flavour: thread_flavour, required: false)
  if tdep.found()
    deps += tdep
    has_threads = true
    conf.set('CONFIG_THREADS', 1)
    conf.set('CONFIG_THREADS_@1@'.format(thread_flavour.to_upper()), 1)
    break
  endif
endforeach