Closed x42 closed 8 years ago
As an audio engineer who relies on this stuff in my work, I'll add my voice to the request to consider this issue. It's been causing serious problems here.
I'm not opposed to this. There's no real performance reason not to put a mutex lock around the planner etc. as far as I can tell.
We might try to be bit careful with library dependencies, because I don't want libfftw3 to be suddenly dependent on libpthread. But that is manageable: we can just have This would not really solve the problem in general, unfortunately.fftw_init_threads
install hook functions pointers (initially NULL
) in the planner for locking/unlocking a mutex.
Probably we should just bite the bullet and add a libpthread
dependency to libfftw3
(on Unix). For the use-case here, it seems like you need thread-safety regardless of whether the threaded FFTW is being used.
(When FFTW was created back in 1997, there were still a wide variety of threading libraries that we needed to support. pthreads, Solaris threads, Mach threads.... But now life is simpler because there are only two that matter, pthreads and Windows threads, so adding a dependency on one is not so bad.)
On Unix, if we add a libpthread dependency we can use a static mutex initializer. On Windows, it looks like there is a hack to accomplish similar functionality.
There's already pthread/posix and Windows specific mutex code in threads/threads.c
- so no new dependency is needed, is it?
@x42, the trick here is that we need a single global mutex lock on the global planner state, and that global mutex lock needs to get initialized exactly once, in a thread-safe way, regardless of how many threads are linking to FFTW. That's why you need to use a static initializer or an emulation thereof.
Also, the threads/threads.c
functions are only used in libfftw3_threads
, not in libfftw3
. The former currently has a libpthread dependency (on Unix), but not the latter.
OK. so the equivalent to PTHREAD_MUTEX_INITIALIZER
for windows (and other non posix OS).
Fortunately, the only non-POSIX system that matters these days is Windows. (And of course our autoconf script can just disable the mutex on non-POSIX non-Windows systems, if configure
runs at all.)
The unfortunate thing here is that our Unix linking instructions, which haven't changed in over a decade, will now need to be changed: programs will need to link -lfftw3 -lm
+ whatever is needed to link pthreads. Linking pthreads used to be insanely complicated and non-portable, but maybe everything just uses -pthread
now. (Or rather, all the other Unices have mostly died off.)
Still, it will potentially break a lot of people's Makefile
s. (Although as long as they are linking the shared-library version of FFTW, I guess the shared-library dependency mechanism on GNU/Linux will take care of it silently for them. People who compile libfftw3.a
and link it statically are in for a surprise, though.)
..no real performance reason..
indeed. the overhead of locking/unlocking a mutex is negligible.
..Unix linking instructions..
-pthead
is a compiler/linker flag understood by almost all compilers these days, but yes: LD_FLAGS
will have to change. Alternatively a --no-thread-safe-planner
configure option could be added.
It's not so negligible that I would want locking/unlocking in an inner loop. Moreover, we are talking about a single global lock here, which forces serialization, so you wouldn't want this for any performance-critical function like plan execution. But none of the functions that need locking here are inner-loop functions (plan execution is already documented to be thread-safe), so it doesn't matter.
We should certainly have a --disable-threadsafety
flag as an option, but it will still require people to change their builds if they want to use it; most users would be better off just adding -pthread
to their makefile.
in that case, just adding it to fftw.pc.in
may do the trick
Partial(?) list of functions that need locking, for future reference:
the_planner
and cleanup
in api/the-planner.c
, and every function (see api/*.c
and */api.c
) that calls the_planner
.twlist
hash table).valgrind
instead.rader_tl_find
: this function is thread-safe, but it is typically called on global twiddle caches. In particular, dft/rader.c:mkomega/free_omega and rdft/dht-rader.c:mkomega/free_omega.threads.c
functions already have the necessary locking (WITH_QUEUE_LOCK
).@x42, we will certainly add it to fftw.pc, but not everyone uses pkgconfig
.
Would something like this solve the problem:
This "solution" works as long as any thread that depends upon a thread-safe planner calls init_threads() first.
I guess the question is whether it is feasible to expect FFTW-using audio plugins (and similar) to link -lfftw3_threads
and call fftw_init_threads()
. It doesn't seem crazy to me, but I don't know the audio-plugin space.
What is the problem with libfftw becoming dependent on pthread on unices that have pthread by default?
The only downside I see is that some single-threaded applications would have to add -pthread
(or -lpthead
) to their link flags and most won't notice nor care if it's picked up via pkg-config or autoconf et al anyway.
As for the implementation: How about adding abstract fftw_mutex_un/lock()
methods to libfftw itself and calling those at every planner entry and exit point? On GNU/Linux and OSX (and other unices which have pthread by default) fftw_mutex_un/lock()
will simply expand to pthread calls at compile-time, on Windows it'll call InterlockedCompareExchangePointer
etc. On small embedded or other systems it would may become a noop (or we can add a spinlock or a custom mutex implementation).
If pthread by default is a problem. This approach would also lend itself to create libfftw3
, libfftw3f
, libfftw_thread
, libfftw3f_thread
. All a plugin author would have to do is change the lib-name to link against (no code changes needed).
-=-
A solution that does not require plugin code to change is preferable.
There are a lot more plugins than hosts and plugins are killing us already ("Look Ma, I made a new plugin.." crash). Host authors are usually more experienced and conscientious. The issue at hand is rarely caught by plugin-authors, since it only manifests when multiple instances are run or different plugins are combined in a host.
However, if fftw3_threads
and fftw3
are ABI compatible and every planner call does preemptively call init_threads()
(inside fftw3_threads
) it would still work for cases where the plugin-author takes no action: On Linux and OSX: the host links against it plugins inherit its symbols. On Windows it depends (linking vs GetProcAddress()
- but that's a different issue).
The problem is potentially requiring tens of thousands of people to change their Makefiles, which is not a step we take lightly. But I'm also very sympathetic to your argument that plugin authors may not be sophisticated, and FFTW should be safe by default if possible. (At least failing to include -pthread
will lead to a "noisy" failure.)
Would something like this work:
Yes, that sounds like a plan and an elegant solution.
Theoretically that would also allow for libfftw to set those callbacks itself for some platforms, one day.
It does not define whose responsible it is to set those functions. In case the plugin-host itself does not use fftw, but only independent plugins (which are loaded dynamically) use it, that could still fall over, but in the context of pro-audio, the host should link against libfftw just to set the callbacks preemptively.
A suggestion for the API: it should be possible to set both hooks atomically (just in case two independent threads try at the same time). Besides, it does not really make sense to install only one hook.
typedef void (*FFTWPlannerLockCallback)(void *arg);
/**
* set before_planner_hook and after_planner_hook callback
* functions if they were not previously set.
*
* @param b callback function to be called before starting a fftw plan or NULL to unset.
* @param a callback function to be called after completing a fftw plan or NULL to unset.
* @param replace if nonzero, set callbacks even if hooks are already defined.
* @return 0 if hooks were installed, otherwise error-code
*/
int fftw_set_planner_hooks (
FFTWPlannerLockCallback b,
FFTWPlannerLockCallback a,
int replace,
void *arg
);
ping!
any news on this?
See 113e1086966fdff4c172672753cc880e6bc74d3d
I am still missing the arg parameter to the two hooks. Do we want one or two args?
Has there been any more progress on this?
Hi All,
A ping on this issue - this issue is discussed periodically on IRC in various linux-audio related channels, and would be really nice to have fixed.
I'm currently looking towards using a different library for FFT for an upcoming project due to this issue possibly causing crashes in software designed for on-stage live performance.
The solution proposed with both callbacks set in one function looks a good fix to me. Cheers, -Harry
I'll add my voice to the choir! :)
WRT a simple global lock - there's no need for pthread or anything heavy. A simple by-the-book lock should be enough (... I think). GCC defines intrinsics for atomic operations (and they are available on most architectures), so it's quite trivial to implement a ticket lock on a pair of statically-initialized to 0 integer. I've pushed a trivial example on https://github.com/rdolbeau/fftw3/tree/mutualexclusion. Just add "--enable-mutualexclusionplan" to configure, and the planner should be "thread safe" (as in, serialized :-). Beware: active wait, and not really tested at all.
Current status on this issue:
The next fftw-3.3.5 will support fftw_set_planner_hooks(before, after, arg). Before planning, fftw calls before(arg), and after planning it calls after(arg).
fftw_set_planner_hooks() is not thread safe; the user is responsible to ensure that it is called once. @rdolbeau : it is a bad idea to add a gcc dependency (__sync* intrinsics) so deep in FFTW---whatever the API we end up with, we must support it everywhere (including e.g. Windows without pthreads).
Please offer feedback on whether or not this solves the problem. If it does I'll document it in the manual, and if it does not I will remove the hook completely.
@matteo-frigo The __sync* intrinsics are supported by ICC and CLANG as well. For windows, there is a similar mechanism (InterlockedAdd ?). Basically, you only need a way to do "x = * y; (* y)++" atomically to implement a ticket lock. And it doesn't require any library support beyond the compiler itself, so it avoids a dependency on pthread. Hooks can work, but require application-level support.
A hook for locking offers a nice flexible API, but that requires each and every application to explicitly use it. Also in the Host+Plugins case it can quickly end up being a mess: Who is responsible to register the hooks? (e.g. A host is not using fftw, but some plugins loaded into the host do. Plugins can be un/reloaded.)
I'm all for hooks, but they should by default point to an internal simple lock (a ticket spinlock is fine. I like Romain's implementation.)
The reasoning is as follows:
The spinlock implementation is a bad idea for at least two reasons.
First, the planner may take minutes, and the last thing you want to do is spin while the planner is running.
Second, we aim to be much more portable than "icc" and "clang". Look at kernel/cycle.h for a partial list of systems where the lock hack is expected to work.
The current status is that the fftw core (e.g., non-simd) requires ANSI-C and nothing else (no threads, no c99, no nothing), and the FFTW API works in the same way in all systems. It will take an extraordinary amount of evidence to convince us to break this property.
How about this.
To the threaded api, we add a call fftw_make_planner_thread_safe() (or something) that atomically installs the appropriate hooks, and is guaranteed to be atomic and idempotent.
Plugins that care about thread safety will have to call fftw_make_planner_thread_safe(). The fftw core does not depend upon pthreads.
If we go down this path, I will remove the hooks for lack of a use case.
Why is there global state in FFTW at all? By that I mean, why doesn't the interaction with FFTW begin by allocating some kind of context structure which is passed to all the other FFTW functions and contains the "global" state. This would avoid any need for behind the scenes locks.
The problem is not to argue what made or didn't make sense in 1997 when fftw started. The problem is to decide how to move forward productively from where we are.
On Mon, May 25, 2015 at 5:59 PM, Davis E. King notifications@github.com wrote:
Why is there global state in FFTW at all? By that I mean, why doesn't the interaction with FFTW begin by allocating some kind of context structure which is passed to all the other FFTW functions and contains the "global" state. This would avoid any need for behind the scenes locks.
— Reply to this email directly or view it on GitHub https://github.com/FFTW/fftw3/issues/16#issuecomment-105324764.
Obviously don't break the existing API. But, for example, why not add a version of fftw_plan_dft_1d() that doesn't have any global state? Call it fftw_plan_dft_1d_r() and make it either take an additional context object or make it simply allocate and deallocate whatever it needs internally.
fftw_make_planner_thread_safe()
is so far the best option on the table.
It still requires application level support but that may be an acceptable compromise.
Why can't this be done by default if fftw3 is compiled with --enable-threads?
@davisking fftw_plan_dft_1d
and fftwf_plan_r2r_1d
and fftwf_plan_dft_r2c_1d
and probably a few more... special casing a single plan is not a solution.
A similar workaround is just copy all of fftw into your project (the global state won't be shared with other projects that use system-wide libfftw.so, you only have to worry about multiple instances of your own project which is trivial given that you have control over the fftw configuration as part of your project). Then again almost all gnu/linux distros refuse to package a project like that and instead ask to get things upstream..
Yes, I mean do it for all the planners. Everything else we are talking about are a bunch of hacks to work around the problem imposed by a lack of reentrant planning functions in the API. As you and others have pointed out, the other solution involves encouraging fftw users to do things they will either refuse to do or will complicate their build systems and likely lead to package managers (or other downstream users) accidentally building things without the appropriate thread safety switches turned on.
Is it really that hard to push the global variables used by the planners into an argument? I haven't looked at FFTW's code but I can't imagine it's a huge deal to refactor it into that form. Then you implement the existing non-thread-safe APIs in terms of the reentrant APIs to avoid breaking backwards compatibility.
@matteo-frigo Does the problem also exist if fftw is built without --enable-threads
but an application calls fftwplan* from different threads?
@x42 (et al.):
Whether this is a "problem" or not depends on your perspective.
FFTW is meant primarily to be called from one thread and use threads internally to speed up the computation. This is a common mode of operation of numerical libraries: you write your sequential program that calls fftw/atlas/whatever, and the numerical library handles the parallelism. The core of FFTW consists of one library that does not depend on pthreads (e.g., because the user may want to use openmp instead of pthreads, or not want threads at all). --enable-threads produces another library libfftw3_threads that exports a function fftw_init_threads() that installs the threaded solvers into the fftw core. Such a library is optional; --enable-openmp installs openmp solvers instead (and other awful variants exists; for example support for the Cell processor adds solvers that use the coprocessors).
Thus, the intended usage of fftw is the opposite of what the commenters on this ticket have in mind, that is, calling fftw from multiple threads and run fftw independently in each thread.
The root of the problem is that it is hard to have both setups. Since fftw --enable-threads uses threads internally, the internal threads are a shared resource that is hard to manage if we allow multiple threads to enter the planner at the same time. To put it another way, threads do not compose. Attempts to provide a better abstraction parallelism (such as Cilk or TBB) have not won in the marketplace, so we are stuck with the problems of threads.
I think that the suggestion of passing a planner "object" to all planning function is fundamentally misguided because it attacks the syntax of the problem but not its essence. At its root, threaded fftw has a bunch of global resources (internal threads) that cannot be abstracted away via syntactic sugar. Thus, this suggestion is fundamentally incompatible with --enable-threads. (For the record, this suggestion would not be hard to implement. fftw internally does in fact pass an explicit planner "object" pointer everywhere in the way suggested by davisking. The API layer deliberately uses a global planner precisely not to sweep the real problem under the rug.)
I think that my proposal of fftw_make_planner_thread_safe() is the only practical workaround: it recognizes that threads do not compose, that fftw is set up so that parallelism is inside fftw and not outside, it works around the lack of composability via a lock, and it does not break any existing code. This proposal is currently implemented in git, so folks may want to try it out and see if it works for them.
Nice, I can confirm that fftwf_make_planner_thread_safe()
works and resolves the issue.
However, so far there is no C-header for that function, and -lfftw3f_threads
need to be added manually. It'd be great if pkg-config --libs fftw3f
could take care of that.
The test was performed on debian/stable, fftw3 git
# git describe --tags
fftw-3.3.4-55-g8cd9bfa
# ./configure --enable-single --enable-sse --enable-avx --enable-shared --enable-static --enable-threads
PS. How can one easily check if that function is available at compile time? Is that best done during configuration pkg-config --atleast-version=3.3.5 fftw3f
or will there be a header define in FFTW3 itself?
PS. the test I ran is https://gist.github.com/x42/07391de61b99e872ad9d
without fftwf_make_planner_thread_safe()
there are regular crashes.
and as mentioned previously, after calling fftwf_make_planner_thread_safe()
everything works fine.
Any news on this?
looks like all that's left to do is adding fftwf_make_planner_thread_safe
to the header and -lfftw3f_threads
to the pkg-config libs.
Hi guys I m no programmer and got very few basics skills with linux but I love Ardour and calf plugins. Could someone help me resolve the crash when loading some .ardour saved session? I dont understand how this fftw work..! Thx
What do you mean by "adding to the header"? fftwf_make_planner_thread_safe() is already in the header fftw3.h:
FFTW_EXTERN void X(make_planner_thread_safe)(void);
On Mon, Sep 7, 2015 at 10:55 AM, Robin Gareus notifications@github.com wrote:
Any news on this?
looks like all that's left to do is adding fftwf_make_planner_thread_safe to the header and -lfftw3f_threads to the pkg-config libs.
— Reply to this email directly or view it on GitHub https://github.com/FFTW/fftw3/issues/16#issuecomment-138318123.
indeed, there it is. sorry for the noise. The include works now. -lfftw3f_threads
is still needed though.
Even when configured with --enable-threads
the .pc file only references -lfftw3
(or -lfftw3f
in single mode).
PS. I suppose last time I tested the system-wide fftw3.h
was preferred over the the local version (using PKG_CONFIG_PATH).
We cannot add -lfftw*_threads to the pkg-config file, because this would force all applications to link to the pthreads library. This whole topic was about how to avoid forcing such a dependency.
On Tue, Sep 8, 2015 at 11:29 AM, Robin Gareus notifications@github.com wrote:
indeed, there it is. sorry for the noise. The include works now. -lfftw3f_threads is still needed though. Even when configured with --enable-threads the .pc file only references -lfftw3 (or -lfftw3f in single mode).
PS. I suppose last time I tested the system-wide fftw3.h was preferred over the the local version (using PKG_CONFIG_PATH).
— Reply to this email directly or view it on GitHub https://github.com/FFTW/fftw3/issues/16#issuecomment-138600776.
How do you envisage to solve this? A second pkg-config file fftw3[f]-threaded
?
I'll close this. The original issue is fixed in fftw3-git.
It'd be nice if there'd be an easy way to find out if a given version of fftw in a system has fftw3_threads.
Best solution I can currently think of is a 2nd .pc
pkg-config file, but that's a different story.
Thanks to all involved involved in fixing this issue.
Doesn't fftw_destroy_plan
also need a lock, since it accesses shared state for trig tables?
@stevengj Yes, it's a bug, I'll fix it
http://www.fftw.org/fftw3_doc/Thread-safety.html mentions
I hope I can convince you otherwise.
Problem Description
fftw3 is used by a variety of audio plugins.
Those plugins are loaded into the host's memory-space (usually an audio workstation). The host has limited control of what the plugin does internally, and the plugins do not know about each other.
There is no way to ensure that two independent plugins which are linked against libfftw do not run the shared planner simultaneously. Nor is there a possibility to control this on host application level.
When two independent plugins create fftw plans the application usually segfaults or similar undesired effects manifest.
Possible solutions for this include:
Discussion
The issue at hand is not limited to audio-application, there are likely other applications with similar problems out there (gnu-octave comes to mind, but I don't know for certain).
As the Thread-safety page mentions, it's as simple as
Is there some good reason why libfftw does not do this by default?
Existing applications should not be affected by this (they're not supposed to call the planner from different threads), but that change would make all the difference for multi-threaded plugin hosts.
I suppose it could be a bit of work to wrap all planner entry-points with a semaphore, yet there may be a neat simple solution using
#define
.I'll be happy to look into this, but before going that way, I'd like to ask if such a change would be accepted by fftw or if there is an even better solution planned for future version that will make fftw's planner thread-safe.
yours truly, robin - for the linux-audio community and for himself
Notable audio plugins using fftw3: http://calf.sourceforge.net/ http://factorial.hu/plugins/lv2/ir http://guitarix.sourceforge.net/ http://breakfastquay.com/rubberband/ http://plugin.org.uk/ http://zynaddsubfx.sourceforge.net/ https://github.com/x42/meters.lv2 ...
Notable affected plugin hosts: http://ardour.org/ http://qtractor.sourceforge.net/ https://github.com/falkTX/Carla/ ...
see also https://community.ardour.org/node/8271