Open o11c opened 8 years ago
@o11c This all seems bad for performance. Is there a C++ equivalent to the simple, no threading required, python signal handlers? Or maybe we can handle signals only in `tqdm.update()`. We also don't want to interfere with any existing `SIGALRM`s.
For the typical case (`SIGALRM` or disabled auto-updates), no threads will be created by tqdm.
All that we're doing is allowing the user to call the library from any thread that they create.
As far as not interfering - we do have a choice of a couple different signals, but that's why it's an option anyway.
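To make the no-thread case concrete, here is a minimal sketch, assuming a POSIX system. The handler only sets a lock-free atomic flag, and the previous `SIGALRM` disposition is saved and restored so that an existing handler is not lost. Every name in it (`sketch::install`, `sketch::consume_refresh`, and so on) is hypothetical and not an existing tqdm API.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#include <signal.h>    // sigaction (POSIX)
#include <sys/time.h>  // setitimer (POSIX)

// Hypothetical names throughout; none of this is an existing tqdm API.
namespace sketch {

// The flag must be lock-free to be usable from a signal handler.
static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic<bool> must be lock-free");

// Set by the handler, consumed by the next update() call.
std::atomic<bool> refresh_pending{false};

// Whatever disposition was installed before ours, so it can be restored.
struct sigaction previous_action;

extern "C" void on_alarm(int) {
    // Only async-signal-safe work here: one store to a lock-free atomic.
    refresh_pending.store(true, std::memory_order_relaxed);
}

// Install the handler and a repeating real-time timer. No thread is created.
bool install(long interval_usec) {
    assert(interval_usec > 0);
    struct sigaction sa = {};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  // do not make the user's blocking syscalls fail
    if (sigaction(SIGALRM, &sa, &previous_action) != 0) return false;

    itimerval timer = {};
    timer.it_interval.tv_sec = interval_usec / 1000000;
    timer.it_interval.tv_usec = interval_usec % 1000000;
    timer.it_value = timer.it_interval;
    return setitimer(ITIMER_REAL, &timer, nullptr) == 0;
}

// Stop the timer and put the previous SIGALRM disposition back.
void uninstall() {
    itimerval off = {};
    setitimer(ITIMER_REAL, &off, nullptr);
    sigaction(SIGALRM, &previous_action, nullptr);
}

// Meant to be called from update(): true at most once per timer tick.
bool consume_refresh() {
    // Cheap relaxed load first, so the serial hot path avoids the exchange.
    if (!refresh_pending.load(std::memory_order_relaxed)) return false;
    return refresh_pending.exchange(false, std::memory_order_relaxed);
}

}  // namespace sketch
```

In this sketch the cost added to a serial `update()` is a single relaxed load when no tick has arrived. It still takes over `ITIMER_REAL`, so a program that already uses that timer would need one of the other signals mentioned above.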
@o11c would it result in a slowdown for cases where nobody uses threads? (also: misclicked there)
@o11c We don't want to sacrifice serial speed for parallel speed - 90% of use cases will be serial.
this is interesting: https://github.com/mkj/dropbear/blob/master/progressmeter.c
In order to allow updates without user intervention, it is critical to allow `refresh` within a signal handler. It would also be quite beneficial to allow all operations from multiple threads.

This is quite feasible if you're a little careful. The major limitations of signal handlers are that you're not allowed to call `malloc`, and you're not allowed to access any variables not marked `volatile`. Since we want threads, we might as well upgrade to `std::atomic`, which has a nice API, too.

Some operations need `malloc`, and thus must be done outside the signal handler:

- creating `tqdm` instances (or destroying them).
- creating `tqdm_sink` instances (I believe we want explicit control rather than recalculating from the `file=` stuff - and we definitely want separate `_instances` lists).

(On the other hand, we could do all memory allocation using `mmap`, which is allowed in a signal handler ...)

Operations that can be done in the signal handler:

- `tqdm` instance `refresh` (including after `update`)
- `write`ing to a `tqdm_sink` (since this is just a `refresh`) - I haven't reasoned out all the collision logic yet for this case.

Each `tqdm_sink` contains an intrusive (to prevent extra memory allocations) doubly-linked (to allow removal - or perhaps we could use the indirect-pointer trick and singly-link it? I'm actually not sure how much more effort is needed with the atomics part here - do we need spinlocks (which are problematic in signal handlers since the thread is no longer executing)?) list of active `tqdm` instances. These are added using the standard atomic CAS trick in a loop. We should also allow adding other sorts of permanent lines.

If output is to a `pipe(2)`, then writes of under `PIPE_BUF` bytes (at least 512 - plenty for a typical terminal line) are guaranteed to be atomic. If we get `EAGAIN`, we just flag the current `tqdm` as dirty (replacing the Python implementation's no-implicit-refresh-within-N-seconds logic) and move on. If output is not to a pipe (or if lines are super long), we aren't guaranteed this behavior, but we should optimistically assume it will be so and fall back to a loop if we get a partial write. We will necessarily assume that no one else is writing to our fd.

If output is to a terminal or a TCP socket, we can use the `TIOCOUTQ` ioctl to check if the buffer is getting full or not. This might allow even better decisions about whether to attempt a write or just mark it as dirty and wait until the next (`SIGALRM` or threaded) timed refresh.

Obviously, use of many of these features should be controllable by the user on a per-`tqdm_sink` basis - and probably also fully-disablable via macros. If we need C++98 compatibility, we'll have to use boost for atomics, threads, etc. We might also want to allow integration in someone else's event loop, but I haven't thought out all the requirements and implications.
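
The "standard atomic CAS trick in a loop" for adding instances to a sink's list could look roughly like the following. This is a sketch under simplifying assumptions: it is singly-linked and push-only, so it sidesteps the open removal question entirely, and `Bar` and `Sink` are hypothetical stand-ins for `tqdm` and `tqdm_sink`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for tqdm / tqdm_sink; not an existing API.
namespace sketch {

struct Bar {
    std::atomic<long> count{0};
    // Intrusive link: lives inside the node, so adding needs no allocation.
    // Written only before the node is published, read-only afterwards.
    Bar* next = nullptr;
};

struct Sink {
    std::atomic<Bar*> head{nullptr};

    // Lock-free push to the front of the list.
    void add(Bar* bar) {
        assert(bar != nullptr);
        Bar* old_head = head.load(std::memory_order_relaxed);
        do {
            // Link to the head we expect. If the CAS fails, old_head has
            // been reloaded with the current head and we relink and retry.
            bar->next = old_head;
        } while (!head.compare_exchange_weak(old_head, bar,
                                             std::memory_order_release,
                                             std::memory_order_relaxed));
    }

    // Traversal as a refresh would do it. It takes no locks, so a signal
    // handler that interrupts add() sees either the old or the new head.
    template <class F>
    void for_each(F f) const {
        for (Bar* b = head.load(std::memory_order_acquire); b != nullptr; b = b->next) {
            f(*b);
        }
    }
};

}  // namespace sketch
```

Because nodes are never unlinked here, the sketch avoids the ABA and lifetime problems that removal would bring. Supporting removal, whether doubly-linked or via the indirect-pointer trick, is exactly the part that remains unresolved above.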
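
The write-or-mark-dirty behavior described for pipes might be sketched as below, assuming a POSIX system and a non-blocking fd. The function names and the `dirty` flag are hypothetical. The sketch loops on partial writes and gives up on `EAGAIN`, and it does not try to handle an `EAGAIN` that arrives after part of a line has already been written.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>   // fcntl, O_NONBLOCK (POSIX)
#include <unistd.h>  // write (POSIX)

// Hypothetical helpers; not an existing tqdm API.
namespace sketch {

// EAGAIN is only reported for fds in non-blocking mode.
bool make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Try to emit one rendered line. Returns true if all of it was written.
// Otherwise sets `dirty` so that a later timed refresh can retry.
// Uses only write(2) and a lock-free atomic store.
bool write_line(int fd, const char* buf, std::size_t len, std::atomic<bool>& dirty) {
    assert(buf != nullptr);
    const int saved_errno = errno;  // a handler must not clobber errno
    std::size_t done = 0;
    bool ok = true;
    while (done < len) {
        const ssize_t n = ::write(fd, buf + done, len - done);
        if (n > 0) {
            done += static_cast<std::size_t>(n);  // partial write: keep looping
        } else if (n < 0 && errno == EINTR) {
            continue;  // interrupted before writing anything: retry
        } else {
            ok = false;  // EAGAIN (buffer full) or a hard error: give up
            break;
        }
    }
    if (!ok) dirty.store(true, std::memory_order_relaxed);
    errno = saved_errno;
    return ok;
}

}  // namespace sketch
```

For a pipe and a line shorter than `PIPE_BUF`, the loop body should run once: the write either goes through whole or fails with `EAGAIN`.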