Closed hartwork closed 10 months ago
Hi! First of all I'm really grateful for your help and submitting all the PRs fixes etc. However I would like to have a little bit of discussion before merging this. I really do have a personal problem with use of pthreads for ttyplot.
Most of my own use of ttyplot is on legacy/vintage unix systems with real serial terminals / consoles. Some of the systems don't support pthreads. I also wouldn't want to do it in a text mode monitoring app like. It will make it unnecessarily complex compared to the problem at hand. Or I would think. I do understand that there is an undefined behavior in the signal handling. However I have not seen, or experienced this ever causing any issue in the practice.
Longer term, I'm planning to develop a new version of ttyplot in more modern programming language like Go, use termui or something similar. I would rather keep the C version of ttyplot compatible with more vintage unixes.
I also have a proposal for another solution, that I have been considering. I haven't tested it yet. The idea is to have some control character in the input stream to trigger refresh, etc. The character would be pushed to the input stream using ungetc()
from a signal handler. Then the main loop would execute this by receiving it from the input stream. I do understand that this is not async safe, however perhaps we could select/poll on stdin. Let me know if this is even more insane.
AFAIK, slect()
and poll()
are the standard way of monitoring multiple things in parallel within a single thread, so it looks like a reasonable approach to me. Note, however, that waiting for both a file descriptor and a signal with either of these functions creates a race condition: see the section Combining signal and data events of this select tutorial.
The most elegant way of solving this race is IMO to convert the signal to a file descriptor using signalfd()
. This call, however, is Linux-specific, and will likely not be available on your vintage Unixen.
The second best option is to use pselect()
, which was standardized in 2000 by POSIX.1g. Is this available on your systems? Failing that, you may use the old self-pipe trick, which seems still cleaner than ungetc()
. Or you may just accept the race and mitigate the issue with a timeout.
Hi @tenox7 and @edgar-bonet, thanks for your replies!
There are many distinct points in your replies, let me address each one on its own and end with the more positive ones regarding a potential way forward:
To see ttyplot at its worst I recommend: feeding /dev/random
to stdin, frequent manual window resizes, compiling with UBsan and ASan enabled. You'll see crashes inside curses and signed integer overflows sooner or later. I haven't had time for a closer look at the integer overflows yet.
Regarding complexity: signal handling has inherent complexity. If the choice is embracing that complexity or undefined behavior then the only right answer in my book is complexity: undefined behavior and crashing is just not acceptable to me.
Regarding a re-write:
Regarding mixing stone age systems with modern systems in the same software: Can't old systems use an older version of ttyplot and Git master
stops supporting stone age? If ttyplot master
is meant to support stone age systems still today I would personally suggest and ask to make that super visible in the readme. I think if had known that from the beginning, it would have changed everything for me and my contributions. I remember when I found ttyplot mentioned on StackOverflow I was amazed and also puzzled that it didn't exist in Gentoo back then. Regarding legacy again, I remember I rejected a contribution of a plan9 mk build system to upstream Expat and it was the right decision to keep the museum out of today's Expat.
I have tried select
myself before going for pthreads but to my surprise it said that stdin
was always readily readable and so I ditched it as useless earlier for the moment. Maybe I used it wrong, would need a revisit and a closer look. Reading from stdin in a non-blocking way — probably single characters into a buffer to pass that to sscanf
when enough — may be a way to save on threads altogether. To be fair that would also add complexity.
Regarding the self-pipe trick, it would also be an option if writing to a pipe in a signal handler (1) is defined behavior — https://wiki.sei.cmu.edu/confluence/display/c/SIG31-C.+Do+not+access+shared+objects+in+signal+handlers doesn't sound like it — and (2) can be made non-blocking (which I'm not sure how that would work given that pipes need a reader before writing).
I'd be happy about any ideas regarding the select-always-ready and the non-blocking pipe writes.
PS: @tenox7 is there a chance to get #97 reviewed in isolation before this one?
@hartwork wrote:
Regarding the self-pipe trick, it would also be an option if writing to a pipe in a signal handler (1) is defined behavior — https://wiki.sei.cmu.edu/confluence/display/c/SIG31-C.+Do+not+access+shared+objects+in+signal+handlers doesn't sound like it
It looks like defined behavior to me. The pipe is not a shared resource: it is a kernel object. Instead, each end of the pipe (a file descriptor) is an individual resource. The signal handler would be the only code to access the input end of the pipe. Also POSIX says write()
can be safely called from a signal handler.
and (2) can be made non-blocking (which I'm not sure how that would work given that pipes need a reader before writing).
IIRC, a file descriptor can be set to non-blocking mode.
@hartwork wrote: It looks like defined behavior to me. The pipe is not a shared resource: it is a kernel object. Instead, each end of the pipe (a file descriptor) is an individual resource. The signal handler would be the only code to access the input end of the pipe. Also POSIX says
write()
can be safely called from a signal handler.
@edgar-bonet interesting — thanks!
and (2) can be made non-blocking (which I'm not sure how that would work given that pipes need a reader before writing).
IIRC, a file descriptor can be set to non-blocking mode.
Yes! I'll play with that and wait for a reply from @tenox7 on approaches he'd accept, not yet sure in which order, will see.
Now that ttyplot is included in so many distros (thank you all) I don't want to impede development for modern systems. People can use older versions of ttyplot on older OSes. However I still think that pthreads is an overkill for a very simple text mode app like that. I would appreciate if you could try implementing/testing the self-pipe trick. If it works and it's stable, I would prefer it to pthreads. If not, I will accept this. Thank you!
@tenox7 wrote:
I would appreciate if you could try implementing/testing the self-pipe trick.
The nice thing about this trick is that it works on ancient Unices. If, going forward, ttyplot is dropping support of these legacy systems, then I suggest we use pselect()
instead. This function is in POSIX since 2000, and is meant to solve the very same race addressed by the self-pipe trick... in a less hacky way.
I'm at it but it takes more time, stay tuned :smiley:
@tenox7 @edgar-bonet I have replaced this with a version without threads now, even without select, without self-pipes and without any races that I would already know about. You'll see that the complexity that we saved in threads is now in the non-blocking I/O code, because we need to collect the input to sscanf
and that's not trivial, as you'll see. I'll convert this to a draft now just so that it doesn't get merged before we all had a good chance to test and maybe sleep over this approach more also. I'm looking forward to merciless but constructive review, I'm very curious what you think, this has been an interesting ride :smiley:
Looks cool, but could you describe in code comments, or here why is this necessary? I mean finding rows/eding/pushing back data etc? Is it to do with non blocking stdin?
This version never sleeps! At least not until stdin gets closed: it busy-waits for data, keeping one CPU core at 100%.
A pselect()
-based main loop would less wasteful of CPU and energy.
Looks cool, but could you describe in code comments, or here why is this necessary? I mean finding rows/eding/pushing back data etc? Is it to do with non blocking stdin?
@tenox7 the short answer would be that we need to determine how much text to feed to sscanf
some way. I'll need to sleep over this.
This version never sleeps! At least not until stdin gets closed: it busy-waits for data, keeping one CPU core at 100%.
A
pselect()
-based main loop would less wasteful of CPU and energy.
@edgar-bonet that's an important point, will adjust, thanks! :+1:
If that helps, I wrote a toy program to see how we could watch for SIGWINCH
, SIGINT
and stdin
readiness with a pselect()
loop:
That could be the skeleton of the program. In fact, the main loop could almost be used as-is, provided the dummy handle_resize()
and handle_input()
are replaced with the actual useful stuff.
Thanks @edgar-bonet, this looks familar but it's doing blocking reads from stdin once pselect
found a single byte, correct? I wonder if blocking read
and/or blocking scanf
are really lossless despite being interrupted by signals — maybe I'm too pessimistic :smiley: If they are, maybe I can drop the whole non-blocking complexity again. What is your understanding?
@hartwork wrote:
it's doing blocking reads from stdin once
pselect
found a single byte, correct?
This code reads from stdin only when pselect()
says it is ready for reading. According to the select(2) manual (emphasis mine):
A file descriptor is ready for reading if a read operation will not block
Note that read()
will only read as many bytes as are currently available on the pipe. Note also that, if you call read()
twice, the second call will most likely block.
I wonder if blocking
read
and/or blockingscanf
are really lossless despite being interrupted by signals
read()
will not be interrupted: signals are blocked all the time except within the pselect()
call.
Note that you cannot use scanf()
, as this function calls read()
as many times as needed to get the requested data, and only the first call is guaranteed to not block.
maybe I'm too pessimistic 😃
You may fcntl()
stdin into non-blocking mode if you fear pslect()
may fail to honour its contract.
maybe I can drop the whole non-blocking complexity again.
My understanding is that this complexity is mostly a state machine meant to read stdin by small bites, and tell when a data point is ready to be parsed. This would still be useful with a pselect()
approach, precisely because we can only do a single read()
at a time.
Wait! I noticed my example code misbehaves when stdin is closed. I'll go fix that...
I wrote:
I noticed my example code misbehaves when stdin is closed. I'll go fix that...
Fixed by the use of a reading_input
flag. I edited the code in my previous comment.
it's doing blocking reads from stdin once
pselect
found a single byte, correct?This code reads from stdin only when
pselect()
says it is ready for reading. According to the select(2) manual (emphasis mine):A file descriptor is ready for reading if a read operation will not block
Note that
read()
will only read as many bytes as are currently available on the pipe. Note also that, if you callread()
twice, the second call will most likely block.
@edgar-bonet good points!
I wonder if blocking
read
and/or blockingscanf
are really lossless despite being interrupted by signals
read()
will not be interrupted: signals are blocked all the time except within thepselect()
call.
I need to think more about that.
Note that you cannot use
scanf()
, as this function callsread()
as many times as needed to get the requested data, and only the first call is guaranteed to not block.
Good point!
maybe I'm too pessimistic 😃
You may
fcntl()
stdin into non-blocking mode if you fearpslect()
may fail to honour its contract.
Maybe let's trying blocking plus pselect
first.
maybe I can drop the whole non-blocking complexity again.
My understanding is that this complexity is mostly a state machine meant to read stdin by small bites, and tell when a data point is ready to be parsed. This would still be useful with a
pselect()
approach, precisely because we can only do a singleread()
at a time.
Sounds good!
I noticed my example code misbehaves when stdin is closed. I'll go fix that...
Fixed by the use of a
reading_input
flag. I edited the code in my previous comment.
Thanks!
I'll go work on the next iteration…
I have been considering this non-blocking approach since yesterday. Here are some extra thoughts:
You may think of the main loop as an event loop. It is structured like this pseudo-code:
for (;;) {
wait_for_next_event();
if (got_input) handle_input();
if (got_sigwinch) handle_sigwinch();
if (got_sigint) handle_sigint();
}
where wait_for_next_event()
is actually pselect()
. If a SIGWINCH
or SIGINT
arrives while we are sleeping on pselect()
, the signal is delivered, the handler runs, and pselect()
returns immediately. This is how we catch those events. If the signal arrives at any other time, it cannot be delivered (as it has been blocked by sigprocmask()
): it is therefore a pending signal. On the next loop iteration, as pselect()
unblocks the signal, it gets delivered immediately, and we catch it as before. Having these signals delivered only within pselect()
makes the loop structure simpler and easier to reason about. It also prevents the race condition I mentioned before.
The data producer is unlikely to write()
to the pipe one byte at a time: it is far more likely that at least one data point will be provided each time. We then do not need to optimize for the one-byte-at-a-time scenario. We may drop the state machine and simply attempt to sscanf()
after each read()
. If sscanf()
fails, no worries: we keep in the buffer the bytes we read so far, and return to the event loop – we will sscanf()
again after our next read()
.
On quasi-pseudo-code (omitted: error checking, edge cases, variable declarations...):
static void handle_input()
{
static char buffer[1024]; // important that it's static
static int buffer_pos;
read(STDIN_FILENO, buffer + buffer_pos, room_available);
buffer_pos += bytes_read;
buffer[buffer_pos] = '\0';
for (;;) {
if (channels_to_plot == 2)
matched = sscanf(buffer, "%lf %lf[ \t\r\n]%n",
&value1, &value2, &bytes_consumed)
else
matched = sscanf(buffer, "%lf[ \t\r\n]%n",
&value1, &bytes_consumed)
if (matched < expected) break; // sscanf() failed to read the expected data
save_values_to_ring_buffers();
shift_the_input_buffer(bytes_consumed);
got_at_least_one_data_point = true;
}
if (got_at_least_one_data_point) paint_plot();
}
Note that the sscanf()
format ends with white space: this is meant to guarantee that we got the full numeric value.
The data producer is unlikely to
write()
to the pipe one byte at a time: it is far more likely that at least one data point will be provided each time.
@edgar-bonet I think that's missing the case where you're driving the input while typing into the window, e.g. running ttyplot -2
and then typing 1<space>2<space>
into the window. That will not arrive in one piece. EDIT: As a result, bytes_consumed
and %n
will be at risk to treat input "1.23" as "1.2" and "3" rather than a single double. That would be a problem.
@edgar-bonet I have pushed my current code now to allow review and testing of the third generation code in detail. The two most important things to say about it:
select
rather than pselect
for the moment, not because I want to avoid pselect
but because with a timeout both seem to not stop blocking on signal arrival but wait until timeout for me (which is unexpected) at the moment and I haven't yet figured out why. I need to look into signal masks and blocking more. The current version looks like working to the user, but the signal setup still needs improvement, the rest I am not unhappy about. If you see what I'm missing, would be great to understand, will play with it more myself also.@edgar-bonet PS:
select
bug in action I recommend increasing the timeval
timeout to more than a second. That will make the effect easy to see. Still not sure about reasons.@edgar-bonet I think I fixed it now: it got interrupted but lacked a redraw and jumped right back into the next select
. I'm happy to further improve on any issues you find :pray:
Note that the
sscanf()
format ends with white space: this is meant to guarantee that we got the full numeric value.
@edgar-bonet only saw that now. It's a nice idea but:
echo -n '1 2 3' | ./ttyplot
where there is no trailing whitespace?[..]
nonempty wrap, and bytes_consumed
always ends up 0 (see below), with or without whitespace present.$ cat main.c
#include <stdio.h>
int main(int argc, char ** argv) {
double d1 = 0.0;
double d2 = 0.0;
int bytes_consumed = 0;
int res = sscanf(argv[1], "%lf %lf[ \t\r\n]%n", &d1, &d2, &bytes_consumed);
printf("res=%d, d1=%f, d2=%f, bytes_consumed=%d, argv[1]=\"%s\"\n", res, d1, d2, bytes_consumed, argv[1]);
return 0;
}
$ gcc main.c && ./a.out '1 2'
res=2, d1=1.000000, d2=2.000000, bytes_consumed=0, argv[1]="1 2"
$ gcc main.c && ./a.out '1 2 '
res=2, d1=1.000000, d2=2.000000, bytes_consumed=0, argv[1]="1 2 "
@edgar-bonet update: I found a way to make %n
work but it requires POSIX.1-2008 according to man sscanf
:
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
double d1 = 0.0;
double d2 = 0.0;
int consumed1 = 0;
int consumed2 = 0;
char *dummy = NULL;
int res = sscanf(argv[1], "%lf %lf%n%m[ \t\r\n]%n", &d1, &d2, &consumed1,
&dummy, &consumed2); // "m" is POSIX.1-2008
printf("res=%d, d1=%f, d2=%f, consumed1=%d, consumed2=%d, argv[1]=\"%s\"\n",
res, d1, d2, consumed1, consumed2, argv[1]);
free(dummy);
return 0;
}
$ gcc main.c && ./a.out '1 2 '
res=3, d1=1.000000, d2=2.000000, consumed1=3, consumed2=5, argv[1]="1 2 "
@hartwork:
would it work for cases like
echo -n '1 2 3' | ./ttyplot
where there is no trailing whitespace?
No, it wouldn't. However, in Unix any valid text stream ends with '\n'
.
the trailing whitespace does not seem to be required in practice despite the
[..]
nonempty wrap, andbytes_consumed
always ends up 0 (see below), with or without whitespace present.
My bad. [...]
should be replaced by %*[...]
, and the test for a failed match should be bytes_consumed == 0
.
I have pushed my current code now to allow review and testing of the third generation code in detail.
Thanks. I noticed that, if the input stream is closed, this version starts consuming 100% CPU.
I'm using
select
rather thanpselect
for the moment, not because I want to avoidpselect
but because with a timeout both seem to not stop blocking on signal arrival but wait until timeout for me (which is unexpected)
Weird. The toy program I posted above does not suffer from this syndrome: I added a ten second timeout, and it still responds immediately to SIGWINCH
, SIGINT
and input events.
the signal setup still needs improvement
It suffers from the race condition I mentioned earlier. If a SIGWINCH
is delivered after the test if (sigwinch_pending)
(line 424 on commit 53fa0be87aa03c73d2d9fa6848e0893c25a1c49e) but before the call to select()
(l. 450), it will not interrupt select()
, which can sleep for 0.5 s. Then, signal_received
will be false
and we won't redraw until the next loop iteration.
Hi @edgar-bonet, thanks for your reply!
would it work for cases like
echo -n '1 2 3' | ./ttyplot
where there is no trailing whitespace?No, it wouldn't. However, in Unix any valid text stream ends with
'\n'
.
I think I would rather support even say piping in a text file that lacks a trailing new line. I would personally always fix files to end in a trailing newline — pre-commit's end-of-file-fixer
is very nice — but I'd rather be supportive and robust about it, no need to lose the last record.
the trailing whitespace does not seem to be required in practice despite the
[..]
nonempty wrap, andbytes_consumed
always ends up 0 (see below), with or without whitespace present.My bad.
[...]
should be replaced by%*[...]
, and the test for a failed match should bebytes_consumed == 0
.
Very nice, "%lf %lf%*[ \t\r\n]%n"
seems to work as expected, does not require m
, does not need to malloc. Over night I figured that with a known maximum input length of 4096 I could have avoided m
knowing that %[ \t\r\n]
will never store than 4096 bytes in the first place, but not storing at all using %*
seems best :+1:
I'll at least try to integrate that approach to see how it feels compared to status quo. I don't like the cryptic past-time feel of scanf
but to not have two parts of the code that could potentially go out of sync over time and may already be out of sync in some unknown way seems like a clear win :+1:
I have pushed my current code now to allow review and testing of the third generation code in detail.
Thanks. I noticed that, if the input stream is closed, this version starts consuming 100% CPU.
Interesting! I could use pselect
with nfds=0
and the current timeout similar to what you did at https://github.com/tenox7/ttyplot/pull/98#issuecomment-1814500197 for a portable sub-second sleep, as mention under "Emulating usleep(3)" in man select
. Without a clock display, just pause
would have done. (We'd could also send SIGALARM
us just for the clock, but the related code would be >=POSIX.1-2008 again apparently.)
It suffers from the race condition I mentioned earlier. If a
SIGWINCH
is delivered after the testif (sigwinch_pending)
(line 424 on commit 53fa0be) but before the call toselect()
(l. 450), it will not interruptselect()
, which can sleep for 0.5 s. Then,signal_received
will befalse
and we won't redraw until the next loop iteration.
I do not yet see how pselect
would perform any different from plain select
in that very scenario. Can you help me understand? I'll try add sleeping (only locally for debugging) to give the race a better chance to show and see if pselect
does any different, with some luck, but I expect the same for the scenario that you described. I have a feeling I am missing something, the whole race scenario is still a bit unclear (while race conditions in general are clear). The text in the man page seems to be written for readers that already understand the problem rather than some that don't.
I'll go work on the next iteration, happy about more insights from the land of signals from you in the meantime.
I noticed that, if the input stream is closed, this version starts consuming 100% CPU.
@edgar-bonet ^^ that's fixed now, pushed.
@edgar-bonet fifth(?) generation pushed and ready for review. It has all known issues fixed except potential use of pselect
.
Without a clock display, just
pause
would have done.
pause()
would be a viable option. However, we would then end up with two separate event loops: one pselect()
-based for when there is data, and a second one, pause()
-based, for when the stream is closed. This would add unneeded complexity to the structure of the program. In contrast, a single event loop is a standard programming idiom simple to understand: repeat (forever) { catch an event; handle the event; }.
Regarding the clock display: On master, it is only updated on data or SIGWINCH
events. Adding a periodic update is a nice extra feature, but wouldn't that be off-topic on this PR?
I do not yet see how
pselect
would perform any different from plainselect
in that very scenario.
See the toy program above: SIGWINCH
and SIGINT
are blocked during the whole execution of the program except inside the call to pselect()
. This ensures that these signals are always delivered to (and caught by) pselect()
. All events (signals and data) are then collected at a single point in the program. Plain select()
cannot do that other than with the self-pipe trick.
Without a clock display, just
pause
would have done.
pause()
would be a viable option. However, we would then end up with two separate event loops: onepselect()
-based for when there is data, and a second one,pause()
-based, for when the stream is closed. This would add unneeded complexity to the structure of the program. In contrast, a single event loop is a standard programming idiom simple to understand: repeat (forever) { catch an event; handle the event; }.
@edgar-bonet if it was something like…
if (stdin_is_open)
....... = pselect(...);
else
pause();
…then it would still feel like a single loop, it could even be extracted into a function. Just to add what seemed missing in the picture, I think a question of personal preference maybe, both should be fine.
Regarding the clock display: On master, it is only updated on data or
SIGWINCH
events. Adding a periodic update is a nice extra feature, but wouldn't that be off-topic on this PR?
I could re-add that bug to the pull request and then make a follow-up pull request only to unbreak the breakage. I understand that this pull request could be multiple pull requests now but the pieces are heavily interleaved, splitting them up will be quote some work and risks breaking things that are finally working, and our current merge frequency will also make it take way longer to get it merged with all the ping pong involved.
I do not yet see how
pselect
would perform any different from plainselect
in that very scenario.See the toy program above:
SIGWINCH
andSIGINT
are blocked during the whole execution of the program except inside the call topselect()
. This ensures that these signals are always delivered to (and caught by)pselect()
. All events (signals and data) are then collected at a single point in the program.
Understood. Using pselect
now, pushed.
Plain
select()
cannot do that other than with the self-pipe trick.
That I do not understand. If we used select
like this:
while (true) {
if (sigint_pending) {
break;
}
...
pthread_sigmask(SIG_SETMASK, &empty_mask, &origmask);
ready = select(nfds, &readfds, NULL, NULL, &timeout);
pthread_sigmask(SIG_SETMASK, &empty_mask, NULL);
if ((ready == -1)) && (errno == EINTR)) {
continue;
}
...
}
Where would the race be and what would would be the effect of that race? Thanks in advance!
@tenox7 I believe this code is ready for your review and testing and checking against your requirements for a merge now. Thanks in advance!
if it was something like…
if (stdin_is_open) ....... = pselect(...); else pause();
…then you would have a race if the signal is delivered right before pause()
.
[Regarding the clock display] the pieces are heavily interleaved
I didn't notice they were heavily interleaved. If they indeed are, then it does make sense to keep them together.
If we used select like this: [...] Where would the race be and what would be the effect of that race?
while (true) {
if (sigint_pending) {
break;
}
/* === race here === */
pthread_sigmask(SIG_SETMASK, &empty_mask, &origmask);
ready = select(nfds, &readfds, NULL, NULL, &timeout);
pthread_sigmask(SIG_SETMASK, &origmask, NULL);
if ((ready == -1)) && (errno == EINTR)) {
continue;
}
...
}
Imagine SIGINT
arrives at the place marked race here
. As it is presumably blocked, it will be a pending signal at this point. The first call to pthread_sigmask()
unblocks the signal, which is then delivered right away (before pthread_sigmask()
even returns). Then, select()
will not be interrupted by the signal (which has already been delivered), and it may sleep for a long time, limited only by the timeout. It will not return EINTR
.
while (true) { if (sigint_pending) { break; } /* === race here === */ pthread_sigmask(SIG_SETMASK, &empty_mask, &origmask); ready = select(nfds, &readfds, NULL, NULL, &timeout); pthread_sigmask(SIG_SETMASK, &origmask, NULL); if ((ready == -1)) && (errno == EINTR)) { continue; } ... }
Imagine
SIGINT
arrives at the place markedrace here
. As it is presumably blocked, it will be a pending signal at this point. The first call topthread_sigmask()
unblocks the signal, which is then delivered right away (beforepthread_sigmask()
even returns). Then,select()
will not be interrupted by the signal (which has already been delivered), and it may sleep for a long time, limited only by the timeout. It will not returnEINTR
.
@edgar-bonet so with plain select
it takes one more round of waiting before we hit if (sigint_pending) {
again whereas pselect
in contrast would return right away and let us get to the if (sigint_pending) {
again quicker. Okay, thanks! :pray:
The code looks sane to me, free of race conditions. I would just suggest a more straightforward main loop structure.
Regarding the signal handling, as I read it, the global structure of the main loop looks like this:
while (1) {
if (got_sigint) handle_sigint();
if (got_sigwinch) handle_sigwinch();
wait_for_next_event();
if (got_either_signal) continue;
if (got_input) handle_input();
}
This should work just fine. However, there is no benefit in making it more complicated than it needs to be. If you move the signal handling just below pselect()
, you end up with a more straightforward structure:
while (1) {
wait_for_next_event();
if (got_sigint) handle_sigint();
if (got_sigwinch) handle_sigwinch();
if (got_input) handle_input();
}
Regarding the data handling, the structure looks like this (signal handling omitted):
while (1) {
if (!could_parse_a_data_point) wait_for_next_event();
if (got_input) read_input();
try_to_parse_one_data_point();
}
Again, this should work fine, but I find it a bit convoluted: if a single read()
provides multiple data points, they will be parsed on multiple iterations of the main loop. Is there a reason for proceeding this way? IMO, it would be more straightforward to fully handle the input event on the current loop iteration:
while (1) {
wait_for_next_event();
if (got_input) {
read_input();
do {
try_to_parse_one_data_point();
} while (could_parse_a_data_point);
}
}
The code looks sane to me, free of race conditions.
@edgar-bonet cool, thanks for the review! :pray:
I would just suggest a more straightforward main loop structure. [..]
These are both good ideas. For the second I should note that I only did not go for it to keep the diff smaller and to not further grow the pull request.
Right now, I have no idea how @tenox7 feels about the current approach, I hope he likes it. If this however would get rejected by him for something else, my time refactoring may turn out wasted, it's never just five minutes and it needs good care to not introduce new breakage, so I'd like to hear back from him first, to not put more time on a uncertain line. I'm happy to get back to these ideas, either here or in follow-up pull requests.
I think I like it better than pthreads. It adds a lot of complexity, but perhaps is is unavoidable at this point?
@tenox7: This could eventually be refactored to get the complexity more manageable. Just as PR #106 gets 24 lines of code out of the main loop, the handing and parsing of input (a large part of this PR) could well be moved into its own function(s). Maybe even simplified a bit (not sure, didn't try). If this gets merged, I am willing to attempt such refactoring.
@edgar-bonet @hartwork first of all thanks for all your contributions!
With all these changes, I'm thinking that, perhaps instead of merging this to master one by one, lets create some experimental or 2.0 branch or something like that. Then PR everything there instead of master. Once everyone is happy and well tested, we can merge to master as a final step.
@tenox7 wrote:
[...] lets create some experimental or 2.0 branch or something like that. Then PR everything there instead of master [...]
Seems fine with me. I think this PR, as it is the largest one, should be the first one to be rebased/retargeted to that new branch. I will then rebase and retarget mine.
@edgar-bonet @hartwork first of all thanks for all your contributions!
@tenox7 you're welcome!
I think I like it better than pthreads. It adds a lot of complexity, but perhaps is is unavoidable at this point?
I see two main sources for the increase in complexity, caused by two key assumptions/decisions/requirements. The key assumptions are:
scanf
because:
read
, I believe.On a side note quoting man scanf
:
It is very difficult to use these functions correctly, and it is preferable to read entire lines with
fgets
(3) orgetline
(3) and parse them later withsscanf
(3) or more specialized functions such asstrtol
(3).
As a consequence of the former — no threads —, a single thread has to do all of these things:
As a consequence of the latter — no scanf
—, we now have sscanf
(with two "s"), non-blocking I/O with read
and pselect
, a line buffer to manage parsing input, and code to jump over input that would stuck sscanf
otherwise.
Personally, I feel like going simpler would give up something of value and that we have now arrived at something good enough to merge.
With all these changes, I'm thinking that, perhaps instead of merging this to master one by one, lets create some experimental or 2.0 branch or something like that. Then PR everything there instead of master. Once everyone is happy and well tested, we can merge to master as a final step.
I would support this idea if we had more big risky features still in the pipeline, bigger display changes, API incompatible changes — things like that. But what we have here is now complete in itself (besides potential refactoring follow-ups), may not be noticed by most users even, and is just one in a group of independent changes, rather than a big story, other than increasing robustness maybe. Personally, I like to keep master
in releasable shape and branches short-lived: merging this branch keeps master
releasable in my view. If it does feel too risky to merge still, let's maybe sleep over it and try to identify that feeling of risk and what it is about. Then we can do something about it if necessary. @tenox7 what do you think?
@tenox7: If you want to optimize the testing effort, I could build a branch that merges all the currently-pending pull requests. Would that help?
I wrote:
I could build a branch that merges all the currently-pending pull requests.
Well, I just did it: edgar-bonet:next.
$ git -P log --first-parent master..next
commit 2c69fb2f09ff836d07657a5bee5500fb524505c2 (HEAD -> next)
Merge: 333dd6a 98ae048
Author: Edgar Bonet <linux@edgar-bonet.org>
Date: Thu Nov 23 23:04:27 2023 +0100
Merge pull request #99 from edgar-bonet/multibyte
* Define _XOPEN_SOURCE_EXTENDED on macOS
* Always use ASCII characters for the arrow tips
* Rely on pkg-config to find the proper ncurses.h
* Support plotting with mutibyte characters
commit 333dd6ab507d350a984ee16fed2744d3d3142a00
Merge: c72e7d5 2176376
Author: Edgar Bonet <linux@edgar-bonet.org>
Date: Thu Nov 23 22:55:33 2023 +0100
Merge pull request #106 from edgar-bonet/rate
* Increase the time resolution in rate mode
commit c72e7d5884a2f262862563209f5884008cd8dfe6
Merge: 25c0ab0 ceb0bda
Author: Edgar Bonet <linux@edgar-bonet.org>
Date: Thu Nov 23 22:48:42 2023 +0100
Merge pull request #110 from edgar-bonet/keys
* Stop watching the terminal if it gets closed
* Toggle "rate mode" when the user types 'r'
* Quit when the user types 'q'
commit 25c0ab001f1af2afa7d7a21bd4e4e8c50ca1219b
Merge: 07ce2f9 c150198
Author: Edgar Bonet <linux@edgar-bonet.org>
Date: Thu Nov 23 22:47:03 2023 +0100
Merge pull request #98 from hartwork/signal-handling
* Get signal handling into the land of defined behavior
* Delay display of stochastics until we have actual data
Thanks for your software adjustments.
Hi!
The current signal handling code is not robust and undefined behavior (according to https://wiki.sei.cmu.edu/confluence/display/c/SIG31-C.+Do+not+access+shared+objects+in+signal+handlers).
There are many ways to address this problem, and I'd like to present one approach here for review.
I saw only after coding this that use of pthreads was said to be not welcome in #62 to support systems without pthreads. If that's still status quo then the only threadless alternative will likely be to do fully non-blocking stdin I/O and I just learned things while building for the trashcan… Status quo on
master
really needs some fix, it's not robust.These decisions went into the current implementation:
volatile sig_atomic_t
and the actually handling is done elsewhere.Let me know what you think, best, Sebastian
Depends on #94 for a fully green CI.
Fixes #62