sagemath / cysignals

cysignals: interrupt and signal handling for Cython. Source repository for https://pypi.org/project/cysignals/
GNU Lesser General Public License v3.0
43 stars 23 forks source link

Fix a doctest failure which triggers in i686. #162

Closed tornaria closed 2 years ago

tornaria commented 2 years ago

This failure is 100% reproducible on void linux i686 (glibc 2.32).

The example is in the function test_bad_str() in the file tests.pyx. The test pases a bad string to sig_str() and then raises SIGILL. The signal handler eventually raises a Python exception which in turn raises a SIGSEGV when accessing the bad string. An error message is expected, but that doesn't happen.

Presumably the segfault happens inside some stdio function which leaves stdio buffers in an inconsistent state so the latter fprintf doesn't work properly. From signal-safety(7):

Suppose that the main program is in the middle of a call to a
stdio function such as printf(3) where the buffer and associated
variables have been partially updated.  If, at that moment, the
program is interrupted by a signal handler that also calls
printf(3), then the second call to printf(3) will operate on
inconsistent data, with unpredictable results.

We fix this by replacing the fprintf by calls to write, which is async-signal-safe according to POSIX.

tornaria commented 2 years ago

It might be convenient to change all uses of fprintf() inside the signal handler by write(); this is the minimal change required to pass all current doctests.

dimpase commented 2 years ago

There are more fprintfs in this file, did you check that the rest are harmless?

tornaria commented 2 years ago

There are more fprintfs in this file, did you check that the rest are harmless?

No, I just patched out the one that was causing the bad doctest for me.

If you are ok with this change, I can go and look for other instances inside the signal handlers. At least the ones that are compiled when debug is disabled (the ones for debug tend to have more parameters so they are more work to translate, although it can also be done).

I don't know how to actually force the failure; it just happens with my particular version of python + glibc on 32 bit, and I wouldn't know how to make other instances fail. Also, it's kind of weird that other fprintf calls inserted here and there might randomly fix the issue as internal stdio state gets somehow cleared (moreover, gcc changes fprintf without optional arguments to fwrite which doesn't seem to have the problem, although the standard still forbids it).

Maybe it's possible to replace fprintf by a stub that checks and warns on reentrancy.

kliem commented 2 years ago

Thanks for the changes, I bumped the version number and added a note in the README.