Closed arvidj closed 2 years ago
Is part of your testing framework linked with the code under test? If so, you could link in a SIGTERM handler that calls dump_counters_exn
followed by reset_counters
:
https://github.com/aantron/bisect_ppx/blob/c632ed972a80a4dd32c0ae8a993303160c1c4f26/src/runtime/native/runtime.mli#L66-L70 https://github.com/aantron/bisect_ppx/blob/c632ed972a80a4dd32c0ae8a993303160c1c4f26/src/runtime/native/runtime.mli#L72-L73
AFAIK we can't make the at_exit
hook considerably more atomic in the face of signals. We could mask SIGTERM, but as in #122, I worry about that interfering with the user's program excessively.
For when the framework is not linked into the code under test, we could add a flag to the PPX that allows the generated code to install signal handlers or mask signals. For example,
(instrumentation
(backend bisect_ppx --sigterm --mask))
...might cause the runtime to install a SIGTERM handler that will write coverage data, and also mask signals during the coverage writing hook.
However, I think it would be best if the testing framework could take care of that. It seems to me that even for blackbox testing, you could link something into your binaries under test that will install the right signal handlers for your testing environment.
bisect-ppx-report
should be made more resistant to corrupted coverage files. I'll look into this shortly, and I opened #385 to track it. I think the combination of a signal handler and a robust bisect-ppx-report
should solve 2, since the signal handler will interrupt the at_exit
hook and write the coverage statistics to a different file. The file being written by the at_exit
hook will become corrupt, but an improved bisect-ppx-report
will know to ignore it.
In one of our tests touched by this issue, I've worked around the problem by handling signals explicitly in the instrumented process: https://gitlab.com/tezos/tezos/-/merge_requests/3792/
I think this is the way to go for tests that launch their own processes where the code of the child processes is "part of the test itself". For tests that launch separate binaries, I think one should:
at_exit
hook on termination I agree with you assessment that it would be messy to have bisect mix with the instrumented process's signal handling by default. Nonetheless, I'm afraid we'll have to add such a thing due to potentially large number of binaries we have in our project that do not handle signals at all (or improperly) (the second point above). We might not be able to go through and correct them on a case-by-case basis. Instead, it'd be good to have a blanket solution in the form of the --mask
and --sigterm
options you evoke above. I've thought a bit on how to implement those options in bisect here. I'd be happy for your thoughts before I attempt an implementation.
Another problem is that we have a fair amount of tests and a fair amount of different binaries. Going through them all and making sure coverage data is written for each launched process is infeasible to do by hand. I touch upon a solution in the issue I link above. What if instrumented binaries create an empty coverage file at initialization, which is filled at termination? In this way, we would have a clear marker (an empty file) indicating that something "should've been" written.
Thanks! I've read your proposal. The file lock idea seems like it would have a slightly complicated interaction with the runtime's ability to write .coverage
files on demand.
We might be able to make this simpler by just having both at_exit
and the signal handler blindly write .coverage
files. The file names are random, so there should be no collision between the two writing "threads." Before at_exit
starts writing a file, it can store its name in a ref
. The signal handler can use the ref
to delete at_exit
's file, to prevent visit double counting by the reporter.
Going through them all and making sure coverage data is written for each launched process is infeasible to do by hand. I touch upon a solution in the issue I link above. What if instrumented binaries create an empty coverage file at initialization, which is filled at termination? In this way, we would have a clear marker (an empty file) indicating that something "should've been" written.
Could having the Bisect runtime handle SIGTERM make coverage writing reliable enough to make this unnecessary?
If not, we can probably have the runtime create some empty files that can be counted, and compared with the .coverage
files — a variation of what you are proposing.
The other thing to note is that --mask
is intended to turn off signal handling completely during at_exit
, in which case it should be unnecessary to have any kind of locking between the signal handler and at_exit
.
(--mask
should probably just be part of --sigterm
, since --sigterm
would then only be safe with signals masked during at_exit
)
We might be able to make this simpler by just having both at_exit and the signal handler blindly write .coverage files. The file names are random, so there should be no collision between the two writing "threads." Before at_exit starts writing a file, it can store its name in a ref. The signal handler can use the ref to delete at_exit's file, to prevent visit double counting by the reporter. We might be able to make this simpler by just having both at_exit and the signal handler blindly write .coverage files. The file names are random, so there should be no collision between the two writing "threads." Before at_exit starts writing a file, it can store its name in a ref. The signal handler can use the ref to delete at_exit's file, to prevent visit double counting by the reporter.
I think three different scenarios are possible:
at_exit
(in this case at_exit
is never called), or vice versa. (1a and 1b below)at_exit
starts executing inside the signal handler (scenarios 2a and 2b below). at_exit
Scenario 1a:
`at_exit` : <--->
signal handler : never runs
Scenario 1b:
`at_exit` : never runs
signal handler : <--->
Scenario 2a:
`at_exit` : <------>
^- writes the ref
signal handler : <-------------->
^ removes the ref
Scenario 2b:
`at_exit` : <-------------->
^- writes the ref
signal handler : <-------------->
^- finds nothing to remove
Scenario 3a:
`at_exit` : <-------------->
^ masks signal
signal received: |
signal handler : (masked)
Scenario 3b:
`at_exit` : <-------------->
^ masks signal, writes the ref
signal received: |
signal handler : <-------->
Scenario 1 should pose no issues.
In scenario 2, at_exit
will write it's coverage to some file and
store a ref to the file. If it has finished doing before the
termination of the signal handler, then the signal handler can remove
that file (2a above). However, if at_exit
is slower than the signal
handler (2b above), then will the signal handler know that there
is a file to remove?
If the signal is recieved after the start of at_exit
, then the
signal handler should be inhibited by the masking (3a above). But, is
the following scenario is possible (3b above): 1) at_exit
starts 2)
signal received 3) at_exit
masks?
AFAIK scenarios 2a and 2b are impossible — at_exit
cannot start running during the signal handler (unless somehow called or directly triggered by the signal handler itself). The signal handler would have to finish in order for the "main process" to resume and trigger at_exit
in any other way. The signal handler "takes over" the main thread, so once it starts running, in the case of the --sigterm
handler, it is guaranteed to write the .coverage
file and exit the process without triggering any "other" at_exit
.
Scenario 3b is impossible (or at least not dangerous) if at_exit
first masks SIGTERM before doing any of its other work. Then, if SIGTERM races in before the filename is chosen or the file is created, the SIGTERM will take over the thread and the thread will never return to at_exit
anyway. If SIGTERM is too late, it is already masked, and at_exit
will finish.
When choosing a filename, it's probably important to first write it to the ref
, and then try to open/create the file. The SIGTERM should then attempt to remove the file named in the ref
, and proceed silently if the file turns out not to exist.
Actually, we probably don't even need the ref
at all, if we do
if
at_exit
first masks SIGTERM before doing any of its other work. Then, if SIGTERM races in before the filename is chosen or the file is created, the SIGTERM will take over the thread and the thread will never return toat_exit
anyway. If SIGTERM is too late, it is already masked, andat_exit
will finish.
Since SIGTERM will either come first, or will be masked during the entire process of choosing a filename and writing the file.
Yes, I agree with your assessment. I'll do some experiments with the approach you suggest.
If not, we can probably have the runtime create some empty files that can be counted, and compared with the .coverage files — a variation of what you are proposing.
I tried this, and it turns out to be complicated. As we typically fork in our instrumented binaries, there is no set "starting point" of each instrumented process. When a binary forks then we want to have two coverage files (and that is what bisect will give us unless any of the processes are terminated abruptly). However, it's hard to make both the parent and child process leave some kind of trace that allows us to verify that the correct number of coverage files were written.
Probably addressed (pending full-scale testing :P) by @arvidj in #390, so closing for now. Thanks!
In our system testing framework, we launch processes instrumented with bisect. Normally, these processes terminate normally, writing coverage data as expected. However, some tests may leave unterminated processes, at which point the framework terminates all dangling processes through
SIGTERM
. Then, one of two things may happen:exit
), the process will terminate without writing coverage dataIn the first case we're getting an incomplete coverage measure (as the unterminated process has indeen been tested even though not cleanly terminated). In second, we have the same situation but in addition,
bisect-ppx-report
throws exceptions.I don't see any simple solution to this problem (except making sure processes are terminated more gracefully in our systems testing framework), so consider this more of a discussion issue. It seems relatively simple to add a signal handler dealing with the first case, but as you note in https://github.com/aantron/bisect_ppx/issues/122:
For the second case, perhaps the
at_exit
hook can be made more atomic?