Python buffer not flushed on SIGTERM

lukas-vlcek commented 7 years ago

If watches output is redirected to a file, for example

watches cluster_health -l -d 10 -i 1 > test.log &

and the process is then killed using kill -s TERM ${watches_pid}, the internal Python output buffer does not seem to be flushed. Any data still sitting in the buffer is lost.

For example, pbench seems to kill data collection tools this way at the moment; see the kvm-spinlock script.

Is there a reliable way to flush the buffer in this case? We can catch some signals in Python, see how-to-process-sigterm-signal-gracefully. Can we also flush the buffer from such a handler?
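
To make the question concrete, here is a minimal sketch of what such a handler could look like. It is not taken from the watches code: exit_on_signal, install_sigterm_handler and poll_forever are hypothetical names, and the loop is only a stand-in for the real polling code. The handler does not flush directly. It turns SIGTERM into SystemExit, so the loop unwinds, the finally block flushes, and the interpreter shuts down normally.

```python
import signal
import sys
import time


def exit_on_signal(signum, frame):
    """Turn the signal into SystemExit so normal interpreter shutdown runs."""
    # 128 + signal number is the conventional shell exit status for a
    # process ended by a signal.
    sys.exit(128 + signum)


def install_sigterm_handler():
    """Register exit_on_signal for SIGTERM; call this from the main thread."""
    signal.signal(signal.SIGTERM, exit_on_signal)


def poll_forever(interval=1.0, out=sys.stdout):
    """Hypothetical stand-in for the watches polling loop (an assumption)."""
    try:
        while True:
            out.write("sample\n")  # placeholder for the real output
            time.sleep(interval)
    finally:
        # Runs when SystemExit unwinds the loop: push buffered data out.
        out.flush()


# Usage at startup would be: install_sigterm_handler(), then poll_forever().
```

Raising SystemExit from the handler, rather than calling flush() inside it, avoids re-entering the buffered writer if the signal arrives in the middle of a write.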

richm commented 7 years ago

How do the other commands invoked from pbench handle this? Do they install a SIGTERM handler?

lukas-vlcek commented 7 years ago

From what I have seen, they do not care. But maybe they do not have to, because they do not keep a long-running connection to some service. Most of the datalog scripts just run a simple loop:

while true; do
  call_some_tool >> ${file}
  sleep ${interval}
done

Alternatively, we could have pbench send SIGQUIT instead of SIGTERM to watches (pbench seems to support this already). I tested it, and the data does seem to be flushed in that case. But we need to check with the pbench folks first whether it is OK to go down this path.

In practice, losing some of the data at the end of a test is probably not a big deal for the pbench folks. But I thought it would be nice to handle this case correctly if we can.

ViaQ / watches-cli

Python buffer not flushed on SIGTERM #25