Closed nirs closed 10 years ago
The signal mask needs to be set up as soon as possible, as the thread can otherwise interfere with signal handling. The most common and annoying problem I had was with signalfd, where signals weren't delivered to the fd (because they were delivered to the active thread instead).
For example, if I were to use signalfd in the child, this change would create the aforementioned problem. I think users can just disable patch_fork if they care so much about a clean signal mask.
The new thread will not receive any signal from another process - Python guarantees that by deferring signal handling to the main thread.
Can you describe the flow where setting up signalfd requires the mask to be in place before the reinstall wait?
Ok, so signalfd is a Linux-specific feature: you get an fd on which you receive the signals you block with sigprocmask.
Python does defer the signals to the main thread in a very weird way: whatever thread gets the signal sets a global flag which changes the behaviour of the Python runtime. The runtime then makes all the threads yield the GIL until the main thread gets it, in order to process that "deferred" signal. There's nothing at the OS level to make this work. [1]
Now, because you want to receive said signals via the fd, Python's "main thread fixup" signal handling doesn't help at all - you still receive the signal in the other thread if you didn't block it in that thread.
[1] David Beazley GIL presentation, slide 23: http://www.dabeaz.com/python/GIL.pdf
I meant to say: you receive the signal in the other thread instead of receiving it on the fd. Python will handle it in the main thread, of course.
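A tiny sketch of that "deferred to the main thread" behaviour (illustrative code, not manhole's): the OS may deliver the signal to any thread, but CPython always runs the Python-level handler in the main thread.

```python
import os
import signal
import threading

handled_in = []

def handler(signum, frame):
    # Record which thread actually ran the Python-level handler.
    handled_in.append(threading.current_thread().name)

signal.signal(signal.SIGUSR1, handler)

def sender():
    # Raise the signal from a worker thread; CPython still runs the
    # Python handler in the main thread.
    os.kill(os.getpid(), signal.SIGUSR1)

t = threading.Thread(target=sender)
t.start()
t.join()

# Give the main thread a chance to execute bytecode and run the handler.
while not handled_in:
    pass

print(handled_in)  # ['MainThread']
```

This is exactly why signalfd needs the OS-level mask: the Python-level fixup happens after the OS has already picked a delivery thread.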
signalfd is a very nice feature - it makes signal handling very pleasant, since you can easily handle all signals in your socket event loop. See this miniature job queue for example: https://github.com/ionelmc/python-stampede
Compare this https://github.com/ionelmc/python-stampede/blob/master/src/stampede.py#L208-L215 with:
What I was asking is: what is the flow where you are using signalfd, and where not blocking signals in the manhole thread of the child process causes this thread to receive signals that should be received in the child?
An application using signalfd will call pthread_sigmask (or sigprocmask) in the main thread, and then call signalfd to open the file descriptor for waiting for signals.
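That flow can be sketched in Python via ctypes, since the stdlib has no signalfd wrapper. This is a sketch assuming Linux with glibc (the 128-byte sigset_t buffer and the signalfd_siginfo layout are Linux/glibc specifics):

```python
import ctypes
import os
import signal
import struct

libc = ctypes.CDLL("libc.so.6", use_errno=True)

# 1. Block the signal in the main thread, before any other threads
#    exist, so every later thread inherits the mask.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})

# 2. Build a sigset_t containing SIGTERM and open the signalfd.
sigset = ctypes.create_string_buffer(128)  # sigset_t is 128 bytes on Linux
libc.sigemptyset(sigset)
libc.sigaddset(sigset, signal.SIGTERM)
fd = libc.signalfd(-1, sigset, 0)
assert fd >= 0, os.strerror(ctypes.get_errno())

# 3. Send ourselves SIGTERM; because it is blocked it becomes readable
#    on the fd instead of terminating the process.
os.kill(os.getpid(), signal.SIGTERM)

# 4. Read one struct signalfd_siginfo (128 bytes); ssi_signo is the
#    first uint32 field.
data = os.read(fd, 128)
signo = struct.unpack_from("I", data)[0]
print(signo == signal.SIGTERM)  # True
os.close(fd)
```

Any thread whose mask leaves SIGTERM unblocked becomes a possible delivery target, and the signal would then bypass the fd - which is the problem under discussion.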
The pthread_sigmask manual says:
The pthread_sigmask() function is just like sigprocmask(2), with the difference
that its use in multithreaded programs is explicitly specified by POSIX.1-2001.
The sigprocmask manual says:
A child created via fork(2) inherits a copy of its parent's signal mask; the
signal mask is preserved across execve(2).
After calling this in the main thread, you will not receive any of the blocked signals in any thread or child process, regardless of Python magic.
So I think the code changing the signal mask in the child process is unneeded.
The mask for the thread is not inherited, as threads are not inherited on fork. Every thread needs to mask the signals. Makes sense now?
Thanks, -- Ionel M.
I used sigprocmask and pthread_sigmask on various versions of Linux, and I'm sure that threads and child processes inherit the signal mask.
Maybe you are trying to support an application which does not block signals properly in the main thread, while trying to block signals in the manhole thread using the signalfd module?
This is racy - if you don't block signals in the main thread (outside of manhole), you cannot handle signals reliably. Setting the signal mask when starting the thread leaves a window where you could lose a signal.
I think the user of manhole is responsible for setting the signal mask, before installing the manhole.
Ok, I think I didn't explain this properly.
To give more detailed examples, consider scenario A, without anything Python-specific - imagine it's written in C:
result: signals aren't delivered reliably to the signal fd as the second thread can receive them
Scenario B:
result: signals aren't delivered reliably to the signal fd as the second thread can receive them
The point is that with the proposed change there is a 0.5 second window (or whatever you have as the wait time) where signal delivery will be unreliable.
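The window can be sketched like this (illustrative, not manhole's actual code; pthread_sigmask with an empty SIG_BLOCK set is used only to read back a thread's current mask):

```python
import signal
import threading
import time

observed = {}

def worker():
    time.sleep(0.2)  # stands in for the reinstall delay window
    # Query this thread's mask; SIG_BLOCK with an empty set is a no-op
    # that returns the previous mask.
    observed["mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, set())

# The thread is started while the main thread's mask is still open
# (e.g. manhole installed at import time).
t = threading.Thread(target=worker)
t.start()

# The application now blocks SIGUSR1 in the main thread, intending to
# read it from a signalfd...
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

t.join()
# ...but the worker's mask stayed open for the whole delay: the kernel
# could have picked it as the delivery target for SIGUSR1.
print(signal.SIGUSR1 in observed["mask"])  # False
```

The worker inherited the mask it had at creation time; the main thread's later change does not propagate to it, which is exactly the unreliable-delivery window.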
Yes, these are the flows that I described, and the last time I wrote code doing this, the signal mask was inherited by new threads and child processes. I'll post a small test proving this :-)
Ok, check this example code: https://gist.github.com/nirs/e4be478da3f5f55116c1
When you run it, you see that the signal mask is inherited from the main thread by other threads, and from parent to child, as the manuals tell us.
$ python -u test.py
parent starting pid: 20615
- pid: 20615 thread: <_MainThread(MainThread, started 140436746962752)> mask: [1L, 2L, 15L]
starting a thread in parent...
- pid: 20615 thread: <Thread(Thread-1, started 140436626593536)> mask: [1L, 2L, 15L]
child starting pid: 20617...
- pid: 20617 thread: <_MainThread(MainThread, started 140436746962752)> mask: [1L, 2L, 15L]
starting a thread in child...
- pid: 20617 thread: <Thread(Thread-2, started 140436626593536)> mask: [1L, 2L, 15L]
subchild starting pid: 20619...
- pid: 20619 thread: <_MainThread(MainThread, started 140436746962752)> mask: [1L, 2L, 15L]
starting a thread in sub child...
- pid: 20619 thread: <Thread(Thread-3, started 140436626593536)> mask: [1L, 2L, 15L]
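A condensed sketch of the same test using the stdlib signal.pthread_sigmask (Python 3.3+), showing both inheritance directions:

```python
import os
import signal
import threading

# Block a few signals in the main thread before creating anything.
signal.pthread_sigmask(
    signal.SIG_BLOCK, {signal.SIGHUP, signal.SIGINT, signal.SIGTERM}
)

def current_mask():
    # SIG_BLOCK with an empty set just returns the current mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())

# A thread created afterwards inherits its creator's mask...
result = {}
t = threading.Thread(target=lambda: result.update(mask=current_mask()))
t.start()
t.join()
print(sorted(int(s) for s in result["mask"]))

# ...and so does a forked child.
r, w = os.pipe()
pid = os.fork()
if pid == 0:
    text = ",".join(str(int(s)) for s in sorted(current_mask()))
    os.write(w, text.encode())
    os._exit(0)
os.close(w)
child_mask = os.read(r, 1024).decode()
os.waitpid(pid, 0)
print(child_mask)
```

Both printouts include 1, 2 and 15 (SIGHUP, SIGINT, SIGTERM), matching the gist's output above.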
Yes, indeed. But the mask is set after the thread is started. I usually install manhole early, at import time. Do you agree now that this change can create more problems than it could solve? Setting a mask on a thread that's going to die anyway is harmless.
I agree that this can change the behavior in an unwanted way if you are handling signals incorrectly.
The feature of setting the signal mask in the manhole thread is unneeded, as the signal mask is inherited from the parent thread or process.
But since users are expecting this behavior, it would be better to keep it.
Would you like to move the wait after setting the signal mask?
Well yes, but it's already like that, no?
Ah yes, it is. Do you still want to have the reinstall_delay name around? I had bind_delay to not create wrong expectations from the option - it only delays the bind; the install is half-done at that point.
Currently the sleep is after setting the thread name, and I would like to move it before that - again, to avoid any call which is not necessary in the fork+exec flow.
About the name, I think it is too specific and will make it harder to change the implementation. I think that promising that there will be a delay before the manhole will be reinstalled after a fork is the right level of detail.
I intentionally left the name setting before the sleep so as to leave little confusion about why there is this thread that quickly disappears. The goal was not to confuse people about that ephemeral thread - or at least to give them the right direction where to look (it's a manhole thread).
It should be harmless, unless I'm missing something.
The thread name is there to be informative for the user.
Why do we need to set the thread name? This thread already has a name, as it extends threading.Thread.
Also, even if we did not set the thread name (e.g. created an unnamed thread), the thread name is "missing" only during the bind_delay - how can this affect anyone debugging the application?
Can you describe the flow where not setting the thread name affects debugging of the application?
The thread name will show in the process list (e.g. when you use top or htop). The thread name is like the process name - normally it's the command line, but you can change it (as setproctitle does).
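For context: a threading.Thread name lives only at the Python level; making a name visible to top/htop needs prctl(PR_SET_NAME). A Linux-specific sketch via ctypes (manhole's actual mechanism may differ):

```python
import ctypes
import threading

PR_SET_NAME = 15  # from <linux/prctl.h>
libc = ctypes.CDLL("libc.so.6", use_errno=True)

def set_os_thread_name(name):
    # Sets the calling thread's name; the kernel truncates to 15 bytes.
    libc.prctl(PR_SET_NAME, name.encode()[:15], 0, 0, 0)

names = {}

def worker():
    set_os_thread_name("Manhole")
    # Read the name back from /proc - this is what top/htop displays.
    tid = threading.get_native_id()  # Python 3.8+
    with open("/proc/self/task/%d/comm" % tid) as f:
        names["comm"] = f.read().strip()

t = threading.Thread(target=worker)
t.start()
t.join()
print(names["comm"])  # Manhole
```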
Having the thread name in top/htop is nice.
However, the chance that you see the unnamed thread (sleeping before the name was set) is very small, and even if you do see it, the thread will become named in the next sample (a few seconds later), so the value of setting the name before sleeping is very small. Since we have different views on this, let's keep this as is. This also means that the setup_delay change does make sense.
Do you agree with changing reinstall_bind_delay to reinstall_delay? The main advantage is being able to change the underlying implementation (how reinstall is done) without changing the user visible interface.
Oooops, I missed your reply. Yes, just renaming to reinstall_delay is fine.
I will send a new smaller patch with the relevant changes.
Recently a reinstall_bind_delay was added to avoid binding the unix socket during the fork+exec flow. However, the socket is not the only issue in this flow. Changing the process signal mask should not be done either, as such changes may be inherited by the new process. Generally, since we don't know yet whether the manhole is needed, we should do nothing.
This patch changes the semantics of reinstall_bind_delay so that we do not make any modification to the child process (except starting a thread) until the delay has finished.
The user-visible parameter was shortened to "reinstall_delay", as the user should not be concerned with the internals of the manhole. Internally, the Manhole class calls it "setup_delay", as this is the delay before setting up the manhole.
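The resulting flow might look roughly like this (the ReinstallThread class and bind callback are hypothetical names for illustration, not manhole's real API):

```python
import signal
import threading
import time

class ReinstallThread(threading.Thread):
    """Started in the child after fork; touches nothing until the delay ends."""

    def __init__(self, setup_delay, bind):
        super().__init__(name="Manhole")
        self.setup_delay = setup_delay  # user-visible: reinstall_delay
        self.bind = bind                # e.g. binds the unix socket
        self.daemon = True

    def run(self):
        # If the child calls exec() within the delay, this thread vanishes
        # without having modified the signal mask or bound a socket.
        time.sleep(self.setup_delay)
        # Only now block everything in this thread and set up the manhole.
        signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
        self.bind()
```

The key property is ordering: the sleep happens before any process-visible change, so a fork+exec child pays only the cost of one short-lived thread.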