dib-lab / khmer

In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more
http://khmer.readthedocs.io/
Other
749 stars 294 forks source link

Callbacks into CPython VM during Multi-threaded Execution #28

Closed emcd closed 9 years ago

emcd commented 11 years ago

We have various places where we wish to notify the user of progress being made during processing. In the original single-threaded model, callbacks were made from the C++ code doing the heavy-lifting to Python. That worked fine because there was never any issue with the GIL being released or acquired and the Python VM could be guaranteed to be in a consistent state. For multi-threaded operation, we cannot really guarantee that we will have the GIL when we need it without a great deal of hassle and a performance hit. Bad things will happen if C++ code attempts to callback into Python when the GIL is not acquired.

Here is a (not necessarily exhaustive) list of alternative mechanisms for user notification: () Use C++ output streams. (We shouldn't have to worry about interference with Python streams since both are not in use at the same time. And, the way the reporting is structured is such that we shouldn't have to worry about multiple threads clobbering each other's output either. I'm +1 on this solution unless I think of something better.) () Redesign the code to perform a certain amount of work, return to the Python wrapper for reporting, and then be resumed. (Potentially messy and would almost certainly break some existing interfaces. I'm -1 on this.) () Setup a dedicated listener thread and send messages to it via an *intra-process communication mechanism of some sort. (Would be a fun excuse to learn about ZeroMQ's inproc stuff, but probably not worth the hassle. So, I'm -1 on this as well.) (*) Set up a synchronization barrier among the threads for getting the lock, reporting, and releasing the lock. (This will hurt performance, possibly quite badly. It will also ruin the threading model-agnostic scheme we currently have. I'm definitely -1 on this.)

Other ideas welcome.

ctb commented 11 years ago

On Thu, Feb 28, 2013 at 05:38:40PM -0800, Eric McDonald wrote:

We have various places where we wish to notify the user of progress being made during processing. In the original single-threaded model, callbacks were made from the C++ code doing the heavy-lifting to Python. That worked fine because there was never any issue with the GIL being released or acquired and the Python VM could be guaranteed to be in a consistent state. For multi-threaded operation, we cannot really guarantee that we will have the GIL when we need it without a great deal of hassle and a performance hit. Bad things will happen if C++ code attempts to callback into Python when the GIL is not acquired.

Here is a (not necessarily exhaustive) list of alternative mechanisms for user notification: () Use C++ output streams. (We shouldn't have to worry about interference with Python streams since both are not in use at the same time. And, the way the reporting is structured is such that we shouldn't have to worry about multiple threads clobbering each other's output either. I'm +1 on this solution unless I think of something better.) () Redesign the code to perform a certain amount of work, return to the Python wrapper for reporting, and then be resumed. (Potentially messy and would almost certainly break some existing interfaces. I'm -1 on this.) () Setup a dedicated listener thread and send messages to it via an *intra-process communication mechanism of some sort. (Would be a fun excuse to learn about ZeroMQ's inproc stuff, but probably not worth the hassle. So, I'm -1 on this as well.) (*) Set up a synchronization barrier among the threads for getting the lock, reporting, and releasing the lock. (This will hurt performance, possibly quite badly. It will also ruin the threading model-agnostic scheme we currently have. I'm definitely -1 on this.)

Other ideas welcome.

Yeah, all those sound grim. My goal is to allow flexibility, but not necessarily for the production scripts (where C++ output streams would be a good solution). Would something like #2 (periodic checking in) have much overhead in scripts where it was never needed? Perhaps we could introduce a general callback mechanism in C++ and enable Python to make use of it, but not have it a "standard" feature in our production scripts?

--titus

C. Titus Brown, ctb@msu.edu

emcd commented 11 years ago

Architecturally, the periodic check-in would be the most challenging design of the various options laid out. (At least to my mind's eyeball at 11 PM. And not that I'm adverse to challenging designs or anything.) In some ways, I believe it is similar to the thinking behind Twisted (Python) or Node (SS JS). If you are looking to use callbacks for something other than console output, then perhaps we would want to invest in it. Did you have something specific in mind or just flexibility in a purely abstract sense?

For the console output case, it sounds like you are OK with C++ streams, and so I'll plan on proceeding along that line. (As it turns out, the Hashbits:: consume_fasta_and_tag logic already had a C++ output immediately prior to the callback, so you were kinda already doing this. :-)

On Thu, Feb 28, 2013 at 10:22 PM, C. Titus Brown notifications@github.comwrote:

On Thu, Feb 28, 2013 at 05:38:40PM -0800, Eric McDonald wrote:

We have various places where we wish to notify the user of progress being made during processing. In the original single-threaded model, callbacks were made from the C++ code doing the heavy-lifting to Python. That worked fine because there was never any issue with the GIL being released or acquired and the Python VM could be guaranteed to be in a consistent state. For multi-threaded operation, we cannot really guarantee that we will have the GIL when we need it without a great deal of hassle and a performance hit. Bad things will happen if C++ code attempts to callback into Python when the GIL is not acquired.

Here is a (not necessarily exhaustive) list of alternative mechanisms for user notification: () Use C++ output streams. (We shouldn't have to worry about interference with Python streams since both are not in use at the same time. And, the way the reporting is structured is such that we shouldn't have to worry about multiple threads clobbering each other's output either. I'm +1 on this solution unless I think of something better.) () Redesign the code to perform a certain amount of work, return to the Python wrapper for reporting, and then be resumed. (Potentially messy and would almost certainly break some existing interfaces. I'm -1 on this.) () Setup a dedicated listener thread and send messages to it via an *intra-process communication mechanism of some sort. (Would be a fun excuse to learn about ZeroMQ's inproc stuff, but probably not worth the hassle. So, I'm -1 on this as well.) (*) Set up a synchronization barrier among the threads for getting the lock, reporting, and releasing the lock. (This will hurt performance, possibly quite badly. It will also ruin the threading model-agnostic scheme we currently have. I'm definitely -1 on this.)

Other ideas welcome.

Yeah, all those sound grim. My goal is to allow flexibility, but not necessarily for the production scripts (where C++ output streams would be a good solution). Would something like #2 (periodic checking in) have much overhead in scripts where it was never needed? Perhaps we could introduce a general callback mechanism in C++ and enable Python to make use of it, but not have it a "standard" feature in our production scripts?

--titus

C. Titus Brown, ctb@msu.edu

— Reply to this email directly or view it on GitHubhttps://github.com/ged-lab/khmer/issues/28#issuecomment-14271445 .

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

emcd commented 11 years ago

One more little note on the C++ output streams though... It is generally considered a bad thing for a library to write anything to console. But, since you've pretty told me that you're only interested in wrapping it with Python and not other languages and since it is tied so closely to a specific set of use cases, I'm willing to violate that particular library-writing taboo in this case. (Still want to be on public record that I am aware of the taboo though. :-)

On Thu, Feb 28, 2013 at 11:06 PM, Eric McDonald emcd.msu@gmail.com wrote:

Architecturally, the periodic check-in would be the most challenging design of the various options laid out. (At least to my mind's eyeball at 11 PM. And not that I'm adverse to challenging designs or anything.) In some ways, I believe it is similar to the thinking behind Twisted (Python) or Node (SS JS). If you are looking to use callbacks for something other than console output, then perhaps we would want to invest in it. Did you have something specific in mind or just flexibility in a purely abstract sense?

For the console output case, it sounds like you are OK with C++ streams, and so I'll plan on proceeding along that line. (As it turns out, the Hashbits:: consume_fasta_and_tag logic already had a C++ output immediately prior to the callback, so you were kinda already doing this. :-)

On Thu, Feb 28, 2013 at 10:22 PM, C. Titus Brown <notifications@github.com

wrote:

On Thu, Feb 28, 2013 at 05:38:40PM -0800, Eric McDonald wrote:

We have various places where we wish to notify the user of progress being made during processing. In the original single-threaded model, callbacks were made from the C++ code doing the heavy-lifting to Python. That worked fine because there was never any issue with the GIL being released or acquired and the Python VM could be guaranteed to be in a consistent state. For multi-threaded operation, we cannot really guarantee that we will have the GIL when we need it without a great deal of hassle and a performance hit. Bad things will happen if C++ code attempts to callback into Python when the GIL is not acquired.

Here is a (not necessarily exhaustive) list of alternative mechanisms for user notification: () Use C++ output streams. (We shouldn't have to worry about interference with Python streams since both are not in use at the same time. And, the way the reporting is structured is such that we shouldn't have to worry about multiple threads clobbering each other's output either. I'm +1 on this solution unless I think of something better.) () Redesign the code to perform a certain amount of work, return to the Python wrapper for reporting, and then be resumed. (Potentially messy and would almost certainly break some existing interfaces. I'm -1 on this.) () Setup a dedicated listener thread and send messages to it via an *intra-process communication mechanism of some sort. (Would be a fun excuse to learn about ZeroMQ's inproc stuff, but probably not worth the hassle. So, I'm -1 on this as well.) (*) Set up a synchronization barrier among the threads for getting the lock, reporting, and releasing the lock. (This will hurt performance, possibly quite badly. It will also ruin the threading model-agnostic scheme we currently have. I'm definitely -1 on this.)

Other ideas welcome.

Yeah, all those sound grim. My goal is to allow flexibility, but not necessarily for the production scripts (where C++ output streams would be a good solution). Would something like #2 (periodic checking in) have much overhead in scripts where it was never needed? Perhaps we could introduce a general callback mechanism in C++ and enable Python to make use of it, but not have it a "standard" feature in our production scripts?

--titus

C. Titus Brown, ctb@msu.edu

— Reply to this email directly or view it on GitHubhttps://github.com/ged-lab/khmer/issues/28#issuecomment-14271445 .

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

ctb commented 11 years ago

On Thu, Feb 28, 2013 at 08:17:41PM -0800, Eric McDonald wrote:

One more little note on the C++ output streams though... It is generally considered a bad thing for a library to write anything to console. But, since you've pretty told me that you're only interested in wrapping it with Python and not other languages and since it is tied so closely to a specific set of use cases, I'm willing to violate that particular library-writing taboo in this case. (Still want to be on public record that I am aware of the taboo though. :-)

More generally, I would like to enable callbacks into Python that let Python decide whether or not to quit. How much does this screw things up?

--titus

emcd commented 11 years ago

I really need to have a good think about this. But, one thing that jumps out at me immediately is the potential for memory leaks. Depending on which part of the logic is deciding to not resume processing, it may be rather difficult to properly invoke a "cleanup handler" for a cancelled operation.

I think you pose an interesting question, but I think there are other architectural refinements I would prefer to perform first before attempting to tackle it..

On Mon, Mar 4, 2013 at 12:19 AM, C. Titus Brown notifications@github.comwrote:

More generally, I would like to enable callbacks into Python that let Python decide whether or not to quit. How much does this screw things up?

--titus

— Reply to this email directly or view it on GitHubhttps://github.com/ged-lab/khmer/issues/28#issuecomment-14364405 .

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

emcd commented 11 years ago

Unless, of course, you mean to exit the process entirely, in which case we don't care about memory leaks but still care about completing output in such a manner that the resulting files are in a non-corrupted and consistent state.

On Tue, Mar 5, 2013 at 7:38 PM, Eric McDonald emcd.msu@gmail.com wrote:

I really need to have a good think about this. But, one thing that jumps out at me immediately is the potential for memory leaks. Depending on which part of the logic is deciding to not resume processing, it may be rather difficult to properly invoke a "cleanup handler" for a cancelled operation.

I think you pose an interesting question, but I think there are other architectural refinements I would prefer to perform first before attempting to tackle it..

On Mon, Mar 4, 2013 at 12:19 AM, C. Titus Brown notifications@github.comwrote:

More generally, I would like to enable callbacks into Python that let Python decide whether or not to quit. How much does this screw things up?

--titus

— Reply to this email directly or view it on GitHubhttps://github.com/ged-lab/khmer/issues/28#issuecomment-14364405 .

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

ctb commented 11 years ago

I was assuming that the call into Python would return a value that would then determine if the C++ code quit. Good clarification :)

On Tue, Mar 05, 2013 at 04:41:52PM -0800, Eric McDonald wrote:

Unless, of course, you mean to exit the process entirely, in which case we don't care about memory leaks but still care about completing output in such a manner that the resulting files are in a non-corrupted and consistent state.

On Tue, Mar 5, 2013 at 7:38 PM, Eric McDonald emcd.msu@gmail.com wrote:

I really need to have a good think about this. But, one thing that jumps out at me immediately is the potential for memory leaks. Depending on which part of the logic is deciding to not resume processing, it may be rather difficult to properly invoke a "cleanup handler" for a cancelled operation.

I think you pose an interesting question, but I think there are other architectural refinements I would prefer to perform first before attempting to tackle it..

On Mon, Mar 4, 2013 at 12:19 AM, C. Titus Brown notifications@github.comwrote:

More generally, I would like to enable callbacks into Python that let Python decide whether or not to quit. How much does this screw things up?

--titus

??? Reply to this email directly or view it on GitHubhttps://github.com/ged-lab/khmer/issues/28#issuecomment-14364405 .

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733


Reply to this email directly or view it on GitHub: https://github.com/ged-lab/khmer/issues/28#issuecomment-14475479

C. Titus Brown, ctb@msu.edu

emcd commented 11 years ago

Ah, okay. Guess I should've seen that thrust of the argument. In that case, I think it wouldn't be too bad, because we will have probably already broken interfaces and have performed some refactoring just to get to the point of having the ability to take breathers from the C++ side. Passing an extra boolean flag shouldn't be a big deal at that point.

On Tue, Mar 5, 2013 at 8:49 PM, C. Titus Brown notifications@github.comwrote:

I was assuming that the call into Python would return a value that would then determine if the C++ code quit. Good clarification :)

On Tue, Mar 05, 2013 at 04:41:52PM -0800, Eric McDonald wrote:

Unless, of course, you mean to exit the process entirely, in which case we don't care about memory leaks but still care about completing output in such a manner that the resulting files are in a non-corrupted and consistent state.

On Tue, Mar 5, 2013 at 7:38 PM, Eric McDonald emcd.msu@gmail.com wrote:

I really need to have a good think about this. But, one thing that jumps out at me immediately is the potential for memory leaks. Depending on which part of the logic is deciding to not resume processing, it may be rather difficult to properly invoke a "cleanup handler" for a cancelled operation.

I think you pose an interesting question, but I think there are other architectural refinements I would prefer to perform first before attempting to tackle it..

On Mon, Mar 4, 2013 at 12:19 AM, C. Titus Brown < notifications@github.com>wrote:

More generally, I would like to enable callbacks into Python that let Python decide whether or not to quit. How much does this screw things up?

--titus

??? Reply to this email directly or view it on GitHub< https://github.com/ged-lab/khmer/issues/28#issuecomment-14364405>

.

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733


Reply to this email directly or view it on GitHub: https://github.com/ged-lab/khmer/issues/28#issuecomment-14475479

C. Titus Brown, ctb@msu.edu

Reply to this email directly or view it on GitHubhttps://github.com/ged-lab/khmer/issues/28#issuecomment-14477643 .

Eric McDonald HPC/Cloud Software Engineer for the Institute for Cyber-Enabled Research (iCER) and the Laboratory for Genomics, Evolution, and Development (GED) Michigan State University P: 517-355-8733

mr-c commented 11 years ago

I've removed this as a requirement for the 1.0 release.

mr-c commented 9 years ago

This is no longer on our roadmap. Closing.