apfejes / epigenetics-software

This repository contains code for epigenetic analysis, including a chip-seq/chip-chip tool, a database and a web server.
7 stars 1 forks source link

Implemented 'poison pill' method for interrupting multiprocess queue #6

Closed dfornika closed 11 years ago

dfornika commented 11 years ago

I'll caution you to test this one before merging it, or even just take a look at it and do your own thing.

This fix is based on this stackoverflow answer.

http://stackoverflow.com/a/7829756

Essentially, when ctrl-c is pressed, the child processes catches and ignores the KeyboardInterrupt. The main process now catches the KeyboardInterrupt and adds a "poison pill" in the form of the string "exit" to the map_queue. The sub-process checks for the "poison pill" and exits when it is detected.

Note that on line 247 of MapDecomposingThread.py there is space for further clean-up code, that I've likely overlooked.

I added a little extra logging output so you can see the map_queue emptying when the program is interrupted.

The "poison pill" method is also referenced here:

http://pymotw.com/2/multiprocessing/basics.html#terminating-processes

apfejes commented 11 years ago

Hi Dan,

Thanks for the code. I've read the pages on poisoning your queue to stop threads, but I'm not a big fan of it. If you can catch the interrupt and handle it, you should be able to do so and not poison the queue. Would you be willing to handle the interrupt so that main process simply stops adding information to the queue, and the worker processes catch their own interrupt signal and terminate?

I think the end result is cleaner (it simply terminates the main loop in the main program and falls into the finally:), which doesn't require that coders are aware of what's in the queue at run time - my main objection to the poison pill solution.

dfornika commented 11 years ago

Ok, I'm not sure that this is exactly what you had in mind, but it seems to accomplish the same thing without using the queue poisoning strategy.

dfornika commented 11 years ago

A complete run takes a looooooooong time on my machine. I haven't tested a complete run. If you're not in a hurry, I can start one now....

apfejes commented 11 years ago

Gotcha - It takes about 10 minutes to process the chr18/19 on my machine, but that's using 8 threads. Hopefully I'll be able to get that down, once profiling works. So, yeah, fair enough - I'll merge and do a test run.

dfornika commented 11 years ago

For the record, here is the output of my test run (ran with the time command to check running time):

WaveGenerator/WaveGenerator.py
testdata/tmp/sample_input_chipseq.input
testdata/ChipSeqGSE31221/GSE31221_RAW/18_19_GSM773994_TCF7_ChIPSeq.bam
All Processor threads started successfully.
Chromosome processing has started, along with threads to handle output.  Please wait until all threads have completed. Note, threads for processing may continue after the final chromosome input has been read and counted.
chromosome chr18 had 559610 reads
chromosome chr19 had 436484 reads
Completed all Processing - Shutting down.
closing process started...
waiting on map_queue to empty: 162
waiting on map_queue to empty: 162
waiting on map_queue to empty: 46
Processor threads terminated.
 Wigwriter closed.
wave_queue closed
print_queue closed
all closed
Exception in thread Thread-6 (most likely raised during interpreter shutdown):
real    87m1.368s
user    141m48.816s
sys 1m25.113s

...and here is the tail of the output data1.waves file:

chr19   60966190    71  11.95   5
chr19   60966456    59  10.62   6
chr19   61217034    49  13.51   1
chr19   61216712    67  11.37   2
chr19   61216898    39  10.1    3
chr19   61275503    72  214.61  1
chr19   61275598    23  135.52  2
chr19   61275790    65  122.32  3
chr19   61275715    18  58.37   4
chr19   61275880    37  43.92   5
apfejes commented 11 years ago

That's a rather long time - and somewhat disappointing. But, I see several points - one is that you have too many processes for a dual core processor. (Looks like at least 7, to have thread 6 throw an error.) You can, for now, change that in the input file, but maybe we should consider capping it at the number of processors available.

dfornika commented 11 years ago

I was surprised to see that. I had set the processor_threads parameter to 2 on my config file.

apfejes commented 11 years ago

Ok, that's odd. The only place the parameter is set is from the file.

apfejes commented 11 years ago

Ah,just figured it out - we're confusing processes and threads. There are several threads going on, some of which we've created, including the wig writer, the output writer and wave writer. The code converting the maps to waves is actually composed of processes (not threads), so you probably have 3 processes (2 workers + 1 main) and several threads. I don't know what thread 6 was in this instance tho.

apfejes commented 11 years ago

Ok, one more comment - after working on cleaning up the shutdown to remove the terminate() command, I've come to the conclusion that the poison pill is pretty much the only way to make this work.

Dan, I humbly apologize, and will be committing the code tomorrow.

dfornika commented 11 years ago

Haha... no problem.