[x] If the subprocess fails, then make the entire script fail -- don't ignore this and proceed to the next step.
[x] Don't use Popen.communicate() to pipe data from minimap2 --> samtools --> samtools, since it apparently buffers the piped data in memory (which is not a good idea when dealing with huge datasets). See here for discussion; the easiest option might just be running a shell script from within Python, tbh.
[ ] To reduce complexity, maybe just don't use pipes at all -- write intermediate files instead, then delete them when they are no longer needed. The downsides are that this might be slower than piping, and the intermediate SAM files will probably be pretty large... but it'll be easier to reason about, and we risk fewer problems from subtle pipe behavior.
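
A rough sketch of what the two options above could look like, using only the standard `subprocess` module. The helper names `run_piped` and `run_with_files` are made up for illustration and are not part of the existing script. `run_piped` connects the stages with OS-level pipes (so nothing is buffered in Python) and fails if any stage exits non-zero; `run_with_files` runs the steps one at a time and cleans up intermediate files afterwards.

```python
import os
import subprocess


def run_piped(commands, output_path):
    """Chain commands with OS pipes; write the last stage's stdout to a file.

    Raises subprocess.CalledProcessError if any stage exits non-zero.
    (Hypothetical helper, for illustration only.)
    """
    procs = []
    with open(output_path, "wb") as out:
        prev_stdout = None
        for i, cmd in enumerate(commands):
            is_last = i == len(commands) - 1
            proc = subprocess.Popen(
                cmd,
                stdin=prev_stdout,
                stdout=out if is_last else subprocess.PIPE,
            )
            # Close our copy of the upstream pipe: the data flows directly
            # between the child processes, never through Python's memory.
            if prev_stdout is not None:
                prev_stdout.close()
            prev_stdout = proc.stdout
            procs.append(proc)
        returncodes = [p.wait() for p in procs]
    # Report the first failing stage instead of silently continuing.
    for cmd, rc in zip(commands, returncodes):
        if rc != 0:
            raise subprocess.CalledProcessError(rc, cmd)


def run_with_files(steps, intermediates):
    """Run each command in order, with no pipes, then delete intermediates.

    check=True makes a failing step raise CalledProcessError immediately.
    (Hypothetical helper, for illustration only.)
    """
    try:
        for cmd in steps:
            subprocess.run(cmd, check=True)
    finally:
        for path in intermediates:
            if os.path.exists(path):
                os.remove(path)
```

For the alignment step this might be called as `run_piped([["minimap2", "-a", "ref.fasta", "reads.fastq"], ["samtools", "view", "-b", "-"], ["samtools", "sort", "-"]], "aln.sorted.bam")`, where the file names are placeholders. Note that `run_with_files` only deletes intermediates at the very end; deleting each one as soon as the next step finishes would use less disk space.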