BenLangmead / qtip

Qtip: a tandem simulation approach for accurately predicting read alignment mapping qualities
MIT License

Split qtip into subcommands #5

Open johanneskoester opened 6 years ago

johanneskoester commented 6 years ago

In a sense, qtip mimics a workflow management system. It implements all steps in one command, and as a result you have written a lot of boilerplate code to check for the existence of intermediate files and so on.

I would think that qtip could also be used like this:

bwa mem $READS | samtools view -Sb - > $BAM
qtip simulate $BAM > $TANDEM_READS
bwa mem $TANDEM_READS  | samtools view -Sb - > $TANDEM_BAM
qtip predict $BAM $TANDEM_BAM | samtools view -Sb - > $FINAL_BAM

Yes, this takes more commands, but it is also much more flexible (for example, you don't have to pass arguments through to the aligner). Moreover, when doing this via a workflow management system like Snakemake, users can run the steps as separate jobs and let the system handle temporary files and so on. Finally, the complexity of qtip is reduced to a large degree, because all the boilerplate for handling these steps in Python can be removed.

BenLangmead commented 6 years ago

This is a good suggestion. I don't agree with the suggestion of removing the "boilerplate," however. Users will want the option to "just run the whole thing."

johanneskoester commented 6 years ago

Sure, that is up to you. As long as the subcommands appear at some point, I will be happy. They would be particularly handy while debugging qtip on my current dataset, because whenever I encounter an error, I have to rerun the entire thing. If the error instead occurs only in, e.g., qtip predict, all the earlier results can stay as they are and don't have to be recomputed.
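
For illustration, here is a minimal sketch of how the proposed subcommands and an all-in-one mode could coexist behind a single argparse-based entry point. Everything in it is hypothetical: `build_parser`, `plan`, the `run` subcommand, and the step names are assumptions made up for this example and are not part of qtip's current interface.

```python
import argparse


def build_parser():
    # Hypothetical CLI layout; these subcommands are assumptions, not qtip's real interface.
    parser = argparse.ArgumentParser(prog="qtip")
    sub = parser.add_subparsers(dest="command", required=True)

    # Proposed step 1: generate tandem reads from an existing alignment.
    simulate = sub.add_parser("simulate", help="generate tandem reads from a BAM file")
    simulate.add_argument("bam")

    # Proposed step 2: predict mapping qualities from both alignments.
    predict = sub.add_parser("predict", help="predict mapping qualities")
    predict.add_argument("bam")
    predict.add_argument("tandem_bam")

    # All-in-one mode for users who want to "just run the whole thing".
    run = sub.add_parser("run", help="run every step in one command")
    run.add_argument("reads")
    return parser


def plan(args):
    """Return the ordered list of step names the parsed command would execute."""
    if args.command == "run":
        return ["align", "simulate", "align_tandem", "predict"]
    return [args.command]


# Rerunning only the failing step leaves earlier results untouched.
args = build_parser().parse_args(["predict", "in.bam", "tandem.bam"])
print(plan(args))
```

With a layout like this, a workflow manager would call `simulate` and `predict` as separate jobs, while `run` would chain the same steps internally for users who prefer a single command.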