statgen / bamUtil

http://genome.sph.umich.edu/wiki/BamUtil

new mapqual diff option #9

Open avilella opened 11 years ago

avilella commented 11 years ago

Hi,

I would like to introduce a new mapQualDiff option that behaves as follows:

~/src/bamUtil/bin/bam diff --mapQualDiff 5 --posDiff 0 --in1 $s1 --in2 $s2 --out out.sam

This would generate only1 and only2 output files containing the reads that are aligned differently, including reads that are aligned identically but whose mapping quality scores differ by 5 or more points. If the difference is less than 5 points, the reads would stay in the common part and not in the diffs.
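To make the intended threshold rule concrete, here is a minimal sketch of the comparison. The function name `mapQualDiffers` and its parameters are hypothetical and are not existing bamUtil code. It assumes the threshold is at least 1 and that mapping qualities are the 0-255 values stored in SAM/BAM records.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not part of bamUtil): returns true when two mapping
// qualities differ by at least `threshold` points, i.e. when the pair of
// reads should be reported as a diff rather than kept in the common part.
// Assumes threshold >= 1.
bool mapQualDiffers(uint8_t mapQual1, uint8_t mapQual2, int threshold)
{
    // Widen to int before subtracting so the difference cannot wrap around.
    int delta = std::abs(static_cast<int>(mapQual1) - static_cast<int>(mapQual2));
    return delta >= threshold;
}
```

With `--mapQualDiff 5`, a pair with mapping qualities 60 and 55 would count as a diff, while 60 and 56 would stay in the common part.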

I think everything is contained in Diff.cpp, right?

mktrost commented 11 years ago

Sorry for the delay, I was out of the country on vacation. Yes, the Diff logic is all contained in Diff.cpp, in Diff::getDiffs. That method determines whether two records differ and, if so, what is different.

You may also want to look at Diff::writeDiffDiffs and Diff::writeBamDiffs. Depending on your changes to getDiffs, you may or may not need to change those methods too. They handle writing the diffs in the two different output formats.

Thank you for using bamUtil and let me know if there is anything more I can help with.