statgen / bamUtil

http://genome.sph.umich.edu/wiki/BamUtil

bamutil diff speed ups? #7

Open avilella opened 11 years ago

avilella commented 11 years ago

I am trying to generate two BAM files containing the differences between two BAMs using bamUtil diff: http://genome.sph.umich.edu/wiki/BamUtil:_diff#Usage

After trying it for a while, I find it's exactly what I need, but it has taken a long time (more than 10 hours and still running) to compare two human 30x builds:

~/src/bamUtil/bin/bam diff --mapQual --onlyDiffs --recPoolSize 100000 --posDiff 100000 --in1 $bam1 --in2 $bam2 --out onlyDiffs.bam

Any ideas what I can do to speed it up?

mktrost commented 11 years ago

You could try reducing --posDiff to a lower number. The trade-off is that diff becomes less likely to match records whose alignments in the two files lie that far apart.
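To make that trade-off concrete, here is a toy Python model of position-windowed matching. It is only a sketch and not bamUtil's actual implementation: the function `windowed_diff`, the `(read name, position)` record shape, and the example values are all hypothetical. It pairs reads by name while streaming two position-sorted inputs, and gives up on any read whose partner has not appeared within `pos_diff` bases.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a BAM record: (read name, alignment position).
Record = Tuple[str, int]


def windowed_diff(recs1: List[Record], recs2: List[Record], pos_diff: int) -> List[str]:
    """Return the names of reads that could not be paired within pos_diff bases.

    Toy model only: this is an assumption about how a position window limits
    matching, not a description of bamUtil's internals.
    """
    # Merge both inputs into one stream ordered by position, tagging the source file.
    merged = sorted(
        [(pos, name, 1) for name, pos in recs1]
        + [(pos, name, 2) for name, pos in recs2]
    )
    pending: Dict[Tuple[str, int], int] = {}  # (name, source file) -> position
    unmatched: List[str] = []
    for pos, name, src in merged:
        # Give up on buffered records that are now more than pos_diff behind.
        for key in [k for k, p in pending.items() if pos - p > pos_diff]:
            unmatched.append(key[0])
            del pending[key]
        partner = (name, 3 - src)  # same read name in the other file
        if partner in pending:
            del pending[partner]  # paired up within the window
        else:
            pending[(name, src)] = pos  # wait for the partner to show up
    # Anything still buffered at the end never found a partner.
    unmatched.extend(name for name, _ in pending)
    return sorted(set(unmatched))


# Read r2 is aligned 5000 bases apart in the two inputs.
recs1 = [("r1", 100), ("r2", 200), ("r3", 300)]
recs2 = [("r1", 100), ("r2", 5200), ("r3", 300)]
print(windowed_diff(recs1, recs2, pos_diff=10000))  # [] : r2 is still paired
print(windowed_diff(recs1, recs2, pos_diff=1000))   # ['r2'] : r2 falls outside the window
```

In this model a smaller window means fewer records are held while waiting for a partner, but a read such as `r2` that moved farther than the window is reported as unmatched rather than compared.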
