ccgd-profile / BreaKmer

A method to identify structural variation from sequencing data in target regions

Taking too long to run! #5

Open ryanabo opened 9 years ago

ryanabo commented 9 years ago

A user reported a long run time for BreaKmer. The correspondence is below:

The algorithm has been running for more than 48 hours and still has not finished processing the data. I was wondering if you could offer some insight into which steps in the process are the most time-consuming and how to speed them up.

Dataset: target panel of about 10 genes with mean coverage ~700x

PC: 16 GB RAM

skillcoyne commented 9 years ago

I have questions about this as well. I would like to run this genome-wide, but as a test I have been running it on a single chromosome, for rearrangements and translocations only, for over 72 hours on a cluster node with 28 GB RAM. Can BreaKmer be run over a whole genome, and if so, how?

ryanabo commented 9 years ago

First, I apologize, but BreaKmer was not designed to scale to whole-genome analysis. How are you running it on a single chromosome? Are you just using the genes on that chromosome as targets? I imagine the target reference files would take quite a long time to prepare for this. In any case, I don't have a strategy for tackling a whole-genome analysis with the current version. Future versions may be able to handle it.

I suggest you check out WHAM, developed by Zev Kronenberg. He has developed a nice SV analysis program that is able to handle WGS data.

skillcoyne commented 9 years ago

Ok, that's what I needed to know. I can see how I would use this in targeted analysis but I was asked to run it across a whole genome. Thank you very much for your response.