broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Multithreading #5

Closed stephenrdoyle closed 8 years ago

stephenrdoyle commented 8 years ago

Not so much an issue as a question...

In the manual/help, it suggests using more than a single thread is "experimental".

--threads Degree of parallelism to use for certain processing (default 1). Experimental.

Do you still consider this to be the case, and do you have any feel for what might not work so well? Or is it that it has not been thoroughly tested?

Many thanks Steve

tseemann commented 8 years ago

I had the same question. I've been using --threads 36, and it does use them all for a short period, but most of the time it is only using 2-8 threads.

stephenrdoyle commented 8 years ago

@tseemann have you played with this at all? What sort of run times for a given genome size are typical for you?

Trying to work out how sensible it might be for 350-700 Mb genomes...

w1bw commented 8 years ago

Sorry for taking so long to respond. Questions and issues seem to be moving here rather than the pilon-users mailing list. That's fine, but I need to keep a closer eye here!

I'm using a very coarse-grained way of doing parallelism with the --threads option: it distributes the input FASTA elements across that many threads using high-level Scala collection-level parallelism. With more work, the parallelism could certainly be made more efficient. I frankly have not tried the parallelism on large genomes, but on smaller (e.g., fungal) assemblies, I was seeing up to about 4x speedup.
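To illustrate why per-sequence parallelism tends to plateau, here is a minimal sketch of a toy scheduling model. It is not Pilon's code or its actual scheduler: it assumes each FASTA element is one indivisible task whose cost is proportional to its length, and that a free worker simply picks up the next element. The function name and the example lengths are hypothetical.

```python
import heapq


def simulate_per_sequence_schedule(lengths, threads):
    """Toy model: one indivisible task per sequence, cost == length.

    Each task goes to whichever worker becomes free first.
    Returns (wall_time, speedup_over_one_thread).
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")
    # Each heap entry is the time at which a worker becomes free.
    workers = [0] * threads
    heapq.heapify(workers)
    for cost in lengths:
        free_at = heapq.heappop(workers)
        heapq.heappush(workers, free_at + cost)
    wall_time = max(workers)
    total = sum(lengths)
    speedup = total / wall_time if wall_time else 1.0
    return wall_time, speedup


# Hypothetical assembly: four large scaffolds plus forty small ones (Mb).
lengths = [120, 90, 60, 40] + [1] * 40
for threads in (1, 4, 36):
    wall, speedup = simulate_per_sequence_schedule(lengths, threads)
    print(f"threads={threads} wall={wall} speedup={speedup:.2f}x")
```

Under these assumptions the wall time can never drop below the cost of the longest sequence, so the speedup is capped at total length divided by longest length no matter how many threads are requested. That would be consistent with seeing all threads busy only briefly, although the model ignores memory and garbage-collection effects.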

The problem is that it still requires holding a lot of information in memory, and triggers an awful lot of garbage collection. There's probably room to tune those parameters as well, but as it is, it gives some advantage but not an order of magnitude.

Anyway, that's why I considered it "experimental"; it was a quick-and-dirty approach which gave some benefit. Also, it was put in to try to help people with applications with large numbers of input scaffolds, but at the time I added it and since, I haven't had such problems myself, so I didn't really have a means for doing a lot of testing & tuning to make it more efficient.

stephenrdoyle commented 8 years ago

Many thanks for the update. I had a feeling it was holding a lot in memory, and it hung a few times for me. I think the simplest option is to split the BAM file by chromosome and input smaller chunks. Will keep having a play.
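A minimal sketch of the per-chromosome split mentioned above, assuming samtools is installed and the input BAM is coordinate-sorted and indexed (region queries need the index). The helper names, the `aln.bam` path, and the `split` output prefix are hypothetical. The code only builds the command lines and prints them; it does not run anything.

```python
import shlex


def parse_idxstats(idxstats_text):
    """Return reference names from `samtools idxstats` output.

    The output is tab-separated with the reference name in the first
    column; the final '*' row (unplaced reads) is skipped.
    """
    names = []
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name = line.split("\t")[0]
        if name != "*":
            names.append(name)
    return names


def split_commands(bam_path, names, out_prefix="split"):
    """Build one `samtools view` and one `samtools index` command per reference."""
    commands = []
    for name in names:
        out = f"{out_prefix}.{name}.bam"
        # Extract alignments for this reference into its own BAM file.
        commands.append(["samtools", "view", "-b", "-o", out, bam_path, name])
        # Index the per-reference BAM so downstream tools can read it.
        commands.append(["samtools", "index", out])
    return commands


# Example idxstats text; in practice this would come from `samtools idxstats aln.bam`.
example = "chr1\t1000\t50\t0\nchr2\t800\t30\t1\n*\t0\t0\t7\n"
for cmd in split_commands("aln.bam", parse_idxstats(example)):
    print(shlex.join(cmd))
```

Reference names containing characters such as `/` or `|` would need sanitising before being used in file names. Read pairs whose mates map to a different sequence end up in separate files, which may matter for any analysis that relies on pairing.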