combogenomics / medusa

A draft genome scaffolder that uses multiple reference genomes in a graph-based approach.
http://combo.dbe.unifi.it/medusa/
GNU General Public License v3.0

Parallelization #15

Closed anandksrao closed 6 years ago

anandksrao commented 7 years ago

I wonder if parallelization of the underlying MUMmer would accelerate MeDuSa runs to make it much more attractive. For example, I wonder if this https://github.com/fritzsedlazeck/sge_mummer might be something to easily incorporate in your existing MeDuSa analysis pipelines. Thoughts?

EBosi commented 7 years ago

Hi, Thanks for the heads up, I'll look into it!

anandksrao commented 7 years ago

Thanks, Emanuele! Is this something that could be implemented quickly (if at all), or is it unlikely to feature in the next version any time soon? Asking just so I can plan runs for the next week/month accordingly. Cheers!

EBosi commented 7 years ago

It can definitely be implemented, but not quickly (I have some pressing work for the next few weeks)... sorry about that, I hope it doesn't hold you back too much. If you feel confident about the code, you might try changing the nucmer wrapper in the medusa libraries; it would speed up the release of an updated version.

anandksrao commented 7 years ago

I think it might be best for the authors - you and Fritz to discuss this mutually. I am more of a software user than a coder. But if you need testing help and other downstream help, I'd be most happy to invest some time into this. OR if you are not opposed, I could inquire on biostars, for example, regarding how to change the nucmer wrapper inside MeDuSa to incorporate SGE_MUMmer or SLURM_MUMmer. Thoughts?

EBosi commented 7 years ago

I will work with Fritz (he is a bit busy as well), but it will take a while... I will let you know as soon as we have something.

anandksrao commented 7 years ago

Thank you, Emanuele, and to Fritz as well. Yes, please update us / MeDuSa version when you have a faster / parallelized one - that would be SO MUCH FUN! :)

anandksrao commented 7 years ago

MUMmer 4.0 is parallelized! Repository: https://github.com/gmarcais/mummer Manual: https://github.com/gmarcais/mummer/blob/master/MANUAL.md

From http://mummer.sourceforge.net/ : "The major changes in MUMmer4 primarily affect nucmer, which can now handle genomes of unlimited size and now runs multi-threaded. A paper is in preparation; stay tuned."

davidries84 commented 6 years ago

I changed the corresponding line in mummerRunner.sh to /path/to/nucmer4.0.0 --threads=30 --prefix=$prefix $file1 $file2 and it works.

It's all hard-coded though, so medusa would need an option for the number of threads, and maybe a check for which version of nucmer is installed. Using nucmer 4 also has the advantage that much larger genomes can be used by default.

Best,

David
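
The two suggestions above (a configurable thread count and a nucmer version check) could be sketched roughly as follows. This is only an illustration, not the actual mummerRunner.sh code: the helper names major_version_of and threads_flag and the THREADS variable are hypothetical, and it assumes that `nucmer --version` prints a string whose first number is the major version.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and THREADS are hypothetical, not part of medusa.

# Print the first integer found in a version string,
# which is assumed here to be the major version.
major_version_of() {
    printf '%s\n' "$1" | grep -oE '[0-9]+' | head -n 1
}

# Print "--threads=N" only when the major version is at least 4 and more
# than one thread was requested; print nothing otherwise, so that older
# nucmer builds are called without the option.
threads_flag() {
    local major="${1:-0}" threads="${2:-1}"
    if [ "$major" -ge 4 ] && [ "$threads" -gt 1 ]; then
        printf '%s\n' "--threads=${threads}"
    fi
}

# Possible use inside a wrapper script (left as comments, since it needs
# nucmer and the wrapper's own variables):
#   major=$(major_version_of "$(nucmer --version 2>&1)")
#   nucmer $(threads_flag "$major" "$THREADS") --prefix=$prefix $file1 $file2
```

Leaving the flag out entirely for older versions keeps the wrapper usable with whichever nucmer is on the PATH, at the cost of silently running single-threaded there.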

EBosi commented 6 years ago

Hi, many thanks for the hint. I changed the code accordingly (there is a -threads option now), but I can't really test whether it works right now. Can you please try it? Best, Emanuele

davidries84 commented 6 years ago

Hi, alright, I tested it and it works. Can you help with speeding up the network cleaning step?

Best,

David

EBosi commented 6 years ago

Hi David, unfortunately that requires much more effort which I can't guarantee right now... I'm closing this issue, I will keep you updated on the other thread. Best, Emanuele
