rpetit3 / dragonflye

:dragon: :fly: Assemble bacterial isolate genomes from Nanopore reads
GNU General Public License v3.0

trimming and medaka #7

Closed motroy closed 2 years ago

motroy commented 2 years ago

Hi @rpetit3,

Firstly, thank you for this useful tool, great work and great name:)

Secondly, would it be possible (if the tool doesn't already handle it and I just missed it) to add an adapter trimming (and in some cases demultiplexing) step, like the --trim option in shovill? We use porechop (https://github.com/rrwick/Porechop), but any other option would be good too.

Thirdly, is it possible to use medaka in gpu mode?

thanks again! Yair

rpetit3 commented 2 years ago

Hi @motroy,

Thank you very much! I think it should be straightforward to add porechop (via --trim). I honestly haven't added it because I use Dragonflye with Bactopia (https://github.com/bactopia/bactopia/), and Bactopia runs porechop in a previous step. More importantly, if you have Nanopore tool recommendations, please pass them my way.

For using GPUs with medaka, I think so, but I currently don't have access to GPUs for testing. Is it as simple as specifying --device (https://github.com/nanoporetech/medaka/blob/master/medaka/medaka.py#L410)?

Cheers, Robert

rpetit3 commented 2 years ago

@motroy, for medaka and GPUs, can you confirm that just --device is needed?

rpetit3 commented 2 years ago

Also, never mind on the GPU question. Looking at the README, it looks like tensorflow-gpu would be needed instead of tensorflow.

Would you be willing to test for me?

motroy commented 2 years ago

Hi @rpetit3,

Sorry for the very belated response.

Re: porechop, we use the defaults (though some parameters look potentially useful). For demultiplexing, the repo has recommendations for stringent or lenient binning that may be useful (https://github.com/rrwick/Porechop#barcode-demultiplexing).

Re: medaka, we have yet to use it on a GPU :) but we can help with testing.

rpetit3 commented 2 years ago

No worries!

I released v1.0.8 yesterday, which adds porechop via --trim. There is also --trimopts to pass additional parameters to Porechop. I'm not sure if you'll be able to use it for demultiplexing, though.
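As a rough sketch of how the two options above might be combined (not taken from the release notes): the file name reads.fastq.gz, the output directory assembly, and the choice of Porechop's --discard_middle as the extra parameter are placeholders picked for illustration, and the quoting of the --trimopts value may need adjusting for your shell and version.

```shell
# Placeholder inputs (assumptions for illustration, not from the thread)
READS="reads.fastq.gz"
OUTDIR="assembly"
# Example Porechop parameter to forward; any other Porechop option could go here
TRIMOPTS="--discard_middle"

# Only run when dragonflye is installed and the input file exists
if command -v dragonflye >/dev/null 2>&1 && [ -f "$READS" ]; then
  # --trim enables Porechop; --trimopts forwards extra parameters to it
  dragonflye --reads "$READS" --outdir "$OUTDIR" --trim --trimopts "$TRIMOPTS"
else
  echo "dragonflye or $READS not found; nothing was run" >&2
fi
```

Leaving out --trimopts should run Porechop with its defaults.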

I also updated the Bioconda recipe to include tensorflow-gpu, which I think "should" allow you to use your GPU for the medaka steps.

motroy commented 2 years ago

Hi @rpetit3,

Of note, we ran dragonflye in a Singularity container (built from biocontainers/dragonflye:1.0.8--hdfd78af_0) and initially ran into this tensorflow error:

tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)

The error was resolved by setting CUDA_VISIBLE_DEVICES=0 and using the --nv flag in Singularity (from this issue). Running with Docker may require setting the Docker parameter --runtime=nvidia (as mentioned in the referred issue).
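For anyone trying to reproduce this, the workaround might look roughly like the sketch below. The quay.io registry prefix, the reads.fastq.gz input, and the assembly output directory are assumptions added for illustration. Medaka polishing options are left out and would need to be added as usual.

```shell
# Assumption: the biocontainers image is pulled from quay.io
IMAGE="docker://quay.io/biocontainers/dragonflye:1.0.8--hdfd78af_0"
# Placeholder input and output (not from the thread)
READS="reads.fastq.gz"
OUTDIR="assembly"

# Expose the first GPU to tensorflow inside the container
export CUDA_VISIBLE_DEVICES=0

# Only run when singularity is installed and the input file exists
if command -v singularity >/dev/null 2>&1 && [ -f "$READS" ]; then
  # --nv makes the host NVIDIA driver and devices available in the container
  singularity exec --nv "$IMAGE" dragonflye --reads "$READS" --outdir "$OUTDIR"
else
  echo "singularity or $READS not found; nothing was run" >&2
fi
```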

Thanks for your help and for a very useful tool. Great work! Regards, Yair

rpetit3 commented 2 years ago

That's awesome! Glad you were able to get it working with the GPU!

I'm going to add a note to the README and point it to this comment.

I'm going to go ahead and close this. Please feel free to reopen.