Hydro3639 / NanoPhase

Reference-quality genome reconstruction from complex metagenomes (or bacterial isolates) using only Nanopore long reads or both long and short reads (hybrid strategy)
MIT License

Medaka polishing fails #5

Open jagos01 opened 1 year ago

jagos01 commented 1 year ago

Hello, I am running NanoPhase on the ZymoBIOMICS Microbial Community Standard sequenced with long reads only, and I am having problems with medaka polishing. The first problem was an out-of-memory error, which I fixed by changing the batch size to 50 (from the default 100). Medaka still fails to generate a consensus sequence for half of the bins. If I run medaka manually from the environment, all bins work.

commands: nanophase meta -l /home/data1/Analyzed_data/N2022/Nmetactrrun2/guppy_6.1.7/demux_trim/Zymo_Std2/BC96_ZymoStd2.cat.fl250.fastq.gz -m r941_min_sup_g507 -t 50 -o /home/data1/Analyzed_data/N2022/Nmetactrrun2/nanophase/

medaka_consensus -d /home/data1/Analyzed_data/N2022/Nmetactrrun2/nanophase/03-Polishing/Racon/bin.1/bin.1-racon01.fasta -i /home/data1/Analyzed_data/N2022/Nmetactrrun2/nanophase/03-Polishing/LongSeqs/bin.1-lr.fq -o /home/data1/Analyzed_data/N2022/Nmetactrrun2/nanophase/03-Polishing/test/bin.1.racon.medaka.fasta -t 50 -m r941_min_sup_g507 -b 50

I have attached the medaka polish log. Any help is appreciated. Thanks,

medaka.polish.log

Hydro3639 commented 1 year ago

Thanks for your interest in testing our tool and providing your log file and commands. I am sorry to hear you had such issues.

It is strange that medaka_consensus works when you run it manually but not when it is run through the nanophase meta command. In this step, nanophase meta simply runs these medaka polishing commands in parallel to speed up the analysis. Could you try setting the threads option to -t 2? That way, it will run medaka_consensus one by one.
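To illustrate the one-by-one idea outside of nanophase, here is a minimal dry-run sketch that prints one medaka_consensus command per bin so they can be inspected and run sequentially. It is not how nanophase itself invokes medaka. The flags and the 03-Polishing/Racon and 03-Polishing/LongSeqs layout are copied from the manual command earlier in this thread; OUT, THREADS, the "manual" output folder, and the medaka_cmd helper are hypothetical names used only for illustration.

```shell
# Dry-run sketch: print (do not execute) one medaka_consensus command per bin.
# OUT is a placeholder for the nanophase output directory (assumption).
OUT="${OUT:-/path/to/nanophase_out}"
MODEL="r941_min_sup_g507"   # model used in this thread
THREADS="${THREADS:-2}"     # threads per medaka_consensus call (assumption)
BATCH=50                    # batch size that avoided the out-of-memory error

# Hypothetical helper: build the command line for a single bin name, e.g. bin.1
medaka_cmd() {
    bin="$1"
    printf 'medaka_consensus -d %s -i %s -o %s -t %s -m %s -b %s\n' \
        "$OUT/03-Polishing/Racon/${bin}/${bin}-racon01.fasta" \
        "$OUT/03-Polishing/LongSeqs/${bin}-lr.fq" \
        "$OUT/03-Polishing/manual/${bin}" \
        "$THREADS" "$MODEL" "$BATCH"
}

# Walk the bins sequentially; nothing is printed if no bin directory exists.
for dir in "$OUT"/03-Polishing/Racon/bin.*; do
    [ -d "$dir" ] || continue
    medaka_cmd "$(basename "$dir")"
done
```

Once the printed commands look right, they can be run one at a time, which makes it easier to see which bin triggers a GPU memory or libcublas error.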

I also noticed from medaka.polish.log that you are using a GPU for polishing, which I have never tried. Based on the log file (line 36, line 435, etc.), could you try setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH=true? I also searched the medaka issues for this problem; I am not sure whether it will help, but please have a look if you are interested. I am sorry that I cannot guarantee this will solve every problem, as the log also shows some libcublas.so.11 errors. It is odd that some bins worked while others did not.

Thanks again for bringing this issue to my attention. I will keep it in mind and see how I can improve nanophase's GPU settings. Please let me know if I can provide more help.

jagos01 commented 1 year ago

Hello, thank you for the suggestions. I had already set the environment variable yesterday. I did see the libcublas errors and will try to resolve them soon. In the meantime, I will set the threads to 2. Thanks

yotsa commented 1 year ago

Hi,

I have the same problem with medaka using GPU.

How do I set "TF_FORCE_GPU_ALLOW_GROWTH=true" in nanophase though?

best Yot

Hydro3639 commented 1 year ago

Hi Yot,

I am sorry you had similar problems. You can set the environment variable by running the following command in the nanophase env:

export TF_FORCE_GPU_ALLOW_GROWTH=true
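As a quick check, the sketch below exports the variable and confirms that a child process can see it, which is what matters because medaka is launched as a child of the shell that runs nanophase. The env name in the commented activation line is an assumption; use whatever name your nanophase environment has.

```shell
# Sketch only: activate the environment first (env name is an assumption).
# conda activate nanophase

# Export so that every process started from this shell inherits the setting.
export TF_FORCE_GPU_ALLOW_GROWTH=true

# Confirm that a child process sees the variable.
sh -c 'echo "TF_FORCE_GPU_ALLOW_GROWTH=$TF_FORCE_GPU_ALLOW_GROWTH"'
```

The export only lasts for the current shell session, so it needs to be repeated in each new terminal or job script before running nanophase meta.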

Best