Open comingkms opened 4 months ago
Hi,
May I know which version of nanophase you installed? And for the long-read dataset (lr.fa.gz), are you using SRR17913199? If not, I would suggest using that one.
nanophase v=0.2.3. I used the long-read dataset from https://github.com/example-data/np-example mentioned in your README. I will try SRR17913199.
Thanks
Besides the dataset, I also noticed a mismatch between the Perl version and the version of the Encode module in the log file you provided. You may want to reinstall the Encode module in the nanophase env so that it is properly compiled against your current Perl version (a command you can refer to: cpan -f -i Encode).
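One way to confirm the mismatch before reinstalling is to check where Perl actually finds Encode. In the posted log, Encode.so is loaded from /home/comingkms/perl5 while XSLoader.pm comes from the nanophase env, which suggests the module was picked up from outside the env. The following is a minimal sketch, not part of nanophase: `is_under_prefix` and `check_encode` are hypothetical helper names, and the env root is assumed to be available in `CONDA_PREFIX` or passed as an argument (without a trailing slash).

```shell
# Hypothetical helper: succeeds when path $1 lies inside directory $2.
is_under_prefix() {
    case "$1" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: report where the active perl loads Encode from.
# $1: root of the conda env (assumption: defaults to $CONDA_PREFIX).
check_encode() {
    env_root="${1:-${CONDA_PREFIX:-}}"
    if [ -z "$env_root" ]; then
        echo "no env root given and CONDA_PREFIX is unset" >&2
        return 2
    fi
    # %INC records the file each loaded module came from.
    encode_pm="$(perl -MEncode -e 'print $INC{"Encode.pm"}' 2>/dev/null)"
    if [ -z "$encode_pm" ]; then
        echo "Encode failed to load; PERL5LIB=${PERL5LIB:-<unset>}"
    elif is_under_prefix "$encode_pm" "$env_root"; then
        echo "OK: Encode loaded from the env: $encode_pm"
    else
        echo "WARNING: Encode loaded from outside the env: $encode_pm"
    fi
}
```

With the nanophase env activated, running `check_encode` should print one of the three messages. A warning or a load failure would be consistent with the error in the log.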
Yes, I've realized that, but I still hit the SemiBin issue that has been mentioned before and is caused by the simple dataset. I'll try your SRR17913199. Should I remove the host genome first to improve the bacterial assemblies?
Thanks,
Glad you have resolved it! Yes, due to the limitations of the simple dataset, nanophase will exit during the SemiBin stage. That's why I provided the whole mock dataset: SRR17913199. Just give it a try.
I would suggest removing host genomes before the bacterial assemblies. It will make the assembly process easier and faster, and it lowers the risk of contamination.
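Host removal is not described further in this thread. One common approach is to map the reads against the host genome and keep only the unmapped ones. The sketch below assumes that approach and uses minimap2 and samtools, both of which appear in the package list of the nanophase env. `remove_host_reads` and the `THREADS` variable are hypothetical names, and the file paths are placeholders.

```shell
# Hypothetical helper: keep only reads that do not map to the host genome.
remove_host_reads() {
    if [ "$#" -ne 3 ]; then
        echo "usage: remove_host_reads HOST_FASTA READS OUT_FA_GZ" >&2
        return 2
    fi
    host_fa="$1"
    reads="$2"
    out="$3"
    # -x map-ont is the minimap2 preset for Nanopore reads; -a emits SAM.
    # samtools fasta -f 4 writes only records flagged as unmapped.
    minimap2 -t "${THREADS:-4}" -ax map-ont "$host_fa" "$reads" \
        | samtools fasta -f 4 - \
        | gzip > "$out"
}
```

For example, `remove_host_reads host.fa lr.fa.gz lr.nonhost.fa.gz` would write a gzipped FASTA that can then be passed to nanophase with -l. Reads that map to the host even partially are dropped entirely, which is a deliberately conservative choice.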
Best
Using SRR17913199, I got the following error: "ERROR: Something wrong with medaka polishing, please also check nanophase-out/03-Polishing/medaka/medaka.polish.log, terminating..." Please check the attached medaka.log, which indicates an out-of-memory issue. I am wondering how much GPU memory is needed. Mine is an RTX 3090 (24 GB). Attachment: medaka.polish.log
I don't have much experience running Medaka polishing with a GPU. If it is a GPU memory issue, you might consider lowering the number of threads to 2 (-t 2). This way, nanophase will run only one Medaka polishing job at a time, reducing the GPU memory requirement. Alternatively, you can use the CPU for polishing by setting export CUDA_VISIBLE_DEVICES=""
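The two suggestions above could be combined roughly as follows. This is a sketch under assumptions: the nvidia-smi snapshot is an optional extra that is not mentioned in the thread, and the read path in the commented nanophase command is a placeholder.

```shell
# Optional: take a snapshot of GPU memory use before changing anything.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv || true
fi

# Hide all GPUs from CUDA so that polishing runs on the CPU instead.
export CUDA_VISIBLE_DEVICES=""

# Then rerun with fewer threads, as suggested above (placeholder paths):
# nanophase meta -l lr.fa.gz -t 2 -o nanophase-out
```

The export only affects the current shell session and the processes started from it, so nanophase has to be launched from the same shell.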
Finally, I could complete the run using SRR17913199 with some modifications:
Hi, thank you for the suggestion regarding the memory usage of pplacer. I agree that pplacer requires substantial memory, and using the --scratch_dir flag can reduce the memory load during this step.
The main reason we didn't include this flag in the default settings is that, for many environmental samples with higher complexity than the mock dataset, the most memory-intensive step is actually the long-read assembly with Flye. Although Flye is quite memory-efficient, it remains the primary memory-consuming process. I may consider adding an optional flag for --scratch_dir in the future to give users more flexibility in managing memory usage, but it is not a high-priority task at the moment.
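Until such an option exists, classification could be run by hand with GTDB-Tk, which is where the --scratch_dir flag comes from. The sketch below is an assumption about how that might look, not something nanophase does itself: `classify_low_mem` and `THREADS` are hypothetical names, the directories are placeholders, and some GTDB-Tk releases may require additional options, so check `gtdbtk classify_wf --help` for the installed version.

```shell
# Hypothetical wrapper: classify bins with pplacer using on-disk scratch
# space, which lowers peak memory at the cost of speed.
classify_low_mem() {
    if [ "$#" -ne 3 ]; then
        echo "usage: classify_low_mem BIN_DIR OUT_DIR SCRATCH_DIR" >&2
        return 2
    fi
    mkdir -p "$3"
    # --pplacer_cpus 1 keeps pplacer single-threaded, which also limits memory.
    gtdbtk classify_wf \
        --genome_dir "$1" \
        --out_dir "$2" \
        --extension fasta \
        --cpus "${THREADS:-4}" \
        --pplacer_cpus 1 \
        --scratch_dir "$3"
}
```

The `--extension fasta` value matches the bin*fasta file names seen in the log; adjust it if the final bins use a different suffix.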
Hi,
Installation went fine, but I got an error when running your test dataset. Thanks,
nanophase meta -l '/home/Downloads/lr.fa.gz' -t 12 -o nanophase-out
[2024-06-20 14:48:58] INFO: nanophase (meta) starts
[2024-06-20 14:48:58] INFO: Command line: /home/comingkms/anaconda3/envs/nanophase/bin/nanophase meta -l /home/comingkms/Downloads/lr.fa.gz -t 12 -o nanophase-out
[2024-06-20 14:48:58] INFO: long_read_only model was selected, only Nanopore long reads will be used
[2024-06-20 14:48:58] CHECK: Nanopore long-read (fa.gz) file has been found
[2024-06-20 14:48:58] CHECK: Check software availability and locations
[2024-06-20 14:48:59] INFO: The following packages have been found
package      location
nanophase    /home/comingkms/anaconda3/envs/nanophase/bin/nanophase
flye         /home/comingkms/anaconda3/envs/nanophase/bin/flye
metabat2     /home/comingkms/anaconda3/envs/nanophase/bin/metabat2
maxbin2      /home/comingkms/anaconda3/envs/nanophase/bin/run_MaxBin.pl
SemiBin      /home/comingkms/anaconda3/envs/nanophase/bin/SemiBin
metawrap     /home/comingkms/anaconda3/envs/nanophase/bin/metawrap
checkm       /home/comingkms/anaconda3/envs/nanophase/bin/checkm
racon        /home/comingkms/anaconda3/envs/nanophase/bin/racon
medaka       /home/comingkms/anaconda3/envs/nanophase/bin/medaka
polypolish   /home/comingkms/anaconda3/envs/nanophase/bin/polypolish
POLCA        /home/comingkms/anaconda3/envs/nanophase/bin/polca.sh
bwa          /home/comingkms/anaconda3/envs/nanophase/bin/bwa
seqtk        /home/comingkms/anaconda3/envs/nanophase/bin/seqtk
minimap2     /home/comingkms/anaconda3/envs/nanophase/bin/minimap2
BBMap        /home/comingkms/anaconda3/envs/nanophase/bin/BBMap
parallel     /home/comingkms/anaconda3/envs/nanophase/bin/parallel
perl         /home/comingkms/anaconda3/envs/nanophase/bin/perl
samtools     /home/comingkms/anaconda3/envs/nanophase/bin/samtools
gtdbtk       /home/comingkms/anaconda3/envs/nanophase/bin/gtdbtk
fastANI      /home/comingkms/anaconda3/envs/nanophase/bin/fastANI
All required packages have been found in the environment.
If the above certain packages integrated into nanophase were used in your investigation, please give them credit as well :)
[2024-06-20 14:48:59] TASK: Long-read assembly starts (be patient)
[2024-06-20 14:55:31] DONE: long-read assembly finished successfully: detailed log file is nanophase-out/01-LongAssemblies/flye.log
[2024-06-20 14:55:31] TASK: Initial binning::metabat2 binning starts
[2024-06-20 14:55:32] DONE: Initial binning::metabat2 binning finished successfully
MetaBAT 2 (v2.12.1) using minContig 2500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200.
1 bins (2028309 bases in total) formed.
[2024-06-20 14:55:32] TASK: Initial binning::maxbin2 binning starts
Can't load '/home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Encode/Encode.so' for module Encode: /home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Encode/Encode.so: undefined symbol: Perl__is_utf8_char_helper at /home/comingkms/anaconda3/envs/nanophase/lib/perl5/core_perl/XSLoader.pm line 93.
 at /home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/Encode.pm line 12.
BEGIN failed--compilation aborted at /home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/Encode.pm line 13.
Compilation failed in require at /home/comingkms/anaconda3/envs/nanophase/lib/perl5/site_perl/LWP/UserAgent.pm line 1073.
Compilation failed in require at /home/comingkms/anaconda3/envs/nanophase/bin/run_MaxBin.pl line 4.
BEGIN failed--compilation aborted at /home/comingkms/anaconda3/envs/nanophase/bin/run_MaxBin.pl line 4.
mv: cannot stat 'nanophase-out/02-LongBins/INITIAL_BINNING/maxbin2/bin*fasta': No such file or directory
[2024-06-20 14:55:32] ERROR: Something wrong with maxbin binning, terminating...