Hydro3639 / NanoPhase

Reference-quality genome reconstruction from complex metagenomes (or bacterial isolates) using only Nanopore long reads or both long and short reads (hybrid strategy)
MIT License

Confused about error message #1

Closed: bcpd closed this issue 1 year ago

bcpd commented 1 year ago

First of all, this is an excellent analytical tool; thank you. I was able to install nanophase and successfully test the nanophase meta pipeline following the usage tutorial. I then tried publicly available datasets (a Zymo mock community sequenced with the R10 kit and an environmental sample sequenced with the R9 kit), but got the following error at the very beginning of the run:

ERROR: Something wrong with long-read (metaflye) assembly, please also check ont-nanophase-out/01-LongAssemblies/tmp/flye.log for more information, terminating...

I cannot figure out what may be happening from the attached log file:

[2022-12-08 14:50:15] root: INFO: Starting Flye 2.9-b1768
[2022-12-08 14:50:15] root: DEBUG: Cmd: /home/mixtures/miniconda3/envs/nanophase/bin/flye --meta --nano-hq zymo_hmw_r104.fastq.gz -t 10 -i 2 -g 5m -o ont-nanophase-out-zymohmw/01-LongAssemblies/tmp
[2022-12-08 14:50:15] root: DEBUG: Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) 
[GCC 10.3.0]
[2022-12-08 14:50:15] root: INFO: >>>STAGE: configure
[2022-12-08 14:50:15] root: INFO: Configuring run
[2022-12-08 14:51:11] root: INFO: Starting Flye 2.9-b1768
[2022-12-08 14:51:11] root: DEBUG: Cmd: /home/mixtures/miniconda3/envs/nanophase/bin/flye --meta --nano-hq zymo_hmw_r104.fastq.gz -t 10 -i 2 -g 5m -o ont-nanophase-out-zymohmw/01-LongAssemblies/tmp
[2022-12-08 14:51:11] root: DEBUG: Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) 
[GCC 10.3.0]
[2022-12-08 14:51:11] root: INFO: >>>STAGE: configure
[2022-12-08 14:51:11] root: INFO: Configuring run
[2022-12-08 14:52:50] root: INFO: Starting Flye 2.9-b1768
[2022-12-08 14:52:50] root: DEBUG: Cmd: /home/mixtures/miniconda3/envs/nanophase/bin/flye --meta --nano-hq zymo_hmw_r104.fastq.gz -t 10 -i 2 -g 5m -o ont-nanophase-out-zymohmw/01-LongAssemblies/tmp
[2022-12-08 14:52:50] root: DEBUG: Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) 
[GCC 10.3.0]
[2022-12-08 14:52:50] root: INFO: >>>STAGE: configure
[2022-12-08 14:52:50] root: INFO: Configuring run

Hydro3639 commented 1 year ago

Thanks for testing our pipeline. It seems the Flye assembly failed, but the real cause is hard to track down because the pipeline's logging is currently limited. We will improve this in an upcoming release with a more user-friendly log file.

Here are my suggestions: 1) Could you check the input file, i.e. the downloaded Zymo dataset? 2) If there is no problem with your zymo_hmw_r104.fastq.gz file and you can run our example dataset successfully, then try running the following command (in the nanophase env) to see what the error information is:

flye --meta --nano-hq zymo_hmw_r104.fastq.gz -t 10 -i 2 -g 5m -o ont-nanophase-out-zymohmw/01-LongAssemblies/tmp
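
For suggestion 1, a quick way to check a gzipped FASTQ without running the assembler is to decompress it from start to end and see whether the stream finishes cleanly. The following is only a minimal sketch: the helper name gzip_is_complete is hypothetical and is not part of NanoPhase or Flye.

```python
import gzip
import zlib


def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses cleanly to its end.

    Hypothetical helper for illustration; not part of NanoPhase or Flye.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks and discard the data; a truncated file
            # raises EOFError before the end of the stream is reached.
            while handle.read(chunk_size):
                pass
    except (EOFError, gzip.BadGzipFile, zlib.error):
        return False
    return True


if __name__ == "__main__":
    import sys

    # Usage: python check_gzip.py reads.fastq.gz [more files ...]
    for name in sys.argv[1:]:
        print(name, "OK" if gzip_is_complete(name) else "truncated or corrupt")
```

On the command line, gzip -t zymo_hmw_r104.fastq.gz performs a similar integrity test.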

Please let me know if you need more help.

bcpd commented 1 year ago

Thank you for your prompt reply. To determine whether the file was behind the issue, I had to try suggestion 2. Indeed, the file seems to be corrupted:

EOFError("Compressed file ended before the "

Hydro3639 commented 1 year ago

So it seems the download was incomplete. Try downloading the file again.
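
If the data source publishes an MD5 checksum for the file (an assumption; not every repository does), comparing it with the checksum of the local copy is a simple way to confirm the new download is complete. A minimal sketch, with hypothetical helper names:

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_md5):
    """Compare the local file's MD5 with a checksum copied from the data source."""
    return file_md5(path) == expected_md5.strip().lower()
```

The expected_md5 value has to come from the download page or the archive's metadata; it cannot be derived from the local file.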

bcpd commented 1 year ago

Thank you. Yes, that appears to be what happened. I tried another test file, but this time the pipeline stopped at the binning phase (specifically, during CheckM). I will close the issue for now.