mikolmogorov / Flye

De novo assembler for single molecule sequencing reads using repeat graphs

ERROR: Error running minimap2 #351

Closed kazuaki0816 closed 3 years ago

kazuaki0816 commented 3 years ago

I performed a de novo assembly using PacBio CLR reads at approximately 100x coverage (genome size: 4.7 Gb). The run failed with the following error:

ERROR: Error running minimap2, terminating. See the alignment error log for details: /home/local/flye_output/40-polishing/minimap.stderr
[2021-02-06 09:59:12] root: ERROR: Command '['/bin/bash', '-c', 'set -o pipefail; flye-minimap2 /home/local/flye_output/40-polishing/chunks_1.fasta /home/local/PacBio/Jan05-2021/sequel2-reads-6cells.fasta -x map-pb -t 128 -a -p 0.5 -N 10 --sam-hit-only -L -Q --secondary-seq -I 64G | flye-samtools view -T /home/local/flye_output/40-polishing/chunks_1.fasta -u - | flye-samtools sort -T /home/local/flye_output/40-polishing/sort_210206_040958 -O bam -@ 4 -l 1 -m 4G']' returned non-zero exit status 1.

I suspected this error was caused by the open file limit, as in a past issue, and changed the limit with the ulimit -n command.

I have questions about the error.

  1. Is the troubleshooting I performed correct? Is there any other solution?

  2. What commands do I need to run to perform the flye-minimap2 step and the subsequent steps one at a time?

Best regards,

mikolmogorov commented 3 years ago

Please post flye.log and flye_output/40-polishing/minimap.stderr from the Flye output directory.

kazuaki0816 commented 3 years ago

I attach the log files: flye.copy1.log.gz, minimap.stderr.txt

mikolmogorov commented 3 years ago

Thank you - could it be that the machine ran out of disk space? You have ~0.5 TB of input data - I'd say you might need up to 1-1.5 TB of temporary space to process everything.

Try to ensure that you have enough space, and then restart the (last) polishing stage by adding --resume-from polishing. I would also try reducing the number of threads to 64.
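The free-space check suggested above can be scripted before resuming. This is a minimal sketch using only the Python standard library; the 1.5 TB threshold is taken from the estimate in this thread, and the function names and the example directory are hypothetical, not part of Flye.

```python
import shutil


def free_space_tb(path="."):
    """Return the free space on the filesystem holding `path`, in TB (10**12 bytes)."""
    usage = shutil.disk_usage(path)
    return usage.free / 1e12


def has_enough_space(path=".", required_tb=1.5):
    """True if the filesystem holding `path` has at least `required_tb` TB free.

    The 1.5 TB default is only the rough upper estimate quoted in this thread.
    """
    return free_space_tb(path) >= required_tb


if __name__ == "__main__":
    out_dir = "."  # hypothetical: point this at the Flye output directory
    print(f"free: {free_space_tb(out_dir):.2f} TB, "
          f"enough: {has_enough_space(out_dir)}")
```

Note that the check should target the filesystem that actually holds the output directory, since temporary sort files are written there according to the command shown in the log.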

kazuaki0816 commented 3 years ago

I have 5.8 TB of disk space available. The ulimit -n command didn't solve the problem either, so I'll try with 64 threads.
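For reference, the open-file limit that ulimit -n reports can also be inspected from Python. The sketch below uses the standard resource module (Unix only); the helper names are hypothetical. A limit changed this way applies only to the current Python process and its children, so it would affect Flye only if Flye were launched from that same process.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Set the soft open-file limit to `target`, capped at the hard limit.

    `target` is expected to be a positive integer. Returns the value applied.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
```

As reported above, changing this limit did not resolve the error in this case, so the snippet is useful mainly for ruling the limit out.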

mikolmogorov commented 3 years ago

Closing due to inactivity - feel free to reopen if the issue is still unsolved.

marc2680 commented 3 years ago

I seem to have the same issue running Flye (from within Geneious) and it might be due to the latest Flye builds, as another lab used Flye 2.8.1 on the same data set and that worked. Relevant sections of the error logs:

-----------End assembly log------------
[2021-03-11 16:08:38] root: DEBUG: Disjointigs length: 12242670, N50: 2887054
[2021-03-11 16:08:38] root: INFO: >>>STAGE: consensus
[2021-03-11 16:08:39] root: INFO: Running Minimap2
[2021-03-11 16:09:34] root: ERROR: Error running minimap2, terminating. See the alignment error log for details: out/10-consensus/minimap.stderr
[2021-03-11 16:09:34] root: ERROR: Command '['/bin/bash', '-c', 'set -o pipefail; flye-minimap2 out/10-consensus/chunks.fasta input_0_Unpaired.fastq -x map-ont -t 8 -a -p 0.5 -N 10 --sam-hit-only -L -Q --secondary-seq | flye-samtools view -T out/10-consensus/chunks.fasta -u - | flye-samtools sort -T out/10-consensus/sort_210311_160839 -O bam -@ 4 -l 1 -m 500M']' returned non-zero exit status 1.
[2021-03-11 16:09:34] root: ERROR: Pipeline aborted

-----End of corresponding minimap.stderr---
[E::sam_parse1] query name too long
[W::sam_read1] Parse error at line 13
[main_samview] truncated file.

marc2680 commented 3 years ago

Oh, and in my case these were MinION data (1.6 Gb).

mikolmogorov commented 3 years ago

@marc2680 based on the error log, it looks like your read names are longer than 255 characters - is that so?
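One way to answer that question is to scan the input for over-long names. This is a minimal sketch, assuming a strict four-line-per-record FASTQ and assuming that only the part of the header before the first whitespace ends up as the SAM query name. The default of 254 reflects the QNAME length limit in the SAM specification as I understand it; the function names are hypothetical.

```python
import gzip

MAX_QNAME = 254  # assumed QNAME length limit from the SAM specification


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def long_read_names(fastq_path, max_len=MAX_QNAME):
    """Yield (name, length) for every read whose name exceeds `max_len`.

    Assumes exactly four lines per FASTQ record, so header lines are
    those at positions 0, 4, 8, ...
    """
    with open_text(fastq_path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue
            fields = line[1:].split()  # drop the leading '@', split off the description
            name = fields[0] if fields else ""
            if len(name) > max_len:
                yield name, len(name)
```

Usage: `for name, n in long_read_names("input_0_Unpaired.fastq"): print(n, name[:40])` lists the offending reads with a truncated name.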

marc2680 commented 3 years ago

You are right, the read names are about 300 characters long. I now batch renamed all runs, and the assembly worked like a charm. Thanks a lot for your help!
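The thread does not say how the renaming was done. As an illustration only, the sketch below replaces every FASTQ header with a short sequential name and returns a mapping back to the original headers. It assumes an uncompressed, strict four-line-per-record FASTQ; the function name and the `read_N` naming scheme are hypothetical.

```python
def rename_reads(in_path, out_path, prefix="read"):
    """Copy a four-line FASTQ to `out_path` with headers replaced by prefix_1, prefix_2, ...

    Returns a dict mapping each new name to the original header text
    (without the leading '@'), so the renaming can be traced back later.
    """
    mapping = {}
    with open(in_path) as src, open(out_path, "w") as dst:
        for i, line in enumerate(src):
            if i % 4 == 0:  # header line of a record
                new_name = f"{prefix}_{i // 4 + 1}"
                mapping[new_name] = line[1:].rstrip("\n")
                dst.write(f"@{new_name}\n")
            else:  # sequence, separator, and quality lines are copied unchanged
                dst.write(line)
    return mapping
```

Keeping the mapping matters if metadata stored in the original headers (run, channel, barcode) is needed after assembly.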

naushintab commented 2 years ago

Hi, I have been facing some trouble running Flye on my nanopore metagenome samples. Here is the command and the error I have been receiving:

flye --nano-raw combined_barcode01.fastq.gz --out-dir assembled -t 10 -i 2 --meta
[2022-04-25 16:33:07] INFO: Starting Flye 2.8.3-b1695
[2022-04-25 16:33:07] INFO: >>>STAGE: configure
[2022-04-25 16:33:07] INFO: Configuring run
[2022-04-25 16:33:12] INFO: Total read length: 216520271
[2022-04-25 16:33:12] INFO: Reads N50/N90: 3479 / 2493
[2022-04-25 16:33:12] INFO: Minimum overlap set to 2000
[2022-04-25 16:33:12] INFO: >>>STAGE: assembly
[2022-04-25 16:33:12] INFO: Assembling disjointigs
[2022-04-25 16:33:12] INFO: Reading sequences
[2022-04-25 16:33:17] INFO: Counting k-mers: 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-04-25 16:34:05] INFO: Filling index table (1/2) 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-04-25 16:34:17] INFO: Filling index table (2/2) 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-04-25 16:34:31] INFO: Extending reads
[2022-04-25 16:34:33] INFO: Overlap-based coverage: 3
[2022-04-25 16:34:33] INFO: Median overlap divergence: 0.119569 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-04-25 16:35:07] INFO: Assembled 182 disjointigs
[2022-04-25 16:35:07] INFO: Generating sequence 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-04-25 16:35:07] INFO: >>>STAGE: consensus
[2022-04-25 16:35:07] INFO: Running Minimap2
[2022-04-25 16:35:07] ERROR: Error running minimap2, terminating. See the alignment error log for details: /home/ngri/Nanopore Sequencing Data (FCBR)/Fastq_pass_files/R4/barcode01/assembled/10-consensus/minimap.stderr
[2022-04-25 16:35:07] ERROR: Command '['/bin/bash', '-c', 'set -o pipefail; flye-minimap2 /home/ngri/Nanopore Sequencing Data (FCBR)/Fastq_pass_files/R4/barcode01/assembled/10-consensus/chunks.fasta combined_barcode01.fastq.gz -x map-ont -t 10 -a -p 0.5 -N 10 --sam-hit-only -L -z 1000 -Q --secondary-seq -I 64G | flye-samtools view -T /home/ngri/Nanopore Sequencing Data (FCBR)/Fastq_pass_files/R4/barcode01/assembled/10-consensus/chunks.fasta -u - | flye-samtools sort -T /home/ngri/Nanopore Sequencing Data (FCBR)/Fastq_pass_files/R4/barcode01/assembled/10-consensus/sort_220425_163507 -O bam -@ 4 -l 1 -m 1G']' returned non-zero exit status 1.
[2022-04-25 16:35:07] ERROR: Pipeline aborted

and here are the log files: drive-download-20220425T110157Z-001.zip

Any help is appreciated, thanks!

mikolmogorov commented 2 years ago

@naushintab looks like a bracket symbol in your file path is causing the issue. The most recent Flye release (2.9) should work.
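The command in the log above is passed to bash -c as a single string with the output path unquoted, so characters such as spaces and parentheses are interpreted by the shell. The sketch below is not Flye's code. It only illustrates, with Python's standard shlex module, how to tell whether a path would need quoting in a shell command; the helper name is hypothetical.

```python
import shlex


def needs_quoting(path):
    """True if `path` contains characters the shell would treat specially."""
    # shlex.quote returns its argument unchanged only when every character is shell-safe.
    return shlex.quote(path) != path


if __name__ == "__main__":
    out_dir = ("/home/ngri/Nanopore Sequencing Data (FCBR)/"
               "Fastq_pass_files/R4/barcode01/assembled")
    print(needs_quoting(out_dir))  # this path contains spaces and parentheses
    print(shlex.quote(out_dir))    # the same path, wrapped so bash sees one word
```

If upgrading is not an option, running the assembly from a directory whose full path contains no spaces or parentheses would presumably avoid the problem as well, although that workaround is not confirmed in this thread.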