voutcn / megahit

Ultra-fast and memory-efficient (meta-)genome assembler
http://www.ncbi.nlm.nih.gov/pubmed/25609793
GNU General Public License v3.0

Combining pair-end reads and single end in megahit #277

Closed fconstancias closed 4 years ago

fconstancias commented 4 years ago

Dear megahit developers,

I am working on a public dataset in which, for reasons unknown to me, only 10% of the reads have a mate (pair1 and pair2); the rest of the data are single reads. To perform the (co)-assembly of the samples, I ran megahit with the following options:

megahit -1 $R1s -2 $R2s -r $RSs --min-contig-len 1000 -m 0.90 -o 02_ASSEMBLY/ -t $NSLOTS

I thought it would use all the provided reads, but from the log it seems to be using only the -1 and -2 reads.

2020-06-11 22:56:03 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 0 (/datadrive05/Flo/AM/fastq_corrected/DYVR1_allR1.fastq.gz,/datadrive05/Flo/AM/fastq_corrected/DYVR1_allR2.fastq.gz): pe, 71482560 reads, 251 max length'
2020-06-11 23:10:03 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 1 (/datadrive05/Flo/AM/fastq_corrected/DYVR2_allR1.fastq.gz,/datadrive05/Flo/AM/fastq_corrected/DYVR2_allR2.fastq.gz): pe, 74960600 reads, 251 max length'
2020-06-11 23:13:44 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 2 (/datadrive05/Flo/AM/fastq_corrected/DYVR3_allR1.fastq.gz,/datadrive05/Flo/AM/fastq_corrected/DYVR3_allR2.fastq.gz): pe, 75433780 reads, 251 max length'

Is there any way to make use of all the information?

fconstancias commented 4 years ago

Well, I was a bit impatient: it does use the single-end reads. They are loaded as a separate library after the paired-end ones:

780 reads, 251 max length'
2020-06-11 23:21:05 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 3 (/datadrive05/Flo/AM/fastq_corrected/DYVR1_allS.fastq.gz): se, 78517440 reads, 251 max length'
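
For anyone landing here with the same question, below is a minimal sketch of how the comma-separated lists passed to -1, -2 and -r could be built, and how to check a log for both paired-end (pe) and single-end (se) libraries. The helper names join_by_comma and count_lib_type are made up for illustration. The file names are assumptions modelled on the log above (only DYVR1_allS.fastq.gz actually appears there; the other two single-end names are guesses), and the command is printed rather than executed so the lists can be inspected first.

```shell
# Join arguments with commas, the list form that megahit's -1/-2/-r options accept.
# Runs in a subshell so the IFS change does not leak into the caller.
join_by_comma() (
  IFS=,
  echo "$*"
)

# Count log lines that report a library of the given type ("pe" or "se").
# $1 = path to a megahit log file, $2 = library type.
count_lib_type() {
  grep -c "): $2, " "$1"
}

# Assumed file names (hypothetical), following the naming pattern seen in the log.
R1s=$(join_by_comma DYVR1_allR1.fastq.gz DYVR2_allR1.fastq.gz DYVR3_allR1.fastq.gz)
R2s=$(join_by_comma DYVR1_allR2.fastq.gz DYVR2_allR2.fastq.gz DYVR3_allR2.fastq.gz)
RSs=$(join_by_comma DYVR1_allS.fastq.gz DYVR2_allS.fastq.gz DYVR3_allS.fastq.gz)

# Print the command instead of running it; drop the leading "echo" to launch the assembly.
echo megahit -1 "$R1s" -2 "$R2s" -r "$RSs" --min-contig-len 1000 -m 0.90 -o 02_ASSEMBLY/ -t "${NSLOTS:-1}"
```

Once all inputs have been read, running count_lib_type on the log with se as the second argument should report a non-zero count if the -r files were picked up. As the thread shows, the single-end libraries are listed after the paired-end ones, so they can take a while to appear.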