dzerbino / velvet

Short read de novo assembler using de Bruijn graphs, as published in: D.R. Zerbino and E. Birney. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Research, 18: 821-829
https://europepmc.org/article/pmc/2336801
GNU General Public License v2.0

Right sequence file has too many reads #36

Closed by madmolecularman 6 years ago

madmolecularman commented 7 years ago

Hello!

Hope you're having a nice day. I am currently having some issues with velveth during transcriptome assembly of Illumina paired-end reads. I have concatenated the reads into left and right normalized read files at two different coverages, 30 and 100, and I am running velveth on these normalized reads. The normalization program came from Trinity. I seem to be having an issue with creating the de Bruijn graph and getting the roadmap.

Error:

"velveth: Right sequence file '/projects/rockfish/transcriptomics/EXP7_Scarn_larvae/PE/Assembly/brown_rockfish_larv_EXP7_norm_30/right.norm.fq' has too many sequences"

(The same message is printed six times.)

The right.norm.fq file has 66909286
The left.norm.fq file has 66009874
The sequences file from oases_velh100_19 has 132019748

This makes sense, as it is a concatenation of these two files, but it does not seem to be resolving into a roadmap. Should I add an extra normalization or QC step for these reads? Thanks for your time!

Sincerely,

The Madmolecularman

madmolecularman commented 7 years ago

Forgot to add the input for the run.

for i in 19 25 31 37 43 49; do
  /opt/velvet_1.2.10/velveth /projects/rockfish/transcriptomics/EXP7_Scarn_larvae/PE/Assembly/oasesvelh100$i $i -fastq -shortPaired -separate \
    /projects/rockfish/transcriptomics/EXP7_Scarn_larvae/PE/Assembly/brown_rockfish_larv_EXP7_norm_100/left.norm.fq \
    /projects/rockfish/transcriptomics/EXP7_Scarn_larvae/PE/Assembly/brown_rockfish_larv_EXP7_norm_100/right.norm.fq \
    > /projects/rockfish/transcriptomics/EXP7_Scarn_larvae/PE/Assembly/brown_rockfish_larv_EXP7_oasesvelh${i}_out_100.txt
done

Thanks for your time!

madmolecularman commented 7 years ago

Found out my issue! The normalized reads need to have the same number of sequences. Don't know why the first normalization was weird, but the second iteration and a few extra commands solved it.
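For anyone hitting the same error, here is a minimal sketch of how the two files could be compared before running velveth. It assumes standard FASTQ with exactly four lines per record (no wrapped sequence lines); the function names `count_fastq_reads` and `check_pair_counts` are hypothetical helpers, not part of Velvet.

```shell
#!/usr/bin/env bash

# Count reads in a FASTQ file.
# Assumption: every record is exactly four lines (no wrapped sequences).
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Print both counts and return success only when they match.
check_pair_counts() {
    local left right
    left=$(count_fastq_reads "$1")
    right=$(count_fastq_reads "$2")
    echo "left=$left right=$right"
    [ "$left" -eq "$right" ]
}
```

Something like `check_pair_counts left.norm.fq right.norm.fq || echo "read counts differ"` would then flag a mismatch early. Note that equal counts alone do not prove the reads are in the same order in both files.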

dzerbino commented 6 years ago

Hello @madmolecularman, sorry for the slow response. Glad you sorted it out in the end.