dzerbino / velvet

Short read de novo assembler using de Bruijn graphs, as published in: D.R. Zerbino and E. Birney. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Research, 18: 821-829
https://europepmc.org/article/pmc/2336801
GNU General Public License v2.0

"not seem to be in FastQ format" error #29

Closed homopolymer closed 9 years ago

homopolymer commented 10 years ago

Hi Zerbino

I am trying to run Velvet on a small Roche/454 dataset. However, when I ran velveth, it reported the error "velveth: ls454_read.fastq does not seem to be in FastQ format". I have no idea why the file is considered invalid, since I have already run other assemblers on the same FASTQ file.

Here is the command that I used: "velveth Assemb 21 -long -fastq ls454_read.fastq"

Here is the output of "head ls454_read.fastq":

@SRR003796.187567
CCCAGGGCTTTGGGAGGCTGAGACAGGTCGACCATCTGAGGTCAGGATTGAGAACAGCCTGGCCAACATGGTGAAACCCTGTCTTTACTAAACATACAAAAATTAGCTGGGTGTGGTGGGGGGCACCTATAGTCCCAGCTACTTGAGAGGCCAAAGCAGGACAATCACTTGAACCTGGAAGGCAAAGACTGCAGTGAGCTAAGATCATGCCACAGCACTACAGCCTGGGTGACAGGGTGAGACTGCCTCAAAAAACAAAAGCAAAAAACCCTTTTATTGACAAATTTTACAAAGACTGATACTAGAGTTTGTTGGACAGGCCGTTGTCCCCACATAATATCTTAAGAGTTGCCGATGCACAAGTAAGGTGGTAAAATGCCTTTGAAAAACTGACCATACCTGCTATACTTAAATATTCAGATACAAAACTCCCAGAAATTCCATCTTGTCATTTATTCCACAAGAAAAAACAGAGACCTGTAATTAAAATATCCCTAATTTCCTAATTAAAATTATCACACTAGGTTTCTTCTAAATTTTTTT
+
???@?AA@@AA@@@@@@@@@@@@>A@@?@@=@A??A@@?@??>@@?@@@@@@>A@?@@@@@@A@>A@A@@@@@@>@@AA@@AAA@>@@@@>A@@?=;;;;?@;@>A@@@@@@@?=:<<<<=A>@A@@@@@@@@A@@A@AAA@@@@@@@@???@A@A@?A@@@A>AAA@@>@AAA@@@A@A@@@>?@@@A@@A@@@A@@?@@@@@A??A>A@@A>A@>@@?@AAA?@:@=>;>>;:==@=A=>=:;:::::99999:=;::::::==>@;;9:;<@:>>@AAA=A@@?@=@A@?@=A?@;==?:9:=@<?:<;99:?:9<<<?:A:8:?;=?AA?@?@==@@@@:@?@@A;@@?@@?A==?:<<?@@@?@@?@@@@@=A?@=?@@@@A@?@===@@;>;889;?>=;=:99<:=:>>A:=??=?=>?;:8988=>8999:97::9:98:8<<;;:8:<:=:?<<<<<?A?@@@=:==>=;8997::<899999:9:9;<==:<:8:9999:9:<:;;:;>>=99

dzerbino commented 10 years ago

Did you look at the tail of your file for an empty line or corrupt data? The only test within velvet is to check that every fourth line starts with @.

Try: awk 'NR % 4 == 1 && ! /^@/ {print "Error on line "NR}' ls454_read.fastq
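The awk one-liner above only reports header lines that do not start with @. A slightly broader check can be sketched in Python as follows. This is an illustrative sketch, not Velvet's own test: the function name `check_fastq` is hypothetical, and the extra checks (a `+` separator line, equal sequence and quality lengths, carriage returns, a line count that is a multiple of 4) are assumptions about common FASTQ problems. It also assumes each sequence and quality string sits on a single line.

```python
from typing import Iterable, List


def check_fastq(lines: Iterable[str]) -> List[str]:
    """Return a list of problems found, assuming strict 4-line FASTQ records."""
    problems = []
    record = []  # lines of the record currently being read
    count = 0    # total number of lines seen
    for number, raw in enumerate(lines, start=1):
        count = number
        line = raw.rstrip("\n")
        if line.endswith("\r"):
            # A leftover "\r" suggests Windows (CRLF) or old Mac (CR) line endings.
            problems.append(f"line {number}: carriage return at end of line")
            line = line.rstrip("\r")
        position = (number - 1) % 4  # 0 header, 1 sequence, 2 separator, 3 quality
        if position == 0 and not line.startswith("@"):
            problems.append(f"line {number}: header does not start with '@'")
        elif position == 2 and not line.startswith("+"):
            problems.append(f"line {number}: separator does not start with '+'")
        record.append(line)
        if position == 3:
            if len(record[1]) != len(record[3]):
                problems.append(
                    f"line {number}: quality length differs from sequence length"
                )
            record = []
    if count % 4 != 0:
        problems.append(f"file has {count} lines, not a multiple of 4")
    return problems


# Usage sketch (newline="" keeps any "\r" visible to the checker):
# with open("ls454_read.fastq", newline="") as handle:
#     for problem in check_fastq(handle):
#         print(problem)
```

An empty result means none of these particular problems were found. It does not guarantee that velveth will accept the file.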

homopolymer commented 10 years ago

I tried your awk command, and it found no errors in my fastq file.

By the way, I cloned the Velvet code with git and compiled it with gcc48.

Best Regards

On Sep 30, 2014, at 12:27, Daniel Zerbino notifications@github.com wrote:

Did you look at the tail of your file for an empty line or corrupt data? The only test within velvet is to check that every fourth line starts with @.

Try: awk 'NR % 4 == 1 && ! /^@/ {print "Error on line "NR}' ls454_read.fastq


dzerbino commented 10 years ago

Any chance you can send me a copy of the file? Or else a complete stack trace (as explained at the end of the manual)?

homopolymer commented 10 years ago

Sure, it is in the attachment. It is a very small dataset of 108 reads extracted from human resequencing data. I use it to check that the assembler works on my computer before moving on to a large file.

Original Message
Sender: Daniel Zerbino notifications@github.com
Recipient: dzerbino/velvet velvet@noreply.github.com
Cc: Feng Zeng zeng.bupt@gmail.com
Date: Wednesday, Oct 1, 2014 22:26
Subject: Re: [velvet] "not seem to be in FastQ format" error (#29)

Any chance you can send me a copy of the file? Or else a complete stack trace (as explained at the end of the manual)?


dzerbino commented 10 years ago

Sorry, GitHub does not store the attachment. Could you please send it directly to me (zerbino at ebi dot ac dot uk)?

dzerbino commented 9 years ago

Could not be reproduced; maybe an end-of-line difference between machines.
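The end-of-line hypothesis was never confirmed in this thread, but it is easy to test on a given file. The sketch below counts the line-ending styles in the raw bytes and converts them to Unix style. The function names `count_line_endings` and `to_unix_line_endings` are hypothetical, and whether velveth rejects files with carriage returns is an assumption, not something stated above.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), old Mac (CR) and Unix (LF) line endings."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # carriage returns not followed by "\n"
    lf = data.count(b"\n") - crlf   # newlines not preceded by "\r"
    return {"CRLF": crlf, "CR": cr, "LF": lf}


def to_unix_line_endings(data: bytes) -> bytes:
    """Rewrite CRLF and bare CR line endings as LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Usage sketch: read the file in binary mode so nothing is translated.
# with open("ls454_read.fastq", "rb") as handle:
#     print(count_line_endings(handle.read()))
```

If the counts show anything other than LF, writing the converted bytes to a new file and rerunning velveth on it would show whether line endings were the cause.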