jts / sga

de novo sequence assembler using string graphs
http://genome.cshlp.org/content/22/3/549

abort before outputting results #18

Closed dbrami closed 12 years ago

dbrami commented 12 years ago

Hi,

I got the following error:

cmd-> terminate called after throwing an instance of 'jellyfish::fastq_hash::ErrorReading'
what(): Bad file type '
/bin/sh: line 1: 29946 Aborted /bioinformatics/asm/bio_bin/Quake/bin/jellyfish qdump -c filelist-un.dbm > filelist-un.txt.qcts
Optimization of distribution likelihood function to choose k-mer cutoff failed. Very likely you have set the value of k too high or not provided adequate coverage (>15x). Inspect the k-mer counts for a clear separation of the error and true k-mer distributions.

real 26m49.410s
user 76m51.239s
sys 2m45.870s

[1]+ Exit 1 time /bioinformatics/asm/bio_bin/Quake/bin/quake.py -f filelist-un.txt -k 15 -p ${CPUS} -q 33

and the kmer counts file was empty:

33G -rw-rw-r-- 1 dbrami employees 33G May 7 16:38 filelist-un.dbm
0 -rw-rw-r-- 1 dbrami employees 0 May 7 16:38 kmers.txt
0 -rw-rw-r-- 1 dbrami employees 0 May 7 16:38 filelist-un.txt.qcts
0 -rw-rw-r-- 1 dbrami employees 0 May 7 16:38 r.log

Any suggestions on what to do other than try with a lower k value?

jts commented 12 years ago

Hi Daniel,

This looks like a Quake error, not SGA. I'm afraid I can't help you with this.

Jared

dbrami commented 12 years ago

I am so sorry, you are right. I am flooded with program debugging issues and am cross-posting them by mistake. SGA has been very good to me, but I am still trying to get better results out of it for metagenomic assemblies. I am also trying to see if I can speed up SGA assemblies by using the Quake error correction (provided it completes) in lieu of the SGA one. I find that the bulk of the assembly time is spent in the up-front error correction and read trimming...

jts commented 12 years ago

I guess that most of the error correction time is spent in the sga index step - you may wish to try the '-a bcr' option, which is much faster than the default. Right now it only works if your reads are the same length but I will change this soon.

jared
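
Since the '-a bcr' option is described above as working only when all reads are the same length, a quick check before indexing may save a failed run. The following is a minimal sketch, not part of SGA: the helper names `reads_same_length` and `index_command` are hypothetical, the input is assumed to be an uncompressed FASTQ file with exactly four lines per record, and the only SGA option used is the '-a bcr' flag mentioned in the comment above.

```python
def reads_same_length(fastq_path):
    """Return True if every read in a 4-line-per-record FASTQ has the same length.

    Assumption: the file is uncompressed and not line-wrapped.
    An empty file returns True (there is no length mismatch to report).
    """
    expected = None
    with open(fastq_path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 1:  # the sequence is the second line of each record
                continue
            length = len(line.strip())
            if expected is None:
                expected = length
            elif length != expected:
                return False
    return True


def index_command(fastq_path):
    """Build an 'sga index' argument list, adding '-a bcr' only for uniform-length reads."""
    cmd = ["sga", "index"]
    if reads_same_length(fastq_path):
        cmd += ["-a", "bcr"]
    cmd.append(fastq_path)
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)`; when read lengths differ, the sketch simply falls back to the default indexing algorithm.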