adigenova / wengan

An accurate and ultra-fast hybrid genome assembler
GNU Affero General Public License v3.0

make: *** [myse.SPolished.asm.wengan.fasta] Error 134 #33

Closed desmodus1984 closed 3 years ago

desmodus1984 commented 3 years ago

Hi, I am trying to use Wengan with the following dataset: ~65X of 100 bp short reads and ~15X of Nanopore reads, with an N50 of ~3 kb and read lengths up to 400 kbp.

My run had the following configuration:

    singularity exec /fs/scratch/PHS0338/wengan_v0.2.sif perl ${WENGAN} \
      -x ontlon \
      -a M \
      -s reads_1.fq.gz,reads_2.fq.gz \
      -l RataCor100.fasta.gz \
      -p mzz -t 48 -g 2500

and the resulting SPolished.asm.wengan.fasta was only 192 MB. My genome size is about 2.5 Gb, and I gave the job 740 GB of memory and 48 cores.

Could you please help me decide which parameter configuration I should use? I ran a MaSuRCA assembly with the same files and got a final FASTA of 900 MB.

Also, I ran another assembly with -x ontlon and -a A, and I got the following error messages:

    [L::iupac2bases] A total of 18 bases were changed.
    make: *** [myse.SPolished.asm.wengan.fasta] Error 134

Thanks;

adigenova commented 3 years ago

Hi,

We have not tested Wengan with 2x100 bp reads, so I recommend working on the short-read assembly first. Since you have a big-memory machine, you should also try WenganD (-a D). WenganD uses DiscovarDenovo, which is the short-read assembler that makes the best use of the paired-end information.

The N50 of your long-read data seems a bit short for ONT. Did you error-correct the long reads? If so, it is better to give the raw long reads to Wengan. Moreover, set -N to 3 to deal with the low long-read coverage (15X).

WenganA fails because abyss is configured to use a k-mer size of 96 by default and your short reads are just 100 bp long, so it is likely that the short-read assembly generated by abyss is not correct. An alternative is to use Minia3 with the k-mers 31,61,91 via the run_minia.pl script (replace those k-mers in this line). You can then pass the resulting Minia3 assembly to Wengan using -c together with -a M. If the resulting short-read assembly is really fragmented (N50 ~3-5 kb), you should also set the -M option to 1000.
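To make the k-mer argument concrete, here is a rough sketch of the arithmetic. It uses the standard idealized relation that a read of length L contains L - k + 1 k-mers, so k-mer coverage is roughly base coverage times (L - k + 1) / L, ignoring sequencing errors and uneven coverage. The function names are hypothetical helpers written for this illustration and are not part of Wengan, abyss, or Minia3. The `n50` helper is a plain N50 calculation that could be applied to contig lengths when judging whether a short-read assembly is as fragmented as described above.

```python
def kmers_per_read(read_len: int, k: int) -> int:
    """Number of k-mers contained in one read of length read_len."""
    return max(read_len - k + 1, 0)


def kmer_coverage(base_cov: float, read_len: int, k: int) -> float:
    """Idealized k-mer coverage: base coverage scaled by (L - k + 1) / L.

    Assumes error-free reads and uniform coverage, so real values are lower.
    """
    return base_cov * kmers_per_read(read_len, k) / read_len


def n50(lengths) -> int:
    """N50: length of the contig at which the running sum of lengths,
    taken from longest to shortest, reaches half of the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


if __name__ == "__main__":
    # Values from this thread: ~65X of 100 bp short reads.
    # k = 96 is the abyss default mentioned above; 31, 61, 91 are the
    # k-mers suggested for Minia3.
    for k in (96, 91, 61, 31):
        print(k, kmers_per_read(100, k), kmer_coverage(65, 100, k))
```

Under these assumptions, a 100 bp read yields only 5 k-mers at k = 96, so the 65X of base coverage shrinks to a few X of k-mer coverage, which is consistent with the abyss short-read assembly being unreliable at that k.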

Best, Alex

adigenova commented 3 years ago

I'm closing this issue due to a lack of feedback, feel free to reopen if you have further questions. Best, Alex