Closed xiaoyezao closed 5 years ago
Hello, I ran into the same issue. Could you give us some suggestions on how to adjust the settings? Thanks, Fuyou
The kmer distribution from xiaoyezao looks OK. If the assembly is bad, please shift the kmers left by increasing -p to 21.
Hello, I really appreciate your work on this software. It is much faster than CANU and FALCON. However, I find it difficult to choose suitable parameters. I used the following command:
~/DIRECTORY/wtdbg2/wtdbg2 -i pb.fasta -t 0 -o str1 -x sq -p 0 -k 15 -AS 2 -s 0.05 -L 1000 -e 1 --edge-min 2 --rescue-low-cov-edges 2>str1.assembly.log
My genome is about 1.8 Gb with a heterozygosity rate of about 2.4%. I used about 40x PacBio reads. I then got the following kmer frequency distribution:
** Kmer Frequency **
|||||||
||||||||||
|||||||||||||
||||||||||||||||
||||||||||||||||||
||||||||||||||||||||||
||||||||||||||||||||||||
|||||||||||||||||||||||||||
||||||||||||||||||||||||||||||
|||||||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||||||||||||||||
|||||||||||||||||||||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
** 1 - 201 **
Quatiles:
   10%   20%   30%   40%   50%   60%   70%   80%   90%   95%
    31    46    65    91   131   198   328   684  2421  7832
PROC_STAT(0) : real 2172.201 sec, user 7537.710 sec, sys 177.170 sec, maxrss 19098860.0 kB, maxvsize 22911596.0 kB
[Tue Mar 26 14:10:41 2019] - high frequency kmer depth is set to 7915
[Tue Mar 26 14:10:42 2019] - Total kmers = 268432957
[Tue Mar 26 14:10:42 2019] - average kmer depth = 74
[Tue Mar 26 14:10:42 2019] - 3554 low frequency kmers (<2)
[Tue Mar 26 14:10:42 2019] - 52000 high frequency kmers (>7915)
[Tue Mar 26 14:10:42 2019] - indexing 268377403 kmers, 19960619941 instances (at most)
The resulting assembly is much worse than the one I get from smartdenovo with CANU-corrected reads. In addition, I find that increasing -p shortens the run time but gives a worse assembly. Given this, could you suggest how to adjust -p and -k? I find them very tricky to tune. I also assembled another, smaller genome of about 400 Mb. I get a good assembly with -p 0 -k 15, but if I change -k 15 to -k 17, the assembly is much worse. Thanks, Fuyou
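For anyone comparing runs with different -p and -k values, it can help to pull the kmer statistics out of the log instead of reading them by eye. The following is a minimal sketch, not part of wtdbg2: the `PATTERNS` table and `parse_kmer_stats` are hypothetical names, and the regular expressions assume the log wording shown in the post above (timestamps removed).

```python
import re

# Log lines copied from the post above, with the timestamps removed.
LOG = """\
high frequency kmer depth is set to 7915
Total kmers = 268432957
average kmer depth = 74
3554 low frequency kmers (<2)
52000 high frequency kmers (>7915)
indexing 268377403 kmers, 19960619941 instances (at most)
"""

# Hypothetical field names; the patterns assume the wording seen in this log.
PATTERNS = {
    "total_kmers": r"Total kmers = (\d+)",
    "average_kmer_depth": r"average kmer depth = (\d+)",
    "low_freq_kmers": r"(\d+) low frequency kmers",
    "high_freq_kmers": r"(\d+) high frequency kmers",
    "indexed_kmers": r"indexing (\d+) kmers",
    "indexed_instances": r"kmers, (\d+) instances",
}


def parse_kmer_stats(log_text):
    """Return the integer fields found in a wtdbg2-style log excerpt."""
    stats = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        if match:
            stats[name] = int(match.group(1))
    return stats


stats = parse_kmer_stats(LOG)

# Kmers left after dropping the low- and high-frequency ones.
kept = stats["total_kmers"] - stats["low_freq_kmers"] - stats["high_freq_kmers"]

# Upper bound on instances per indexed kmer (the log says "at most").
instances_per_kmer = stats["indexed_instances"] / stats["indexed_kmers"]

print(stats)
print("kmers kept after filtering:", kept)
print(f"instances per indexed kmer: {instances_per_kmer:.1f}")
```

In this log the number of indexed kmers equals the total minus the low- and high-frequency kmers, and the instances-per-kmer ratio is close to the reported average depth of 74, so the figures are at least internally consistent.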
Wtdbg2 provides presets for setting the parameters. In your case, please first try -x sq -g 1.8g. I am not sure that 40X sq data can produce a good assembly for a genome with a 2.4% heterozygosity rate.
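As an illustration of the preset-based invocation suggested here, the sketch below assembles the command line without running it. The helper name `wtdbg2_preset_command` is hypothetical; the input file, output prefix, and thread count are taken from the earlier post, and `-fo` (force-overwrite output prefix) is used as in the later post in this thread. It assumes a `wtdbg2` binary on the PATH.

```python
import shlex


def wtdbg2_preset_command(reads, prefix, preset="sq", genome_size="1.8g",
                          threads=0, binary="wtdbg2"):
    """Build a wtdbg2 argument list that relies on a preset (-x) and the
    genome size (-g) instead of hand-tuned -p/-k values."""
    return [
        binary,
        "-x", preset,        # preset, e.g. sq for PacBio Sequel reads
        "-g", genome_size,   # approximate genome size
        "-i", reads,         # input reads
        "-t", str(threads),  # 0 means use all cores
        "-fo", prefix,       # output prefix, overwriting existing files
    ]


cmd = wtdbg2_preset_command("pb.fasta", "str1")
print(shlex.join(cmd))

# To actually run it (assumption: wtdbg2 is installed), something like:
#   import subprocess
#   with open("str1.assembly.log", "w") as log:
#       subprocess.run(cmd, check=True, stderr=log)
```

Passing a different `preset` value builds the corresponding command for another data type; only the `-x` argument changes.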
Hello, thank you very much for your suggestions. I assembled six plant genomes using PacBio raw reads. One of them is the genome I mentioned, with about 40x data; the other five have about 80x data. However, the wtdbg2 results are no better than those from smartdenovo with corrected data. I do not know the reason. Thanks, Fuyou
For corrected reads, please use -x ccs. wtdbg2 is faster, but it sometimes gives less contiguity than smartdenovo. However, I won't update smartdenovo anymore.
Hi, I am running wtdbg2 on a 1.4 Gb plant genome. We have about 100x PacBio Sequel data. I used the default settings: wtdbg2 -x sq -t 0 -i *.gz -fo rheum, and got the following output:
It seems that I need to adjust the settings. Could you please give some suggestions?
Thank you!