ruanjue / smartdenovo

Ultra-fast de novo assembler using long noisy reads
GNU General Public License v3.0

Final Assembly File (*.cns) is not generated #14

Closed VivekTodur closed 6 years ago

VivekTodur commented 6 years ago

Hi,

A successful assembly should generate a prefix.cns file in the same folder, but in my case the file is not there. The complete log is below; could you please help me understand what went wrong?

PS: My input is PacBio Sequel data in FASTA format (already quality-filtered).

$ make -f TestSample.mak
make: Warning: File `TestSample.mak' has modification time 18 s in the future
~/smartdenovo/wtpre -J 5000 Reads.fasta | gzip -c -1 > TestSample.fa.gz
~/smartdenovo/wtzmo -t 10 -i TestSample.fa.gz -fo - -k 17 -s 200 -m 0.6 | cut -f1-16 > TestSample.zmo.ovl.short
[Fri Jan 19 12:47:16 2018] loading long reads
[Fri Jan 19 12:48:39 2018] Done, 389105 reads (length >= 0)
[Fri Jan 19 12:48:41 2018] sorted sequences by length dsc
[Fri Jan 19 12:48:41 2018] calculating overlaps, 10 threads
[Fri Jan 19 12:48:41 2018] indexing 1/1
[Fri Jan 19 12:48:41 2018] - scanning kmers (17 bp) 389105 reads
[Fri Jan 19 12:52:26 2018] - high frequency kmer depth is set to 155
[Fri Jan 19 12:52:26 2018] - average kmer depth = 31
[Fri Jan 19 12:52:26 2018] - 402525 high frequency kmers (>=155)
[Fri Jan 19 12:52:26 2018] - indexing 20926250 kmers 389105 reads
[Fri Jan 19 12:55:53 2018] Done
[Fri Jan 19 12:55:53 2018] querying 1/1 000000025600 320978 000000025800 322253 000000025900 323051 000000130400 715924 000000137500 727353^[^R progress: 389105 833476 100.00%, 105970.32 CPU seconds
[Fri Jan 19 15:45:47 2018] Done
~/smartdenovo/wtgbo -t 10 -i TestSample.fa.gz -j TestSample.zmo.ovl.short -fo - | cut -f1-16 > TestSample.zmo.gbo.short
[Fri Jan 19 15:45:47 2018] loading reads
[Fri Jan 19 15:47:00 2018] Done, 389105 reads
[Fri Jan 19 15:47:00 2018] No obt information

[Fri Jan 19 15:47:00 2018] iteration 1
[Fri Jan 19 15:47:00 2018] loading alignments loaded 833476 overlaps building edges 380774 fine overlaps
[Fri Jan 19 15:47:02 2018] Done
[Fri Jan 19 15:47:02 2018] calculating edge coverage ...
[Fri Jan 19 15:47:03 2018] removed 30 duplicate edges
[Fri Jan 19 15:47:03 2018] Done
[Fri Jan 19 15:47:03 2018] masked 85365 contained reads
[Fri Jan 19 15:47:03 2018] masked 71350 low coverage (<1) edges
[Fri Jan 19 15:47:03 2018] 'best_overlap' cut 400653 non-best edges
[Fri Jan 19 15:47:03 2018] graph based overlapping
[Fri Jan 19 16:09:58 2018] 389105
[Fri Jan 19 16:09:58 2018] 281326 candidates
[Fri Jan 19 16:09:58 2018] Done, 44582 new overlaps
[Fri Jan 19 16:09:58 2018] anchoring based overlapping
[Fri Jan 19 16:18:53 2018] 389105
[Fri Jan 19 16:18:53 2018] 45340 candidates
[Fri Jan 19 16:18:53 2018] Done, 9790 new overlaps

[Fri Jan 19 16:18:53 2018] iteration 2
[Fri Jan 19 16:18:53 2018] bulding edges building edges 435146 fine overlaps
[Fri Jan 19 16:18:53 2018] Done
[Fri Jan 19 16:18:53 2018] calculating edge coverage ...
[Fri Jan 19 16:18:53 2018] removed 49433 duplicate edges
[Fri Jan 19 16:18:53 2018] Done
[Fri Jan 19 16:18:53 2018] masked 85482 contained reads
[Fri Jan 19 16:18:53 2018] masked 52618 low coverage (<1) edges
[Fri Jan 19 16:18:53 2018] 'best_overlap' cut 489099 non-best edges
[Fri Jan 19 16:18:53 2018] graph based overlapping
[Fri Jan 19 16:19:39 2018] 389105
[Fri Jan 19 16:19:39 2018] 18555 candidates
[Fri Jan 19 16:19:39 2018] Done, 598 new overlaps
[Fri Jan 19 16:19:39 2018] anchoring based overlapping
[Fri Jan 19 16:19:41 2018] 389105
[Fri Jan 19 16:19:41 2018] 0 candidates
[Fri Jan 19 16:19:41 2018] Done, 0 new overlaps

[Fri Jan 19 16:19:41 2018] iteration 3
[Fri Jan 19 16:19:41 2018] bulding edges building edges 435744 fine overlaps
[Fri Jan 19 16:19:41 2018] Done
[Fri Jan 19 16:19:41 2018] calculating edge coverage ...
[Fri Jan 19 16:19:42 2018] removed 50023 duplicate edges
[Fri Jan 19 16:19:42 2018] Done
[Fri Jan 19 16:19:42 2018] masked 85482 contained reads
[Fri Jan 19 16:19:42 2018] masked 52612 low coverage (<1) edges
[Fri Jan 19 16:19:42 2018] 'best_overlap' cut 490258 non-best edges
[Fri Jan 19 16:19:42 2018] graph based overlapping
[Fri Jan 19 16:19:44 2018] 389105
[Fri Jan 19 16:19:44 2018] 89 candidates
[Fri Jan 19 16:19:44 2018] Done, 3 new overlaps
[Fri Jan 19 16:19:44 2018] anchoring based overlapping
[Fri Jan 19 16:19:46 2018] 389105
[Fri Jan 19 16:19:46 2018] 0 candidates
[Fri Jan 19 16:19:46 2018] Done, 0 new overlaps

[Fri Jan 19 16:19:46 2018] iteration 4
[Fri Jan 19 16:19:46 2018] bulding edges building edges 435747 fine overlaps
[Fri Jan 19 16:19:46 2018] Done
[Fri Jan 19 16:19:46 2018] calculating edge coverage ...
[Fri Jan 19 16:19:47 2018] removed 50026 duplicate edges
[Fri Jan 19 16:19:47 2018] Done
[Fri Jan 19 16:19:47 2018] masked 85482 contained reads
[Fri Jan 19 16:19:47 2018] masked 52612 low coverage (<1) edges
[Fri Jan 19 16:19:47 2018] 'best_overlap' cut 490264 non-best edges
[Fri Jan 19 16:19:47 2018] graph based overlapping
[Fri Jan 19 16:19:48 2018] 389105
[Fri Jan 19 16:19:48 2018] 0 candidates
[Fri Jan 19 16:19:48 2018] Done, 0 new overlaps
[Fri Jan 19 16:19:48 2018] anchoring based overlapping
[Fri Jan 19 16:19:50 2018] 389105
[Fri Jan 19 16:19:50 2018] 0 candidates
[Fri Jan 19 16:19:50 2018] Done, 0 new overlaps
~/smartdenovo/wtclp -i TestSample.zmo.ovl.short -i TestSample.zmo.gbo.short -fo TestSample.zmo.obt -F -d 2
[Fri Jan 19 16:19:51 2018] loading alignments
[Fri Jan 19 16:19:53 2018] 922385
[Fri Jan 19 16:19:53 2018] Done, 243517 reads, 922385 overlaps
[Fri Jan 19 16:19:53 2018] clipping based on overlap depth Before: legal overlaps = 430385 After: legal overlaps = 133017
[Fri Jan 19 16:19:54 2018] Done
[Fri Jan 19 16:19:54 2018] iteration 1 2676 reads were filtered by connection-checking 278 reads were truncated by chimera-checking legal overlaps = 135341
[Fri Jan 19 16:19:54 2018] iteration 2 4 reads were filtered by connection-checking 11 reads were truncated by chimera-checking legal overlaps = 135387
[Fri Jan 19 16:19:54 2018] iteration 3 0 reads were filtered by connection-checking 0 reads were truncated by chimera-checking legal overlaps = 135387
[Fri Jan 19 16:19:54 2018] Done

== Message for debug == Sequence coverage statistic: 1046073 10824668 16129060 16678696 13701160 9843146 6814368 4611437 3192178 2359773 1663221 1197984 962899 740646 677049 652907 564440 538527 466340 412399 420808 370740 321274 308875 283371 255599 232553 245738 239084 226107 202768 188779 176133 167918 182581 155351 143535 131581 137333 127253 119882 120116 114681 113873 92210 109550 99129 91885 99879 94600 101932 91627 84404 83529 92875 86413 76855 69795 62975 66319 63799 66225 68975 69673 65977 68269 58591 56706 61120 57096 50637 60998 53594 52469 54132 54345 50195 48843 41447 46475 47198 52709 49962 52236 42375 47926 49525 49452 46554 45652 41429 36299 35484 35829 41193 41163 31880 27408 29407 34106

Total aviable sequences: 961524040 bp
Average Coverage(?): 4
Genome Size(?): 240381010 bp

[Fri Jan 19 16:19:54 2018] output
[Fri Jan 19 16:19:54 2018] Done
~/smartdenovo/wtlay -i TestSample.fa.gz -b TestSample.zmo.obt -j TestSample.zmo.ovl.short -j TestSample.zmo.gbo.short -fo TestSample.zmo.lay -s 200 -m 0.6 -R -r 1 -c 1
[Fri Jan 19 16:19:54 2018] loading reads
[Fri Jan 19 16:20:54 2018] Done, 389105 reads
[Fri Jan 19 16:20:54 2018] loading reads obt information
[Fri Jan 19 16:20:54 2018] Done
[Fri Jan 19 16:20:54 2018] loading alignments loaded 166589 overlaps building edges 116045 fine overlaps
[Fri Jan 19 16:20:55 2018] Done
[Fri Jan 19 16:20:55 2018] calculating edge coverage ...
[Fri Jan 19 16:20:56 2018] removed 444 duplicate edges
[Fri Jan 19 16:20:56 2018] Done
[Fri Jan 19 16:20:56 2018] masked 27258 contained reads
[Fri Jan 19 16:20:56 2018] masked 14912 low coverage (<1) edges
[Fri Jan 19 16:20:56 2018] 'best_overlap' cut 100267 non-best edges 1951 tips, 203 bubbles, 2 chimera, 187 non-bog, 0 recoveries
[Fri Jan 19 16:20:56 2018] repair 2341 bog elements 67 tips, 1 bubbles, 0 chimera, 5 non-bog, 0 recoveries
[Fri Jan 19 16:20:57 2018] repair 73 bog elements 7 tips, 0 bubbles, 0 chimera, 0 non-bog, 0 recoveries
[Fri Jan 19 16:20:57 2018] repair 7 bog elements 1 tips, 0 bubbles, 0 chimera, 0 non-bog, 0 recoveries
[Fri Jan 19 16:20:57 2018] repair 1 bog elements 0 tips, 0 bubbles, 0 chimera, 0 non-bog, 0 recoveries
[Fri Jan 19 16:20:57 2018] generated 208703 unitigs
[Fri Jan 19 16:20:57 2018] recovered 878 edges inter unitigs 724 tips, 17 bubbles, 0 chimera, 15 non-bog, 0 recoveries
[Fri Jan 19 16:20:57 2018] repair 756 bog elements 6 tips, 1 bubbles, 0 chimera, 1 non-bog, 0 recoveries
[Fri Jan 19 16:20:57 2018] repair 8 bog elements 0 tips, 0 bubbles, 0 chimera, 0 non-bog, 0 recoveries
[Fri Jan 19 16:20:58 2018] generated 209101 unitigs
[Fri Jan 19 16:20:58 2018] recover 62 edges inter unitigs
[Fri Jan 19 16:21:33 2018] output 329 independent unitigs
[Fri Jan 19 16:22:02 2018] Done
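A quick way to see which pipeline stages actually ran is to pull the tool names out of the log. The following is only a rough sketch: `stages_run` is a hypothetical helper name, `run.log` is an assumed file holding the saved make output, and it assumes every stage is invoked by a path ending in `/wt<name>`, as in the log above.

```shell
#!/bin/sh
# Hypothetical helper: list the SMARTdenovo tools (wtpre, wtzmo, ...) that
# appear as commands in a saved make log, in order of first appearance.
# Assumption: each stage is invoked via a path ending in /wt<name>.
stages_run() {
    grep -oE '/wt[a-z]+' |   # keep only the "/wtxxx" fragments
        tr -d '/' |          # drop the leading slash
        awk '!seen[$0]++'    # de-duplicate while preserving order
}

# Usage (run.log is an assumed file name):
#   make -f TestSample.mak 2>&1 | tee run.log
#   stages_run < run.log
```

For the log in this post, the commands shown are wtpre, wtzmo, wtgbo, wtclp and wtlay; no wtcns command appears, so the run stopped at the layout stage.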

ruanjue commented 6 years ago

Sorry for the late reply.

You need to run wtcns to generate the consensus sequences from the .lay file. Also have a look at the -c 1 option of SMARTdenovo.pl.
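To check whether an existing makefile already includes the consensus stage, something like the sketch below could be used. `has_consensus_step` is a hypothetical helper, and the regeneration command in the comment follows the form I recall from the SMARTdenovo README (`-p` for the output prefix); please confirm it against the script's own usage message before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given makefile mentions wtcns,
# i.e. it was generated with the consensus stage enabled.
has_consensus_step() {
    grep -q 'wtcns' "$1"
}

mak="${1:-TestSample.mak}"   # makefile name taken from the log above
if [ -f "$mak" ]; then
    if has_consensus_step "$mak"; then
        echo "$mak: wtcns step present"
    else
        echo "$mak: no wtcns step; regenerate the makefile with -c 1"
        # Assumed invocation, verify with the script's usage message:
        #   smartdenovo.pl -p TestSample -c 1 Reads.fasta > TestSample.mak
        #   make -f TestSample.mak
    fi
fi
```

Any other options passed when the makefile was first generated would need to be repeated when regenerating it.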