Dear Ruanjue,
In the log I noticed the following:
[Fri Feb 14 22:22:02 2020] TOT 1321464576, CNT 51873, AVG 25475, MAX 853248, N50 71936, L50 4603, N90 11520, L90 22412, Min 512
[Fri Feb 14 22:22:03 2020] Estimated: TOT 1372882944, CNT 28962, AVG 47403, MAX 2028544, N50 145408, L50 2312, N90 19456, L90 13048, Min 1792
It looks like WTDBG2 estimates the assembly size to be ~1.37 Gb. Given our current low coverage (~14X), this seems like a pretty good estimate compared to the cytology data of 1.53 Gb. However, the actual bases inside the *.cns.fa file add up to only 1196875092 bp.
So is the explanation that some of these contigs are actually repeat edges in the assembly graph, and they are only represented once in the final assembly file? If so, is there a way to know which contigs are likely contracted repeats?
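For reference, the base count quoted above can be reproduced with something like the following. This is only a minimal sketch: it assumes a plain (uncompressed) FASTA file, the function names `fasta_lengths`, `nx`, and `summarize` are my own, and it uses the textbook N50/L50 definitions, which may not match exactly how WTDBG2 computes or rounds the values in its log.

```python
import io


def fasta_lengths(handle):
    """Yield the sequence length of each record in a FASTA stream."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def nx(lengths, fraction):
    """Return (Nx, Lx): the contig length and contig count at which the
    cumulative sum of lengths, longest first, reaches `fraction` of the total."""
    ordered = sorted(lengths, reverse=True)
    target = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    return 0, 0


def summarize(handle):
    """Compute summary statistics with the same labels as the log lines."""
    lengths = list(fasta_lengths(handle))
    n50, l50 = nx(lengths, 0.5)
    n90, l90 = nx(lengths, 0.9)
    return {
        "TOT": sum(lengths),
        "CNT": len(lengths),
        "MAX": max(lengths, default=0),
        "MIN": min(lengths, default=0),
        "N50": n50,
        "L50": l50,
        "N90": n90,
        "L90": l90,
    }


if __name__ == "__main__":
    # Tiny in-memory example; for a real file, pass an open file handle
    # instead, e.g. open("assembly.cns.fa") (hypothetical file name).
    demo = io.StringIO(">a\nACGTACGT\nAC\n>b\nACGT\n>c\nAC\n")
    print(summarize(demo))
```

Running this on the *.cns.fa file and comparing its TOT with the two TOT values in the log is how I arrived at the discrepancy described above.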
Thank you so much for this great software!