shehongbing closed this issue 4 years ago
Which NextDenovo version did you use? What are the genome size, heterozygosity rate, and repeat content? I do not suggest using corrected data, because NextDenovo corrects the raw data itself and filters out some low-quality or unusable seeds. BTW, could you provide the collinearity plots?
How about the assembly result using canu?
Figure 1 is the raw data; Figure 2 is the corrected data.
Figure 1
Figure 2
On December 26, 2019, at 7:01 PM, Hu Jiang notifications@github.com wrote:
Which NextDenovo version did you use? What are the genome size, heterozygosity rate, and repeat content? I do not suggest using corrected data, because NextDenovo corrects the raw data itself and filters out some low-quality or unusable seeds. BTW, could you provide the collinearity plots?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Nextomics/NextDenovo/issues/44?email_source=notifications&email_token=ALZETBNF2HJIWH7TZQUML4TQ2SFHZA5CNFSM4J7KD762YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEHVNK5A#issuecomment-569038196, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALZETBMR2MG4UGB7WKEE5WTQ2SFHZANCNFSM4J7KD76Q.
I cannot see the figures. If you have trouble uploading them to GitHub, you can send them to my email: huj_at_grandomics.com. BTW, could you provide the config file and the assembly log (the log of the last step)?
Hi, Dr. Hu
I sent it to you a few minutes ago.
Your seed_cutoff looks very short. How much data did you use for the assembly?
My raw data is about 40 G.
[Read length stat]
Types    Count (#)    Length (bp)
N10      47170        73938
N20      110964       59801
N30      187662       50562
N40      277752       43236
N50      382891       37077
N60      505656       31670
N70      650141       26749
N80      823069       22065
N90      1037363      17256

Types       Count (#)    Bases (bp)      Depth (X)
Raw         1503768      42060488746     42.92
Filtered    0            0               0.00
Clean       1503768      42060488746     42.92

*Suggested length cutoff of reads (genome size: 980000000, expected seed depth: 40) to be corrected: 15170 bp
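For readers following along, the depth and Nx figures in the statistics above can be reproduced with a few lines of Python. This is only a minimal sketch, not NextDenovo's own code: the function names `depth` and `nx` are made up for illustration, and the toy read lengths in the last line are invented. Only the base count and genome size are taken from the output above.

```python
def depth(total_bases: int, genome_size: int) -> float:
    """Sequencing depth = total sequenced bases / estimated genome size."""
    return total_bases / genome_size


def nx(lengths, x):
    """Return (count, length): the number of longest reads needed to cover
    x percent of all bases, and the length of the shortest read among them."""
    total = sum(lengths)
    threshold = total * x / 100
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= threshold:
            return count, length
    raise ValueError("no reads given")


# Values reported in the statistics above.
clean_bases = 42_060_488_746
genome_size = 980_000_000
print(f"depth: {depth(clean_bases, genome_size):.2f}x")

# Toy example with invented read lengths, only to show how Nx is derived.
print(nx([10, 8, 6, 4, 2], 50))
```

With the reported numbers the depth comes out at roughly 43x, matching the "Depth (X)" column.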
Your data is not enough for assembly with the current version of NextDenovo using default options, because all default options are optimized for 60-100x Nanopore data, so it will produce an unexpected assembly result. If you still want to use NextDenovo for the assembly, you can try the option correction_options = -b and change -k to 30 in sort_options, and then rerun the whole pipeline, although I cannot guarantee that you will get a good result. You can also try other assemblers. I will release a set of preset parameters for assembly with low-depth data in the future.
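As an illustration of the suggested change, the sketch below edits option strings the way one might before writing them back into the config file. It is an assumption-laden example: the helper `set_flag` is hypothetical, the starting values "-p 15" and "-m 20g -t 15" are placeholders rather than values from this thread, and it assumes the advice means adding -b to whatever correction_options already holds. Only the option names and the flags -b and -k 30 come from the reply above.

```python
import shlex
from typing import Optional


def set_flag(options: str, flag: str, value: Optional[str] = None) -> str:
    """Return `options` with `flag` present, replacing its value if already set."""
    tokens = shlex.split(options)
    out = []
    found = False
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == flag:
            found = True
            out.append(flag)
            if value is not None:
                out.append(value)
                # Skip the old value if one follows the flag.
                if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
                    i += 1
        else:
            out.append(tok)
        i += 1
    if not found:
        out.append(flag)
        if value is not None:
            out.append(value)
    return " ".join(out)


# Placeholder starting values (assumptions, not taken from the thread).
correction_options = set_flag("-p 15", "-b")
sort_options = set_flag("-m 20g -t 15", "-k", "30")

print("correction_options =", correction_options)
print("sort_options =", sort_options)
```

After updating the config this way, the whole pipeline would need to be rerun from the start, as the reply notes.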
Thank you
When I used raw Nanopore data, the N50 of assemble.fa was about 39 Mb, but the assembly aligned poorly to the published genome. However, when I used corrected data (corrected by Canu), the N50 of assemble.fa was about 2 Mb, and the assembly aligned well to the published genome. I do not know why. Does this suggest that I should use the corrected data rather than the raw data? The two methods also give hugely different N50 values.