ibest / ARC

Assembly by Reduced Complexity (ARC)
Apache License 2.0

repetitive element detection #5

Closed: msettles closed this issue 10 years ago

samhunter commented 11 years ago

This is implemented but not tested. By default it stops running a Target if the number of reads for that Target increases by a factor of 5 or more, that is, when total_current_reads / total_previous_reads >= 5. The threshold can be set in the config like:

max_incorporation=5

It is a little difficult to manufacture a test for this.
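For illustration, the threshold check described above might look like the following minimal sketch. This is not ARC's actual code: the function name `exceeds_max_incorporation`, its arguments, and the handling of a zero previous count are assumptions made for the example. Only the ratio test and the default of 5 come from the description above.

```python
def exceeds_max_incorporation(previous_reads, current_reads, max_incorporation=5):
    """Return True if the read count grew by at least `max_incorporation`-fold.

    Hypothetical helper, not taken from ARC. It mirrors the rule
    total_current_reads / total_previous_reads >= max_incorporation.
    """
    # Assumption: with no previous reads there is no meaningful ratio,
    # so the Target is not flagged.
    if previous_reads <= 0:
        return False
    return current_reads / previous_reads >= max_incorporation


if __name__ == "__main__":
    # 100 -> 500 reads is exactly a 5x increase, so the Target would be stopped.
    print(exceeds_max_incorporation(100, 500))
    # 100 -> 499 reads stays just under the threshold.
    print(exceeds_max_incorporation(100, 499))
```

Passing the threshold as an argument keeps the sketch in line with the `max_incorporation` config setting, which may be set to a value other than 5.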

msettles commented 11 years ago

I may have one; we can look today.


Matt Settles, PhD
Bioinformatics Scientist
Director, Genomics Resources Core
Institute for Bioinformatics and Evolutionary STudies (IBEST)
University of Idaho
Moscow, Idaho 83844

samhunter commented 11 years ago

Still not tested, but implemented as above.

samhunter commented 11 years ago

It turns out that this doesn't seem to be working properly. I need to build a test case to figure out what is going on, but I have observed mitochondrial assemblies that incorporated some repetitive regions and ballooned to well over a 5x increase in reads without triggering the repetitive element detection code.
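One possible way to build such a test case without real sequencing data is to feed the check a synthetic series of read counts. The sketch below is hypothetical: the function `first_flagged_iteration` and both series are invented for illustration, and it assumes the ratio is taken between consecutive counts. It is not a diagnosis of the behaviour reported above. It only shows that, under that assumption, a sudden jump is flagged while steady 3x growth is not, even though the total increase is well over 5x.

```python
def first_flagged_iteration(read_counts, max_incorporation=5):
    """Return the index of the first count that is at least
    `max_incorporation` times the previous count, or None if none is.

    Hypothetical test helper, not part of ARC.
    """
    for i in range(1, len(read_counts)):
        previous, current = read_counts[i - 1], read_counts[i]
        # Skip comparisons where the previous count is zero (assumption).
        if previous > 0 and current / previous >= max_incorporation:
            return i
    return None


# Invented read counts for a single Target.
sudden_jump = [100, 120, 700, 900]       # 120 -> 700 is more than 5x
gradual_growth = [100, 300, 900, 2700]   # 3x per step, 27x overall

if __name__ == "__main__":
    print(first_flagged_iteration(sudden_jump))
    print(first_flagged_iteration(gradual_growth))
```

A test built this way could also compare each count against the first one, if the intent is to catch cumulative growth rather than growth between consecutive counts.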

samhunter commented 10 years ago

Working now. Tandem repeat masking has also greatly reduced this problem.