Closed: MehmetGoktay closed this 4 years ago
No, this is not possible on the web version, but you are welcome to run it locally and modify the code to your purposes.
The limits were set as described in the paper: "Assemblytics identifies all insertion and deletion variants as small as 1 bp up to a maximum of 10 kbp in size, with this maximum adjusted to match the size of the unique sequence anchor. This prevents translocations and complex variants from being interpreted as indels." You can read the rest for context at https://academic.oup.com/bioinformatics/article/32/19/3021/2196631. See the supplement in particular, as it explains what the unique sequence anchor is and why it is important.
Best of luck with your research, Maria
Hi Maria, thanks for maintaining Assemblytics. It's a great piece of intuitive software! I do have one question about the maximum variant size that isn't clear to me after reading the supplemental note. If I increase both the "Unique sequence length required" and the "Maximum variant size" to greater than 10 kb, will the results still be accurate? Or is 10 kb a safety net to minimize incorrect calls caused by repeats? Basically, I anticipate seeing SVs much larger than 10 kb in my comparison. Is Assemblytics still appropriate to use after tweaking both parameters? Are there any different nucmer parameters that you would recommend for this? Thanks, Ben
Hi @bmansfeld
Above 10kb it will still be doing the same analysis of the MUMmer alignments, but yes, I put those limits in place to avoid calling above a size where I am no longer sure the interpretation of those variants holds. For instance, very large variants are increasingly likely to be caused by misassemblies. Assemblytics does not call variants other than indels and repeat expansions/contractions, so translocations or inversions might be called something else once you go beyond the safety net. It is a safety net you are welcome to go around, but I recommend you take a closer look at the alignments that produced those variants to see if the interpretation makes sense. I made this visualization tool https://dot.sandbox.bio/ that might be helpful for that analysis.
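For anyone who raises the limit in a local run and wants to follow the advice above, here is a minimal sketch for pulling out the largest calls so their alignments can be checked by hand. It assumes a tab-separated variant table in which the variant ID is the fourth column and the size is the fifth; that layout, the function name `large_variants`, and the `size_col` default are assumptions for illustration, so check them against your own output file before relying on it.

```python
def large_variants(bed_text, min_size=10_000, size_col=4):
    """Return rows (as lists of fields) whose size is at least min_size.

    bed_text is the content of a tab-separated variant table.
    size_col is the 0-based index of the size column (an assumed layout).
    Rows are returned largest first.
    """
    rows = []
    for line in bed_text.splitlines():
        # Skip blank lines and comment/header lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        try:
            size = int(fields[size_col])
        except (IndexError, ValueError):
            # Skip rows that are too short or have a non-numeric size.
            continue
        if size >= min_size:
            rows.append(fields)
    # Sort so the largest (most suspicious) variants come first.
    rows.sort(key=lambda f: int(f[size_col]), reverse=True)
    return rows
```

Reading the file with `open(path).read()` and passing the text to `large_variants` gives a short list of calls above the 10 kb safety net, largest first. Each of these can then be inspected in a dot plot to see whether it looks like a real indel, a misassembly, or a mislabeled translocation or inversion.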
Thanks for clearing that up, Maria. I'll take a look at Dot. Stay safe! -Ben
Hi Maria,
Is there a way to set the maximum variant size to more than 100k?
Apparently this is the limit on the Assemblytics web server.
Best, Mehmet