Closed clairemerot closed 3 years ago
As they are "copygain", they should be twice in the query genome right?
Yes
Does this mean that the 1st copy of the duplicated region is present at the same position as in the reference (e.g. Chr01)
Most of the time, but not always. The first copy may itself be inverted or translocated, in which case SyRI will identify and report it as such.
+ the second position indicated for query (e.g. chr33)?
Yes
Same question for INVDUP.
Same as DUP
If you want to check the original loci of DUP regions in detail, you can try the regAnno script (in the bin folder), which I wrote for personal testing. It fetches the annotations of a genomic region. However, it only reads the intermediate files (synOut.txt, invOut.txt, etc.), and only under exactly these names. Run it in your working directory:
$ regAnno region Chr01 215099 218756
synOut.txt Chr01 1 215102 Chr01 1 215102
TLOut.txt Chr01 215099 218756 Chr01 39075877 39079534
synOut.txt Chr01 218750 307911 Chr01 218750 307911
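For illustration only, the kind of lookup shown above can be sketched as a simple interval-overlap test. This is not the regAnno script, and it does not parse the intermediate files (their format is not described here). The `Record` type and `overlapping` function are hypothetical names, and the column layout (source file, then reference chromosome/start/end, then query chromosome/start/end) is assumed from the output above. The Chr02 record is made up to show a non-match.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    # Hypothetical record layout, assumed from the regAnno output above.
    source: str      # name of the intermediate file the record came from
    ref_chr: str
    ref_start: int
    ref_end: int
    qry_chr: str
    qry_start: int
    qry_end: int


def overlapping(records: List[Record], chrom: str, start: int, end: int) -> List[Record]:
    """Return records whose reference interval overlaps [start, end].

    Coordinates are treated as 1-based and inclusive on both ends.
    """
    return [
        r for r in records
        if r.ref_chr == chrom and r.ref_start <= end and r.ref_end >= start
    ]


records = [
    Record("synOut.txt", "Chr01", 1, 215102, "Chr01", 1, 215102),
    Record("TLOut.txt", "Chr01", 215099, 218756, "Chr01", 39075877, 39079534),
    Record("synOut.txt", "Chr01", 218750, 307911, "Chr01", 218750, 307911),
    Record("synOut.txt", "Chr02", 1, 1000, "Chr02", 1, 1000),  # made-up, non-matching
]

hits = overlapping(records, "Chr01", 215099, 218756)
for r in hits:
    print(r.source, r.ref_chr, r.ref_start, r.ref_end, r.qry_chr, r.qry_start, r.qry_end)
```

With these records, the query region overlaps the two flanking syntenic blocks and the translocated block, which is how the second location of the region in the query (Chr01 39075877 to 39079534) shows up.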
Thanks a lot for your answer, I'll try to figure it out.
I'm sorry, I have been trying to figure this out but I don't get it. I would like to build an informed VCF with the reference sequence and the alternative (query) sequence for those complex rearrangements (DUP, INVDUP, TRANSINV, etc.). Is this possible? Or would you recommend against it? Or should I do it but without the rearrangements that fall on two different chromosomes? I am really sorry to bother you with this, and I really appreciate the tool plus the follow-up on GitHub. Thanks! Claire
SyRI should have generated a VCF output file that contains all structural rearrangements. Or do you need something different?
Yes, the VCF is fine, but it includes REF/ALT sequences only for small SVs. I was trying to get sequences for the larger rearrangements as well, like INV, DUP, etc. Thanks a lot
I see, thanks for clarifying. I don't think there is a direct answer to this question; it rather depends on your objectives. As far as I understand the VCF format, I think you can add inversions as ALT sequences and translocations as breakpoints, but I am not sure how duplications would work. However, in my opinion, structural rearrangements (SRs) (INV, TRANS, DUP, etc.) are regions that have the "same" sequence but at a different location or in a different orientation. So an ideal SR (without any nested small SVs) would have exactly the same sequence in both genomes, and there would be nothing like a REF vs. ALT sequence. Practically, I would guess that adding these sequences would make the VCF file unnecessarily large without adding much information. I hope this helps.
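To make the "same sequence, different orientation" point concrete, here is a minimal sketch for an inversion. It assumes you have already loaded the reference and query chromosome sequences as plain strings and know the region coordinates. The function names are hypothetical and none of this is part of SyRI. Whether the query coordinates of an inverted region are listed with start greater than end is also treated as an assumption, so the code sorts them and accepts either order.

```python
# Complement table for DNA; N and lowercase (soft-masked) bases are kept as such.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def region(seq: str, start: int, end: int) -> str:
    """Extract a 1-based, inclusive region from a sequence string."""
    return seq[start - 1:end]


def inversion_alleles(ref_seq, qry_seq, ref_start, ref_end, qry_start, qry_end):
    """Return (REF, ALT) candidate strings for an inverted region.

    ALT is the query sequence as it reads on the query forward strand.
    Query coordinates may be given in either order (assumption, see above).
    """
    lo, hi = sorted((qry_start, qry_end))
    return region(ref_seq, ref_start, ref_end), region(qry_seq, lo, hi)


# Toy sequences (made up): positions 4-8 of the query are the reverse
# complement of positions 4-8 of the reference.
ref_seq = "TTTACGGATTT"
qry_seq = "TTTTCCGTTTT"

ref_allele, alt_allele = inversion_alleles(ref_seq, qry_seq, 4, 8, 8, 4)
print(ref_allele, alt_allele)

# For an ideal inversion with no nested variants, ALT carries no information
# beyond REF: it is just the reverse complement.
print(alt_allele == reverse_complement(ref_allele))
```

In the ideal case the final comparison is true, so storing the ALT sequence only becomes informative when the inverted region also contains nested small variants.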
Hello, I want to make sure that I understand the output. For the regions that are annotated as "DUP", I have:
As they are "copygain", they should be present twice in the query genome, right? Yet here the output refers to only one position in the query genome. Does this mean that the first copy of the duplicated region is present at the same position as in the reference (e.g. Chr01), plus the second position indicated for the query (e.g. chr33)? Same question for INVDUP. Thanks for your help! Claire