mroosmalen / nanosv

SV caller for nanopore data
MIT License

Genotype information: format field in VCF #44

Closed by tgong1 6 years ago

tgong1 commented 6 years ago

Hi,

I'd like to ask why "DR" and "DV" each have two numbers, and what each number represents.

FORMAT=

FORMAT=

For example:

GT:DR:DV:GQ:HR:PL 1/1:1,0:2,2:4:0,0:109,4,0
GT:DR:DV:GQ:HR:PL 0/1:1,2:2,2:11:0,0:99,0,11
GT:DR:DV:GQ:HR:PL 1/1:0,0:3,3:15:0,0:177,15
GT:DR:DV:GQ:HR:PL 0/1:1,1:2,2:5:0,0:103,0,5

I'd also like to confirm that SVs with DV greater than the "cluster_count" set in the config.ini file will be reported in the output VCF file.

Thank you very much!

mroosmalen commented 6 years ago

Each line is a breakend/SV, and each one contains two breakpoints. The two numbers refer to the left (first) and right (second) breakpoints. For example, in the case of a deletion:


                breakpoint 1   breakpoint 2
ref genome:  =========|___________|========

                  -----            ----
alt reads:         ----            -------

              ------------
ref reads:       ----------  
                                ----------

This will be reported as DV:2,2 and DR:2,1. Breakpoint 1 (CHROM and POS) is 'supported' by 2 alt reads and 2 ref reads (the first numbers of DV and DR). Breakpoint 2 (ALT) is 'supported' by 2 alt reads and only 1 ref read (the second numbers of DV and DR).
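
To make the per-breakpoint reading concrete, here is a minimal Python sketch that splits a FORMAT/sample pair into reference and alternative read counts for each breakpoint. It is not part of NanoSV: the function names `parse_sample` and `breakpoint_support` are hypothetical, and it assumes DR and DV always hold two comma-separated integers, as in the examples in this thread.

```python
def parse_sample(format_field, sample_field):
    """Pair each FORMAT key with its value from the sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    return dict(zip(keys, values))


def breakpoint_support(format_field, sample_field):
    """Return ref/alt read counts per breakpoint.

    Assumption: DR and DV each hold two comma-separated integers,
    the first for breakpoint 1 (CHROM/POS) and the second for
    breakpoint 2 (ALT).
    """
    fields = parse_sample(format_field, sample_field)
    dr = [int(x) for x in fields["DR"].split(",")]
    dv = [int(x) for x in fields["DV"].split(",")]
    return [{"ref_reads": r, "alt_reads": v} for r, v in zip(dr, dv)]


# Second example line from the question above.
support = breakpoint_support("GT:DR:DV:GQ:HR:PL", "0/1:1,2:2,2:11:0,0:99,0,11")
for number, counts in enumerate(support, start=1):
    print(f"breakpoint {number}: {counts}")
```

For that record, breakpoint 1 has 1 ref read and 2 alt reads, and breakpoint 2 has 2 ref reads and 2 alt reads.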

tgong1 commented 6 years ago

Thank you very much for the clarification!