Open KamilSJaron opened 7 years ago
Hi, can you be more specific about the support you'd like to have for SVs?
Hi, mainly I wanted to point out that the current output is more than misleading (more details below).
I would like to see, at least, the counts of the different types of SVs for a start. Then it would be nice to see histograms of sizes per category (careful, basically every SV has a unique size, so the histograms would probably need bins). I do not have a multisample VCF file yet, so I am not sure how to summarize that, but generally the goal is to get a first glance at what is inside.
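As a rough illustration of the summary requested above, here is a minimal Python sketch, not part of bcftools. It assumes each SV record carries an `SVTYPE` key in its INFO column and, where the caller provides it, an `SVLEN` key (both are reserved INFO keys in the VCF specification, but callers do not always emit `SVLEN`, e.g. for breakends). The function names `parse_info` and `summarize_svs` are made up for this example, and sizes are binned by order of magnitude so that near-unique SV sizes still form a readable histogram.

```python
from collections import Counter, defaultdict


def parse_info(info):
    """Turn a VCF INFO string into a dict; flag-only keys map to True."""
    out = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def summarize_svs(lines):
    """Count records per SVTYPE and bin |SVLEN| by order of magnitude.

    Returns (counts, size_bins), where size_bins[svtype][k] is the number
    of records whose absolute length lies in [10**k, 10**(k+1)).
    """
    counts = Counter()
    size_bins = defaultdict(Counter)
    for line in lines:
        # Skip meta-information lines, the column header and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        info = parse_info(fields[7])
        svtype = info.get("SVTYPE", "unknown")
        counts[svtype] += 1
        svlen = info.get("SVLEN")
        if not isinstance(svlen, str):
            continue  # SVLEN missing: count the record, skip the size
        try:
            # SVLEN may be negative (deletions) or comma-separated;
            # this sketch only looks at the first value.
            size = abs(int(svlen.split(",")[0]))
        except ValueError:
            continue
        if size > 0:
            # Number of digits minus one is floor(log10(size)).
            size_bins[svtype][len(str(size)) - 1] += 1
    return counts, size_bins
```

To try it on a real file, something like `with open("calls.vcf") as fh: counts, bins = summarize_svs(fh)` should work, where `calls.vcf` is a placeholder name; a compressed file could be opened with `gzip.open(path, "rt")` instead.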
Current behaviour:
Now I have a VCF file and I would like to get a quick overview of what is inside. I know (now) that there are 143490 breakends, 1390 deletions, 883 duplications, 893 insertions and 90 inversions. Instead of these numbers, bcftools stats reports this:
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 146746
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 0
SN 0 number of MNPs: 1
SN 0 number of indels: 75478
SN 0 number of others: 1967
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
These numbers seem to be wrong. I do not even understand how they are computed. The output also shows a distribution of indels, but it does not correspond to the INS and DEL records; it appears to be derived from the BND SV type instead.
The program only classifies variants based on the REF and ALT fields, so it handles small insertions and deletions. Structural variation is not supported at all at the moment. Simple stats like that should not be difficult to add, but this is unlikely to happen anytime soon unless someone wants to contribute. Pull requests are welcome!
It seems that bcftools stats does not work properly with VCF v4.1 files (the version allowing structural variants). It would be nice to mention in the manual / readme / help page that bcftools does not work with structural variants.
It would be even nicer to implement support for SVs!