asylvz / SVarp

Phased structural variant discovery in pangenomes
MIT License

How to interpret the results? #3

Closed wnddl111 closed 5 months ago

wnddl111 commented 5 months ago

Hello,

I would like to use the files available at the following link: https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1KG_ONT_VIENNA/release/v1.0/final-vcf/final-vcf.phased.vcf.gz.

However, I am having difficulty understanding the notation used for the variant descriptions, specifically: chr1-21837-COMPLEX->s65863s1192-104.

Could you please provide an explanation or direct me to any resources that describe this format in detail?

Thank you very much for your assistance.

Best regards,

Juyoung Lee

asylvz commented 5 months ago

Hi, this is the naming scheme that we used for the variants. To be more precise, take a variant such as "chr1-21837-COMPLEX->s65863s1192-104".

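The fields are separated by hyphens, in this order: chromosome name - position relative to T2T - variant type (INS, DEL, or COMPLEX) - traversal of the variant in the graph (path) - length of the allele (in base pairs). In the example, that gives chr1 as the chromosome, 21837 as the position, COMPLEX as the variant type, >s65863s1192 as the graph path, and 104 as the allele length.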

Best, Arda
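
For anyone who wants to split these identifiers programmatically, here is a minimal Python sketch based only on the field order described above. It is not part of SVarp; the names `VariantId` and `parse_variant_id` are hypothetical, and the regular expression assumes that the position and allele length are plain integers and that the variant type is one of INS, DEL, or COMPLEX. The graph path is kept as an opaque string, since its internal syntax is not described in this thread.

```python
import re
from typing import NamedTuple


class VariantId(NamedTuple):
    """Fields of a variant ID, in the order given in the thread (hypothetical helper)."""
    chrom: str    # chromosome name
    pos: int      # position relative to T2T
    sv_type: str  # INS, DEL, or COMPLEX
    path: str     # traversal of the variant in the graph, kept verbatim
    length: int   # allele length in base pairs


# Assumption: fields are hyphen-separated, and the last field is an integer.
# The path is matched greedily so that only the final "-<digits>" is treated
# as the allele length.
_ID_PATTERN = re.compile(
    r"^(?P<chrom>.+?)-(?P<pos>\d+)-(?P<sv_type>INS|DEL|COMPLEX)"
    r"-(?P<path>.+)-(?P<length>\d+)$"
)


def parse_variant_id(variant_id: str) -> VariantId:
    """Split an ID such as 'chr1-21837-COMPLEX->s65863s1192-104' into its fields."""
    match = _ID_PATTERN.match(variant_id)
    if match is None:
        raise ValueError(f"unrecognized variant ID: {variant_id!r}")
    return VariantId(
        chrom=match["chrom"],
        pos=int(match["pos"]),
        sv_type=match["sv_type"],
        path=match["path"],
        length=int(match["length"]),
    )


if __name__ == "__main__":
    print(parse_variant_id("chr1-21837-COMPLEX->s65863s1192-104"))
```

Applied to the example from this thread, the parser should separate chr1, 21837, COMPLEX, >s65863s1192, and 104. IDs that do not follow this layout raise a `ValueError` rather than being silently misparsed.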