lgmgeo / AnnotSV

Annotation and Ranking of Structural Variation
GNU General Public License v3.0

Query on the allele frequency #188

Closed priyambial123 closed 1 year ago

priyambial123 commented 1 year ago

Hello,

The benign allele frequency columns from gnomAD and other databases either contain values starting from 0.01 or are blank. Does this mean these databases have no allele frequencies below 0.01?

Thank you

lgmgeo commented 1 year ago

Hi @priyambial123,

Actually, common variants can be considered benign; rare variants cannot.

AnnotSV lets you change the threshold at any time (AF > 0.05, i.e. 5%; AF > 0.01, i.e. 1%; …), but not to a very low frequency. That's why rare variants from gnomAD and other databases are not reported for the benign analyses.
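The thresholding idea can be sketched in a few lines of Python. This is only an illustration of the principle, not AnnotSV's actual code: the record layout `DbSV` and the function `keep_for_benign` are hypothetical names, and every record except the nstd166 deletion discussed later in this thread is made up.

```python
from typing import List, NamedTuple, Optional


class DbSV(NamedTuple):
    """A database SV record (hypothetical layout, for illustration only)."""
    chrom: str
    start: int
    end: int
    af: Optional[float]  # None when the database reports no frequency


def keep_for_benign(records: List[DbSV], min_af: float = 0.01) -> List[DbSV]:
    """Keep only SVs common enough (AF > min_af) to support a benign call."""
    return [r for r in records if r.af is not None and r.af > min_af]


records = [
    DbSV("12", 12922195, 12922462, 0.407035),  # deletion quoted in this thread
    DbSV("1", 1000, 2000, 0.0001),             # made-up ultra-rare SV
    DbSV("2", 5000, 6000, None),               # made-up SV without a frequency
]

common = keep_for_benign(records)
print(common)
```

With a cutoff of 0.01, only the common deletion survives; the ultra-rare SV and the SV without a frequency are dropped, which is why the benign columns never show values below the threshold.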

Best

priyambial123 commented 1 year ago

Thank you. So, in AnnotSV, allele frequency information is available only for the benign analyses? Allele frequency data for the rest of the structural variants are not included? I ask because I am interested in analysing rare structural variants and want to make sure whether this data is present or filtered out in AnnotSV. In gnomAD-SV there are three data sources (non-neuro, controls and SV sites). Which one is included in AnnotSV?

priyambial123 commented 1 year ago

Hi Véronique,

Is it possible to include the information for rare structural variants (allele frequency less than 0.01) from gnomAD-SV and other databases? There are also ultra-rare structural variants with an allele frequency of 0.0001. Can you advise on how to find information on structural variants with very low allele frequency?

Thank you

lgmgeo commented 1 year ago

> In GnomADSV, there are three data sources, for non-neuro, control and SV sites, which one is included in AnnotSV

Data sources: please look at the README, section "gnomAD benign SV annotations".

lgmgeo commented 1 year ago

> in AnnotSV we have information for allele frequency only for benign analyses?

Absolutely, yes. Please look at the README, section "Annotation columns available in the output file".

lgmgeo commented 1 year ago

> Is it possible to include the information for rare structural variants (allele frequency less than 0.01) from GnomAD SV and other databases. There are also ultra-rare structural variants with allele frequency 0.0001.

Users can add their own private annotations to the ones already provided by AnnotSV. Please look at the README, section "d) Custom annotations: External BED annotation files (optional)".
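As a rough sketch of how such a private annotation file could be prepared, the Python below selects rare SVs from a list of records and formats them as tab-separated BED lines (chromosome, start, end, allele frequency). Everything here is an assumption for illustration: the function name `rare_sv_bed_lines`, the input tuples and their values are made up, the start coordinates are assumed to already follow the BED convention, and the exact layout AnnotSV expects (header line, file location) should be taken from the README section cited above.

```python
from typing import List, Tuple

# (chrom, start, end, allele frequency); hypothetical, made-up records
Record = Tuple[str, int, int, float]


def rare_sv_bed_lines(records: List[Record], max_af: float = 0.01) -> List[str]:
    """Return BED-style lines for SVs with allele frequency below max_af."""
    lines = []
    for chrom, start, end, af in sorted(records):
        if af < max_af:
            # The fourth column carries the annotation (here, the AF)
            lines.append(f"{chrom}\t{start}\t{end}\t{af:g}")
    return lines


records = [
    ("1", 1000, 2000, 0.0001),  # ultra-rare
    ("3", 500, 900, 0.2),       # common, filtered out
    ("2", 100, 300, 0.004),     # rare
]

bed_lines = rare_sv_bed_lines(records)
print("\n".join(bed_lines))
```

The resulting lines can be written to a `.bed` file and supplied as an external annotation, so that rare and ultra-rare frequencies appear alongside the annotations AnnotSV already provides.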

priyambial123 commented 1 year ago

Thank you. I have another query related to the allele frequency in this benign analysis.

What percentage of overlap is required for a structural variant to be reported in B_loss_coord?

Please let me know if I understood this correctly:

For example, there is a deletion on chromosome 12 with start position 12922185 and end position 12922462. In the AnnotSV output, B_loss_source, B_loss_coord and B_loss_AF are empty for this deletion.

I downloaded the nstd166 data (gnomAD-SV) and found a deletion with start position 12922195, end position 12922462 and an allele frequency of 0.407035.

Is there a reason why this SV allele frequency is not annotated in the AnnotSV analysis above?

lgmgeo commented 1 year ago

Your SV to annotate (12:12922185-12922462) is not completely overlapped by the SV from nstd166 (12:12922195-12922462), so the SV from nstd166 is not reported in B_loss_coord.

Indeed, there is no way to classify your SV as benign using the SV from nstd166: you may imagine that 12:12922185-12922195 is a region that could be pathogenic if deleted.

cf. the README.
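The complete-overlap rule described above can be checked with a small Python sketch using the coordinates from this thread. It is an illustration of the reasoning, not AnnotSV's implementation: the function names are hypothetical, and interval length is simply taken as end minus start.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end) on the same chromosome


def fully_covered(query: Interval, db: Interval) -> bool:
    """True when the database SV spans the whole SV to annotate."""
    return db[0] <= query[0] and query[1] <= db[1]


def overlap_fraction(query: Interval, db: Interval) -> float:
    """Fraction of the SV to annotate that the database SV covers."""
    overlap = max(0, min(query[1], db[1]) - max(query[0], db[0]))
    return overlap / (query[1] - query[0])


query = (12922185, 12922462)    # SV to annotate, chromosome 12
nstd166 = (12922195, 12922462)  # gnomAD-SV deletion, AF 0.407035

covered = fully_covered(query, nstd166)
fraction = overlap_fraction(query, nstd166)
print(covered)             # False
print(round(fraction, 3))  # 0.964
```

About 96% of the query deletion is covered, but the first 10 bp (12922185-12922195) are not, so the benign SV does not completely overlap it and nothing is reported in B_loss_coord.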