fakedrtom / SVAFotate

MIT License

SVAFotate bed file resource #9

Mong-yj closed this issue 1 year ago

Mong-yj commented 1 year ago

Hello. This tool is really useful! This is not a problem with the tool, just a question.

I would like to look up variants in the source databases using their variant IDs. Are all the IDs in the bed file taken from those databases? I'd like to check them in the 1000G, gnomAD, and CCDG databases. Where can I download the databases that were used?

Thank you.

fakedrtom commented 1 year ago

I am glad that you find the tool to be useful!

Yes, the SV_IDs that are listed in the provided SVAFotate_core_SV_popAFs.GRCh38.bed.gz file should match the IDs that are used in the CCDG, gnomAD, and 1000G datasets. You should be able to find those original datasets here:

CCDG: Supplementary_File_1.zip
gnomAD: SV 2.1 sites VCF
1000G: 1KGP_3202.Illumina_ensemble_callset.freeze_V1.vcf.gz

Please note that CCDG and 1000G are aligned against GRCh38 while gnomAD is aligned against GRCh37. For the SVAFotate_core_SV_popAFs.GRCh38.bed.gz file I used liftover on the gnomAD data to convert its coordinates to GRCh38, but the IDs should be the same.
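
As a rough illustration of the lookup described above, here is a minimal Python sketch that searches the bed file and one of the original VCFs for the same ID. It makes some assumptions that are not confirmed in this thread: the function names are made up for the example, the bed file is treated as gzip-compressed, tab-separated text, and because the exact bed column layout is not stated here, the bed search simply checks every field rather than a specific column. The VCF search uses the third column, which is the ID column in the VCF format. Since the gnomAD VCF is on GRCh37, its coordinates will differ from the lifted-over bed rows, so the comparison is on the ID only.

```python
import gzip


def find_bed_rows(path, sv_id):
    """Return bed rows in which any tab-separated field equals sv_id.

    The column holding the ID is not assumed; every field is checked.
    """
    hits = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if sv_id in fields:
                hits.append(fields)
    return hits


def find_vcf_records(path, sv_id):
    """Return VCF records whose ID column (third column) contains sv_id."""
    hits = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and the column header
            fields = line.rstrip("\n").split("\t")
            # The VCF ID column may hold several IDs separated by ';'.
            if len(fields) > 2 and sv_id in fields[2].split(";"):
                hits.append(fields)
    return hits


# Example usage (the ID below is a placeholder, not a real variant ID):
# bed_rows = find_bed_rows("SVAFotate_core_SV_popAFs.GRCh38.bed.gz", "EXAMPLE_SV_ID")
# vcf_rows = find_vcf_records(
#     "1KGP_3202.Illumina_ensemble_callset.freeze_V1.vcf.gz", "EXAMPLE_SV_ID"
# )
```

Both functions scan the file line by line, which is slow for repeated lookups but keeps memory use low and avoids depending on an index.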

Mong-yj commented 1 year ago

Thank you very much for your quick and kind response. It helped a lot!