dnil closed 5 years ago
Fabulous! Will follow and include when appropriate.
Looks nice! I see they run Manta, Delly, and cnvnator, as well as some other callers, so it should match MIP very nicely. They run the pipeline presented here:
An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder.
VCF at the very bottom of the page here: https://gnomad.broadinstitute.org/downloads/
https://storage.googleapis.com/gnomad-public/papers/2019-sv/gnomad_v2_sv.sites.vcf.gz
Says "sites", but kind of hope they have frequencies there as well?
Thanks! Yes, it does have a lot of frequency information. I have tested it on a single file using SVDB as the annotation tool, and it seems to work relatively well. I will make some plots next week =P
I must add that it is a nice database, but it contains fewer than 500 000 SVs, whereas the swegen dataset contains 1 654 551 SVs and our old NGI db contains 1 044 410 SVs. So it is either filtered to contain only high-quality calls, or heavily merged.
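For anyone who wants to repeat the two checks discussed above (does the "sites" VCF declare frequency fields, and how many SV records does it hold), here is a minimal sketch using only the Python standard library. The helper names `summarize_vcf`, `summarize_vcf_file` and `frequency_like` are made up for illustration, and the assumption that frequency fields are named `AF` or end in `_AF` is a guess about the naming convention, not something confirmed in this thread; check the header of the actual file.

```python
import gzip
import re
from typing import Iterable, List, Tuple

# Matches header lines such as: ##INFO=<ID=AF,Number=A,Type=Float,...>
INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def summarize_vcf(lines: Iterable[str]) -> Tuple[List[str], int]:
    """Return (INFO IDs declared in the header, number of variant records)."""
    info_ids: List[str] = []
    n_records = 0
    for line in lines:
        if line.startswith("##"):
            match = INFO_ID.match(line)
            if match:
                info_ids.append(match.group(1))
        elif line.startswith("#"):
            continue  # the #CHROM column header line
        elif line.strip():
            n_records += 1  # any non-empty, non-header line is one record
    return info_ids, n_records


def frequency_like(info_ids: Iterable[str]) -> List[str]:
    """Pick INFO IDs that look like allele frequencies (assumed naming: AF or *_AF)."""
    return [i for i in info_ids if i == "AF" or i.endswith("_AF")]


def summarize_vcf_file(path: str) -> Tuple[List[str], int]:
    """Same as summarize_vcf, reading a gzip- or bgzip-compressed VCF from disk."""
    with gzip.open(path, "rt") as handle:
        return summarize_vcf(handle)
```

Usage would be something like `ids, n = summarize_vcf_file("gnomad_v2_sv.sites.vcf.gz")` on a local copy of the file, followed by `frequency_like(ids)`. The record count is a raw line count, so it says nothing about whether a smaller database was filtered or merged.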
Done
https://www.biorxiv.org/content/10.1101/578674v1
15k genomes with SVs from gnomAD. The annotations also seem a little more diversified towards complex variants, so expect some integration of tools to follow. Feel free to start working on it, but it is also something that I and @J35P312 will likely pursue. I couldn't find the actual data on gnomAD yet; I'm guessing it will be released when the paper is accepted?