Clinical-Genomics / MIP

Mutation Identification Pipeline. Read the latest documentation:
https://clinical-genomics.gitbook.io/project-mip/
MIT License

gnomAD-SV integration - heads up! #760

Closed: dnil closed 5 years ago

dnil commented 5 years ago

https://www.biorxiv.org/content/10.1101/578674v1 describes 15k genomes with SV calls from gnomAD. The annotations also seem to be somewhat more diversified towards complex events, so expect some tool integration to follow. Feel free to start working on it, but it is also something that @J35P312 and I will likely pursue. I couldn't find the actual data on gnomAD yet; I'm guessing it will be released when the paper is accepted?

henrikstranneheim commented 5 years ago

Fabulous! Will follow and include when appropriate.

J35P312 commented 5 years ago

Looks nice! I see they run Manta, Delly, CNVnator, and some other callers, so it should match MIP very nicely! They run the pipeline presented here:

An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder.

dnil commented 5 years ago

VCF at the very bottom of the page here: https://gnomad.broadinstitute.org/downloads/

dnil commented 5 years ago

https://storage.googleapis.com/gnomad-public/papers/2019-sv/gnomad_v2_sv.sites.vcf.gz

Says "sites", but kind of hope they have frequencies there as well?
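One way to check this without reading the whole file is to look at the `##INFO` declarations in the VCF header. The following is a minimal sketch, not something from the thread: the helper names, the regular expression used to spot frequency-like IDs, and the tiny example header are all assumptions made up for illustration and are not copied from the gnomAD file.

```python
import gzip
import re

INFO_ID = re.compile(r'^##INFO=<ID=([^,>]+)')


def info_ids(header_lines):
    """Return the IDs declared in ##INFO header lines, in file order."""
    ids = []
    for line in header_lines:
        if not line.startswith("#"):
            break  # the header ends at the first record
        match = INFO_ID.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def frequency_like(ids):
    """Heuristic (assumption): keep IDs named AF/AC/AN or ending in _AF/_AC/_AN."""
    return [i for i in ids if re.search(r'(^|_)(AF|AC|AN)$', i)]


def info_ids_from_gz(path):
    """Read the INFO IDs from a gzip-compressed VCF on disk."""
    with gzip.open(path, "rt") as handle:
        return info_ids(handle)


# Made-up header for illustration only; EUR_AF is a hypothetical tag.
example = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=SVTYPE,Number=1,Type=String,Description="SV type">',
    '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">',
    '##INFO=<ID=EUR_AF,Number=A,Type=Float,Description="Population allele frequency">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;AF=0.01",
]
freq_fields = frequency_like(info_ids(example))
print(freq_fields)
```

For the downloaded file, `frequency_like(info_ids_from_gz("gnomad_v2_sv.sites.vcf.gz"))` would list whichever frequency-like fields the header actually declares.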

J35P312 commented 5 years ago

Thanks! Yes, it does have a lot of frequency information! I have tested on a single file using SVDB as the annotation tool, and it seems to work relatively well! I will make some plots next week =P
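For readers unfamiliar with frequency annotation of SVs, the general idea can be sketched as matching each query variant against database entries of the same type by reciprocal overlap and taking the frequency of the matches. This is only an illustration of that general technique under stated assumptions, not SVDB's actual implementation: the function names, the dictionary layout, the 0.6 threshold, and the choice to report the highest matching frequency are all hypothetical.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smallest fraction of either interval that is covered by the shared region."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def annotate_frequency(query, database, min_overlap=0.6):
    """Return the highest allele frequency among matching database entries.

    A database entry matches when it is on the same chromosome, has the same
    SV type, and reaches the reciprocal-overlap threshold (assumed value).
    Returns 0.0 when nothing matches.
    """
    best = 0.0
    for entry in database:
        if entry["chrom"] != query["chrom"] or entry["svtype"] != query["svtype"]:
            continue
        score = reciprocal_overlap(
            query["start"], query["end"], entry["start"], entry["end"]
        )
        if score >= min_overlap:
            best = max(best, entry["af"])
    return best


# Made-up entries for illustration only.
database = [
    {"chrom": "1", "start": 1000, "end": 2000, "svtype": "DEL", "af": 0.05},
    {"chrom": "1", "start": 5000, "end": 9000, "svtype": "DUP", "af": 0.20},
]
close_hit = {"chrom": "1", "start": 1100, "end": 2100, "svtype": "DEL"}
weak_hit = {"chrom": "1", "start": 1900, "end": 2900, "svtype": "DEL"}

print(annotate_frequency(close_hit, database))  # 900 bp shared, 0.9 reciprocal overlap
print(annotate_frequency(weak_hit, database))   # 100 bp shared, 0.1 reciprocal overlap
```

A linear scan like this is fine for a toy example; a real annotation run against hundreds of thousands of database entries would need an indexed lookup.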

J35P312 commented 5 years ago

I must add that it is a nice database, but it contains fewer than 500 000 SVs. The SweGen dataset contains 1 654 551 SVs and our old NGI db contains 1 044 410 SVs, so it is either filtered to contain only high-quality calls or heavily merged.
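Totals like these can be reproduced by counting the records in each VCF, optionally split by SV type to see where the databases differ. The snippet below is a minimal sketch assuming standard tab-separated VCF records with an `SVTYPE` key in the INFO column; the function name and the example records are made up for illustration.

```python
from collections import Counter


def count_by_svtype(vcf_lines):
    """Count VCF records per SVTYPE value; records without SVTYPE go to UNKNOWN."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        info = line.rstrip("\n").split("\t")[7]  # INFO is the 8th VCF column
        svtype = "UNKNOWN"
        for field in info.split(";"):
            if field.startswith("SVTYPE="):
                svtype = field[len("SVTYPE="):]
                break
        counts[svtype] += 1
    return counts


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=500",
    "1\t900\t.\tN\t<DEL>\t.\tPASS\tEND=1500;SVTYPE=DEL",
    "2\t300\t.\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=800",
    "3\t700\t.\tN\t<INV>\t.\tPASS\tEND=1200",
]
counts = count_by_svtype(example)
total = sum(counts.values())
print(total, dict(counts))
```

For a compressed file, the same function can be fed an open handle, for example `gzip.open(path, "rt")`.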

henrikstranneheim commented 5 years ago

Done