odelaneau / GLIMPSE

Low Coverage Calling of Genotypes
MIT License

How do you fetch files for all other chromosomes for GLIMPSE_validation #20

Closed vkulkarni-invitae closed 4 years ago

vkulkarni-invitae commented 4 years ago

Hello, thank you for providing such a wonderful tool.

The GLIMPSE validation folder contains two files that are already downloaded and prepared, namely {gnomad.genomes.r3.0.sites.chr22.isec.bcf, /EUR.validation.NA12878.chr22.bcf}. Would you mind adding to the README how to fetch and/or prepare these for other chromosomes?

Thank you. Vinayak.

srubinacci commented 4 years ago

Hi Vinayak,

Thanks for the interest in our method.

For the validation files, you could follow appendix A1 in the tutorial, changing the chromosome name: https://odelaneau.github.io/GLIMPSE/tutorial.html#run_concordance

For the allele frequency file, we used the publicly available gnomAD v3 files, which you can find here: https://gnomad.broadinstitute.org/downloads (note that we used the version 3 sites files; the website is a bit confusing). The files are very big, so we kept only the relevant information in the INFO field and intersected the sites with our validation data, since we needed to distribute the files. However, this pre-processing is not strictly necessary.
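As a rough illustration of the pre-processing described above (not necessarily the exact commands used for the shipped chr22 files), the Python sketch below builds bcftools command lines for a given chromosome: one to keep a single INFO tag, one to intersect the trimmed sites with the validation file. The output and validation names follow the chr22 files mentioned in this thread. The input name `gnomad.genomes.r3.0.sites.<chrom>.vcf.bgz`, the intermediate `.trimmed.bcf` name, and the `AF_nfe` tag are assumptions; check your downloaded file and its header before running anything.

```python
import shlex
from typing import List

# Assumed INFO tag for the ancestry of interest; verify it in the VCF header.
AF_TAG = "AF_nfe"


def trim_info_cmd(sites: str, out_bcf: str, af_tag: str = AF_TAG) -> List[str]:
    """bcftools command that removes every INFO tag except `af_tag`."""
    return ["bcftools", "annotate", "-x", f"^INFO/{af_tag}",
            "-Ob", "-o", out_bcf, sites]


def intersect_cmd(trimmed: str, validation: str, out_bcf: str) -> List[str]:
    """bcftools command that keeps records of `trimmed` also found in `validation`."""
    return ["bcftools", "isec", "-n=2", "-w1",
            "-Ob", "-o", out_bcf, trimmed, validation]


def commands_for(chrom: str) -> List[List[str]]:
    """Command lines for one chromosome, e.g. "chr21"."""
    sites = f"gnomad.genomes.r3.0.sites.{chrom}.vcf.bgz"        # assumed input name
    trimmed = f"gnomad.genomes.r3.0.sites.{chrom}.trimmed.bcf"  # hypothetical intermediate
    validation = f"EUR.validation.NA12878.{chrom}.bcf"          # assumed already indexed
    out = f"gnomad.genomes.r3.0.sites.{chrom}.isec.bcf"
    return [
        trim_info_cmd(sites, trimmed),
        ["bcftools", "index", "-f", trimmed],  # isec needs indexed inputs
        intersect_cmd(trimmed, validation, out),
        ["bcftools", "index", "-f", out],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for n in range(1, 23):
        for cmd in commands_for(f"chr{n}"):
            print(shlex.join(cmd))
```

The script only prints the commands; nothing is executed until you paste them into a shell or pass them to `subprocess.run`.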

I agree with you, and we should probably add a README in the future, especially for the pre-processing of the gnomad v3 files. Hope this helps.

Best wishes, Simone

vkulkarni-invitae commented 4 years ago

Many thanks. So it seems that if file size is not a concern, we could just keep the entire gnomAD file, and, given the ancestry in the concordance step, it would pick the right entries from the file?

srubinacci commented 4 years ago

Exactly. Just select the appropriate AF field with the "info_af" parameter of GLIMPSE_concordance.
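A minimal sketch of what that could look like, written as a Python helper that assembles the command line. Only the "info_af" parameter is confirmed in this thread. The `--input` and `--output` option names, the `AF_nfe` tag, and the output prefix are assumptions for illustration, and other options the tool may require are omitted, so check `GLIMPSE_concordance --help` for your version.

```python
import shlex
from typing import List


def concordance_cmd(list_file: str, out_prefix: str, af_tag: str) -> List[str]:
    """Assemble a GLIMPSE_concordance call that reads `af_tag` from the frequency file.

    `--input` and `--output` are assumed option names; `--info_af` is the
    parameter mentioned in this thread.
    """
    return ["GLIMPSE_concordance",
            "--input", list_file,
            "--output", out_prefix,
            "--info_af", af_tag]


if __name__ == "__main__":
    # "AF_nfe" is an assumed tag name; use whichever AF field matches your samples.
    print(shlex.join(concordance_cmd("concordance.lst", "concordance_out", "AF_nfe")))
```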

Best wishes,

Simone

dangcaptkd2 commented 2 months ago

Hi there,

Thank you for offering such an excellent tool.

It seems that the link you provided above https://odelaneau.github.io/GLIMPSE/tutorial.html#run_concordance is no longer accessible. Could you please provide an updated link? I'm also interested in learning how to fetch and/or prepare the files listed in concordance.lst in the QUILT2 tutorial.

Thank you!