odelaneau / GLIMPSE

Low Coverage Calling of Genotypes
MIT License

Create reference #194

Closed: Sondr11 closed this issue 8 months ago

Sondr11 commented 9 months ago

Hi. Thank you for your response. I have read the GLIMPSE manual and have a few questions:

  1. First of all, I have to create a reference panel, preferably a population-specific one. Am I understanding this correctly?
  2. I plan to use 20,000 NIPT samples as a reference panel. Your instructions describe creating a reference panel from 1000 Genomes Project data, but I can't find instructions for creating a panel from another data source, in my case about 20,000 NIPT samples in FASTQ format. Can you suggest how to create a reference panel from these 20,000 NIPT samples, starting from FASTQ files?

Thank you so much.

Sondr11 commented 9 months ago

Please help me!

srubinacci commented 9 months ago

Hi,

  1. Yes. But as a rule of thumb, the more samples the better, and 1000GP is pretty good for most cases.
  2. You might not be able to create a reference panel from 20,000 NIPT samples, because to produce accurate genotypes from such data you might need to impute in the first place. Depending on the population, you might want to perform imputation without a reference panel. One option would be to use the GLIMPSE v1.1.1 model, so that you can condition on all the target samples and still use the 1000 Genomes reference panel. Another would be to use other (reference-free) tools such as STITCH.
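For the GLIMPSE v1.1.1 route, a rough sketch of how the per-chunk imputation commands could be assembled is given here. It assumes the NIPT FASTQ files have already been aligned and turned into a single multi-sample VCF of genotype likelihoods at the 1000GP sites, and that a chunk file has been produced by `GLIMPSE_chunk`. The option names (`--input`, `--reference`, `--map`, `--input-region`, `--output-region`, `--output`) and the chunk file column order are written from memory of the GLIMPSE1 tutorial and should be checked against the documentation. The helper name `build_phase_commands`, all file names, and the example coordinates are hypothetical.

```python
def build_phase_commands(chunk_lines, target_vcf, reference_vcf,
                         genetic_map, out_dir="imputed"):
    """Build one GLIMPSE_phase argument list per chunk line.

    Assumption: each chunk line is whitespace-separated, with the chunk ID
    in column 1, the chromosome in column 2, the buffered input region in
    column 3 and the output region in column 4.
    """
    commands = []
    for line in chunk_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chunk_id, chrom, input_region, output_region = fields[:4]
        out_path = f"{out_dir}/{chrom}.{chunk_id}.bcf"
        commands.append([
            "GLIMPSE_phase",
            "--input", target_vcf,          # all target samples in one file
            "--reference", reference_vcf,   # e.g. a 1000GP panel
            "--map", genetic_map,
            "--input-region", input_region,
            "--output-region", output_region,
            "--output", out_path,
        ])
    return commands


# Illustrative chunk lines; the coordinates are made up for this example.
example_chunks = [
    "0 chr22 chr22:1-2200000 chr22:1-2000000",
    "1 chr22 chr22:1800001-4200000 chr22:2000001-4000000",
]

commands = build_phase_commands(
    example_chunks,
    target_vcf="nipt_all_samples.chr22.vcf.gz",    # hypothetical file name
    reference_vcf="1000GP.chr22.bcf",              # hypothetical file name
    genetic_map="chr22.gmap.gz",                   # hypothetical file name
)

for cmd in commands:
    print(" ".join(cmd))
```

The sketch only prints the commands, so they can be reviewed first and then passed to a job scheduler or to `subprocess.run`. Putting all target samples in one input file is what allows the model to condition on them jointly; the imputed chunks would then still need to be ligated per chromosome.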

More info is available here (and in related preprints):

https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-021-00999-4

https://www.medrxiv.org/content/10.1101/2023.11.26.23299022v1

Best,

Simone