rwdavies / QUILT

GNU General Public License v3.0

Hello, I have some question? #1

Open chulghim opened 3 years ago

chulghim commented 3 years ago

My name is Namcheol Kim. I am going to run an LPS imputation test using the 1kgp3 panel.

My input : sample1_R1.fastq.gz sample2_R2.fastq.gz

Panel : 1kgp3 panel

I read this page (https://github.com/rwdavies/QUILT) and have some questions.

1) Where can I download the genetic_map_file, or how do I create it?

2) How do I create the 3 bam files (xxx.haplotagged.bam, xxx.ont.bam, xxx.illumina.bam)?

3) How do I create xxx.phasefile.txt?

4) How do I create xxx.posfile.txt?

Sorry for my poor English.

Thanks.

rwdavies commented 3 years ago

Hi,

Apologies, for some reason I wasn't watching the repo before, so didn't notice this issue until now when I checked manually.

  1. A genetic map file for humans can be obtained from a few sources. Usually it is provided by whatever source you used to download the reference data. In this example file I give an FTP location for a 1000 Genomes based one: https://github.com/rwdavies/QUILT/blob/master/example/QUILT_hla_reference_panel_construction.Md#recombination-rate. I often use CEU for European work, but perhaps an African population is more suitable for arbitrary use, regardless of the background of the subject being imputed (most methods, including QUILT, should be fairly robust to this).

  2. If your samples are available as fastq, you'll need to map them and then pre-process the result to obtain usable bam files. This is a fair bit of work if you haven't done it before. Tutorials are available on sites like this: https://gatk.broadinstitute.org/hc/en-us/articles/360039568932--How-to-Map-and-clean-up-short-read-sequence-data-efficiently. Alternatively, read the (now somewhat old, but pretty good) GATK flagship paper from a few years ago: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083463/

  3. xxx.phasefile.txt is optional. If you have truth data for your sample and it includes phasing, you could make this file. Most of the time you won't have it, so it can be omitted.

  4. xxx.posfile.txt specifies the sites you want to impute. It is only necessary if you're also giving the phasefile; if you're not, you don't need to give this file. In any case, it would probably be the set of sites from your reference panel (in your case 1kgp3).
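
As a rough illustration of points 2 and 4 above, here is a minimal Python sketch. It is not taken from the QUILT documentation. It assumes bwa and samtools as the mapper and BAM toolkit (the thread names neither) and leaves out the clean-up steps, such as duplicate marking, that the linked GATK tutorial covers. It also assumes the posfile is a headerless, tab-separated list of chromosome, position, ref and alt restricted to biallelic SNPs, so please check the QUILT README for the exact format. The function names, the read-group string and the file names are all hypothetical.

```python
import shlex


def mapping_commands(sample, ref_fasta, fastq_r1, fastq_r2, threads=4):
    """Build shell commands that map paired-end reads into a sorted, indexed BAM.

    Assumption: bwa and samtools are installed and ref_fasta has already been
    indexed with `bwa index`. Nothing is executed here; the caller decides
    how to run the returned strings.
    """
    q = shlex.quote
    out_bam = f"{sample}.bam"
    # bwa expects a literal backslash-t between read-group fields.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    map_and_sort = (
        f"bwa mem -t {threads} -R {q(read_group)} "
        f"{q(ref_fasta)} {q(fastq_r1)} {q(fastq_r2)} "
        f"| samtools sort -@ {threads} -o {q(out_bam)} -"
    )
    return [map_and_sort, f"samtools index {q(out_bam)}"]


def posfile_rows(vcf_lines):
    """Turn uncompressed VCF text lines into assumed posfile rows.

    Assumption: one row per biallelic SNP, formatted as
    chromosome<TAB>position<TAB>ref<TAB>alt, with no header.
    """
    bases = {"A", "C", "G", "T"}
    rows = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip VCF meta and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt = fields[:5]
        # Keep single-base ref and alt only; this drops indels and
        # multi-allelic records such as ALT = "G,T".
        if ref in bases and alt in bases:
            rows.append("\t".join([chrom, pos, ref, alt]))
    return rows


if __name__ == "__main__":
    for cmd in mapping_commands(
        "sample1", "ref.fa", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz"
    ):
        print(cmd)
```

The commands are returned as strings rather than run directly, so they can be inspected or pasted into a job script first. For the posfile, the lines of a reference panel VCF (for example read via `gzip.open(path, "rt")`) could be passed to `posfile_rows` and the result written out one row per line.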

chulghim commented 3 years ago

Thank you for the reply. If I have any questions while doing it, I'll ask again.

Best,

chul
