DReichLab / EIG

Eigen tools by Nick Patterson and Alkes Price lab

.snp file format #84

Open Liuyh275 opened 1 year ago

Liuyh275 commented 1 year ago

I want to understand the column order of the v54.1_1240K_public.snp file in Eigenstrat format. Do the columns represent "SNP ID", "Chromosome", "Position in centimorgans", "Base-pair coordinate", "ALT allele", "REF allele"? Or are the last two columns "REF allele" and "ALT allele"? Output of head(file):

    rs3094315 1 0.020130 752566 G A
    rs12124819 1 0.020242 776546 A G
    rs28765502 1 0.022137 832918 T C
    rs7419119 1 0.022518 842013 T G
    rs950122 1 0.022720 846864 G C

File link: https://reichdata.hms.harvard.edu/pub/datasets/amh_repo/curated_releases/V54/V54.1/SHARE/public.dir/v54.1_1240K_public.snp

bumblenick commented 1 year ago

I prefer to call the last two columns "count allele" and "alt allele". For a SNP with these columns as "A T", the genotype file counts the A. Good practice is to include the human reference in the sample list (.ind), as there is no guarantee that the count allele matches the reference.

Nick Patterson
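
To make the answer above concrete, here is a minimal Python sketch of how one might read a six-column .snp line and interpret a genotype code against it. The names `SnpRecord`, `parse_snp_line`, and `genotype_alleles` are hypothetical and not part of EIG. The sketch assumes an unpacked EIGENSTRAT .geno file in which each character is 0, 1, or 2 (copies of the count allele, i.e. the fifth column) or 9 (missing); a packed genotype file would need to be converted first.

```python
from typing import NamedTuple, Optional


class SnpRecord(NamedTuple):
    """One line of a six-column .snp file (field names are illustrative)."""
    snp_id: str
    chrom: str
    genetic_pos: float   # third column: genetic position
    physical_pos: int    # fourth column: base-pair coordinate
    count_allele: str    # fifth column: the allele counted in the .geno file
    alt_allele: str      # sixth column: the other allele


def parse_snp_line(line: str) -> SnpRecord:
    """Split a whitespace-delimited .snp line into named fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    snp_id, chrom, gpos, ppos, a1, a2 = fields
    return SnpRecord(snp_id, chrom, float(gpos), int(ppos), a1, a2)


def genotype_alleles(rec: SnpRecord, code: str) -> Optional[str]:
    """Translate an assumed unpacked .geno code into a pair of alleles.

    '0', '1', '2' are taken as the number of copies of the count allele;
    '9' is taken as missing and returns None.
    """
    if code == "9":
        return None
    if code not in ("0", "1", "2"):
        raise ValueError(f"unexpected genotype code: {code!r}")
    n = int(code)
    return rec.count_allele * n + rec.alt_allele * (2 - n)


rec = parse_snp_line("rs3094315 1 0.020130 752566 G A")
print(rec.count_allele, rec.alt_allele)
print(genotype_alleles(rec, "2"), genotype_alleles(rec, "1"), genotype_alleles(rec, "0"))
```

For the first line of the file shown in the question, this treats G as the count allele and A as the alt allele. Whether G is also the human reference allele at that site is a separate question, which is why including the reference as a sample is suggested above.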

