andrewparkermorgan / argyle

An R package for import, QC and analysis of Illumina Infinium genotyping arrays

How to create the marker map from the SNP_map.txt file #8

Closed: amizeranschi closed this issue 5 years ago

amizeranschi commented 5 years ago

Hello,

I'm trying to read data from Neogen's GGP Bovine 50K array into argyle. Going by the documentation for read.beadstudio, I can't work out how to set the "A1" (REF) and "A2" (ALT) values in the snps marker map.

The SNP_map.txt file looks like this:

Index  Name                Chromosome  Position  GenTrain Score  SNP    ILMN Strand  Customer Strand  NormID
1      ARS-BFGL-BAC-10919  14          31267746  0.7455          [A/G]  TOP          TOP              0
2      ARS-BFGL-BAC-10975  10          21225382  0.7042          [A/G]  TOP          TOP              0
3      ARS-BFGL-BAC-11000  10          79252023  0.8459          [T/G]  BOT          BOT              0
4      ARS-BFGL-BAC-11003  10          80410977  0.8801          [T/C]  BOT          BOT              0
5      ARS-BFGL-BAC-11025  10          84516867  0.856           [T/G]  BOT          BOT              0
6      ARS-BFGL-BAC-11044  1           12805406  0.8861          [T/C]  BOT          BOT              0
7      ARS-BFGL-BAC-11193  1           29303546  0.8123          [T/C]  BOT          TOP              0
8      ARS-BFGL-BAC-11215  12          90704572  0.7441          [A/G]  TOP          TOP              0
9      ARS-BFGL-BAC-11218  1           24549757  0.8716          [A/G]  TOP          TOP              0

How should the SNP column above be interpreted? I noticed that the alleles listed there are sometimes the reverse complement of the alleles that actually appear in the FinalReport.txt file.

How can I get the A1 and A2 values for the marker map, using the columns SNP, ILMN Strand and Customer Strand from the SNP_map.txt file?

amizeranschi commented 5 years ago

Answered here: https://github.com/andrewparkermorgan/argyle/issues/5#issuecomment-474883197.
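
For readers who land on this thread first, here is a minimal sketch of the strand issue raised in the question. It is not the content of the linked answer and it does not use argyle. It assumes that the SNP column is written on the ILMN Strand (the excerpt is consistent with this: every TOP row is [A/G], every BOT row starts with T) and that the alleles you want are on the Customer Strand, so a marker is complemented only when the two strand columns disagree. Whether the result matches FinalReport.txt depends on which strand the report was exported in, so check a few markers by hand. The function names are hypothetical, and the rows are copied from the excerpt above.

```python
import re

# Base complements; allele order (first/second) is preserved when switching strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parse_snp(snp):
    """Split a SNP_map.txt 'SNP' value such as '[A/G]' into its two alleles."""
    match = re.fullmatch(r"\[([ACGT])/([ACGT])\]", snp.strip())
    if match is None:
        # Indels such as [I/D] or malformed values are not handled in this sketch.
        raise ValueError(f"unsupported SNP value: {snp!r}")
    return match.group(1), match.group(2)


def alleles_on_customer_strand(snp, ilmn_strand, customer_strand):
    """Return (A1, A2), ASSUMING the SNP column is given on the ILMN strand."""
    a1, a2 = parse_snp(snp)
    if ilmn_strand != customer_strand:
        # Strands disagree: complement both alleles.
        a1, a2 = COMPLEMENT[a1], COMPLEMENT[a2]
    return a1, a2


# (Name, SNP, ILMN Strand, Customer Strand) taken from the excerpt in the question.
rows = [
    ("ARS-BFGL-BAC-10919", "[A/G]", "TOP", "TOP"),
    ("ARS-BFGL-BAC-11000", "[T/G]", "BOT", "BOT"),
    ("ARS-BFGL-BAC-11193", "[T/C]", "BOT", "TOP"),
]

marker_alleles = {
    name: alleles_on_customer_strand(snp, ilmn, customer)
    for name, snp, ilmn, customer in rows
}

for name, (a1, a2) in marker_alleles.items():
    print(name, a1, a2)
```

Only ARS-BFGL-BAC-11193 is changed by this rule, because it is the only row in the excerpt where ILMN Strand and Customer Strand differ. The resulting pairs could then be written into the A1 and A2 columns of the marker map, subject to the assumptions above.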