Closed niyati1211 closed 6 months ago
The way GWASTools is written, snpID and chromosome must always be integers. From the man page of GdsGenotypeReader:
The GDS file must contain the following variables:
• 'snp.id': a unique integer vector of snp ids
• 'snp.chromosome': integer chromosome codes
Default values for chromosome codes are 1-22=autosome, 23=X, 24=XY, 25=Y, 26=M. The defaults may be changed with the arguments ‘autosomeCode’, ‘XchromCode’, ‘XYchromCode’, ‘YchromCode’, and ‘MchromCode’.
Using getChromosome(gds_object, char=TRUE) will return a character vector with X instead of 23, etc. I suggest you create a data frame with integer snpID, character chromosome and rsID columns, and merge it with the results of assocRegression (see the documentation for base R merge or dplyr left_join).
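The merge suggested above could look roughly like the sketch below. It uses only base R, and both data frames are made-up stand-ins: `res` mimics the assocRegression output (only the integer snpID column matters; the `Est` values are invented), and `snp_annot` is a hypothetical annotation table with placeholder rsIDs. With the real GDS file the annotation columns would presumably come from the GWASTools accessors named in the comments; the variable name "snp.rs.id" is an assumption about how snpgdsPED2GDS stored the rsIDs, so check the contents of your file first.

```r
# Made-up stand-in for the assocRegression output: integer snpID plus a
# placeholder result column (values are invented for illustration).
res <- data.frame(
  snpID = 1:5,
  Est = c(0.12, -0.03, 0.40, 0.08, -0.21)
)

# Hypothetical annotation table. With the real GDS these columns might come from:
#   snpID      <- getSnpID(gds_object)
#   chromosome <- getChromosome(gds_object, char = TRUE)
#   position   <- getPosition(gds_object)
#   rsID       <- getVariable(gds_object, "snp.rs.id")  # assumed variable name
snp_annot <- data.frame(
  snpID = 1:5,
  chromosome = c("1", "1", "2", "X", "X"),
  position = c(10583L, 20211L, 33456L, 150000L, 250000L),
  rsID = c("rsA", "rsB", "rsC", "rsD", "rsE"),  # placeholder IDs, not real SNPs
  stringsAsFactors = FALSE
)

# Keep every row of the results and attach the annotation by integer snpID.
res_annot <- merge(snp_annot, res, by = "snpID", all.y = TRUE)
print(res_annot)
```

The GDS file itself keeps its integer snpID, and the rsID travels alongside the results as an ordinary column. dplyr's left_join(res, snp_annot, by = "snpID") would give the same rows with the result columns first.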
I am running assocRegression under GWASTools. My genotype data is in ped/map format.
I ran the following code to create the gds object:
geno_gds <- snpgdsPED2GDS(ped.fn = "random35k_new.ped", map.fn = "random35k_new.map", out.gdsfn = "/random35k_new.gds")
gds_object <- GdsGenotypeReader("random35k_new.gds")
When I run assocRegression:
In my output "res", the values under the snpID column are integers (1,2,3,4,5), which I believe are from my GDS object. The integers are hard to interpret and don't tell me anything about the location/ref/alt allele of the SNP. I am not sure why the snpIDs are converted to integers when my map file does include the rsIDs...
I tried using SnpAnnotationDataFrame, but I get various errors, one being that chromosome must be an integer, even though I request integer rather than character codes (chromosome <- getChromosome(gds_object, char=FALSE)).
How should I go about running this analysis so that in my output file, the snpID column values are the rsIDs from my map file? Is there a way to update the snpID object within the GDS? Any help will be appreciated!
Thank you!