el-rs closed this 2 months ago
Also am I right in assuming the SNPs in genotype matrix are ordered the same as input list?
Yes
How would one map the genotypes back to individual sample EIDs?
Use match() with the EIDs from both the CSV file and the sample file.
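A minimal sketch of the match() step, using toy data in base R. The vectors and column names here (sample_eid, pheno, eid, height) are hypothetical stand-ins for the EIDs read from the sample file and the CSV file; real files will need their own reading and cleaning first.

```r
# Hypothetical EIDs, in the order they appear in the sample file
sample_eid <- c(1001, 1002, 1003, 1004, 1005)

# Hypothetical phenotype CSV, in a different order and with one unknown EID
pheno <- data.frame(eid    = c(1004, 1001, 9999, 1003),
                    height = c(170, 165, 180, 158))

# Position of each CSV individual within the sample file (NA if absent)
ind_sample <- match(pheno$eid, sample_eid)

# Keep only the individuals that were found in the sample file
keep       <- !is.na(ind_sample)
pheno_kept <- pheno[keep, ]
ind_row    <- ind_sample[keep]  # candidate row indices for reading the BGEN
```

Here ind_row holds the positions in the sample file, and pheno_kept is the CSV restricted to the same individuals in the same order.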
The only thing I could store in the $fam is what is provided to me: the indices (not the EIDs) of the samples read from the BGEN/sample. Would that help? I am not sure.
Usually, I fill the $fam directly after reading, and then save the expanded object.
Cf. e.g. https://github.com/privefl/paper-infer/blob/main/code/prepare-geno-simu.R#L49-L52
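As a rough illustration of filling $fam by hand, the sketch below uses a plain list as a toy stand-in for the object returned after reading; it is not the code from the link above. It assumes that the genotype rows follow the order of the indices passed when reading, and all names and values are hypothetical.

```r
# Hypothetical EIDs in sample-file order, and the row indices used for reading
sample_eid <- c(1001, 1002, 1003, 1004, 1005)
ind_row    <- c(4L, 1L, 3L)

# Toy stand-in for the object obtained after reading (3 samples x 2 variants);
# a real object would hold a file-backed matrix rather than a plain one
obj <- list(
  genotypes = matrix(c(0, 1, 2, 2, 0, 1), nrow = 3),
  map       = data.frame(rsid = c("rs1", "rs2")),
  fam       = NULL
)

# One row of fam per genotype row, assuming rows are in ind_row order
obj$fam <- data.frame(eid = sample_eid[ind_row], ind_sample = ind_row)

# Sanity check: fam and genotypes must describe the same number of samples
stopifnot(nrow(obj$fam) == nrow(obj$genotypes))
```

With a real object from bigsnpr, the modified object would then need to be saved (for example with snp_save()) so that the filled $fam persists.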
This helps! To clarify, when using the sample indices filter, would the $genotype object contain samples in the same order as given in the ind_row argument? I ask because there is no specific ID variable by which to merge the genotypes with the samples.
Also, what could be the reason $fam is not getting generated?
The dosages obtained in $genotype are for allele2?
Thanks again!
ind_row
Thanks for this! I'm still not sure how to get the correct allele the dosages are referring to. Is it always the reference allele or alternate? Could you please provide further guidance on this? Thank you, really appreciate your help!
I do not have a definitive answer. You should check against some reference, such as known GWAS hits or simply allele frequencies.
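One way to run the allele-frequency check is sketched below in base R with toy data. The dosage matrix G and the reference frequencies ref_freq_a2 are made up for illustration; in practice the reference frequencies would come from an external source for the same variants, and variants with a frequency near 0.5 are uninformative for this check.

```r
# Toy dosage matrix: 8 individuals (rows) x 2 variants (columns), values in [0, 2]
G <- matrix(c(0, 0, 1, 0, 1, 0, 0, 0,
              2, 2, 1, 2, 2, 1, 2, 2), ncol = 2)

# Hypothetical reference frequencies of allele2 for the same two variants
ref_freq_a2 <- c(0.12, 0.88)

# Frequency of whichever allele the dosages are counting
obs_freq <- colMeans(G) / 2

# Compare the observed frequency to the allele2 and allele1 reference frequencies
dist_a2 <- abs(obs_freq - ref_freq_a2)
dist_a1 <- abs(obs_freq - (1 - ref_freq_a2))

# TRUE where the dosages look like counts of allele2
counts_a2 <- dist_a2 < dist_a1
```

If counts_a2 is TRUE for nearly all variants with frequencies well away from 0.5, the dosages most likely count allele2; if it is mostly FALSE, they most likely count allele1.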
Hi, I'm using the snp_readBGEN() function to read in UKB imputed data. This generates the $genotype and $map objects, but not $fam. How would one map the genotypes back to individual sample EIDs? Also am I right in assuming the SNPs in genotype matrix are ordered the same as input list? Thanks!