zhengxwen / SNPRelate

R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
http://www.bioconductor.org/packages/SNPRelate

Output the LD-pruned GDS from SNPRelate for phylogenetic analysis in SNPhylo #35

Closed Tman3 closed 6 years ago

Tman3 commented 6 years ago

I ran PCA on a VCF file after LD pruning it in SNPRelate. I would like to know whether I can output the LD-pruned GDS file from SNPRelate so that I can use it as input for SNPhylo. My goal is to use the same LD-pruned GDS file, or the revised list of SNPs, in the SNPhylo analysis. Your advice is appreciated.

zhengxwen commented 6 years ago

I don't have such a function currently. But you can call snpgdsGDS2BED() with a set of selected SNPs to create a BED file, and then call snpgdsBED2GDS() to import the genotypes into a new GDS file.

nute11a commented 6 years ago

Hello! I am trying to do the same thing as Tman3, so a function would definitely be great! I tried what was suggested (converting the selected SNPs to a BED file and then importing the genotypes), but got this error:

Converting from GDS to PLINK binary PED:
Working space: NUM samples, 6826 NUM SNPs
Error in if ((opt$autosome.start == 1) & (opt$autosome.end == 22)) { : missing value where TRUE/FALSE needed

The command I used:

snpset <- snpgdsLDpruning(genofile, ld.threshold=0.2, autosome.only=FALSE)
snpset.id <- unlist(snpset)
snpgdsGDS2BED(genofile, "bed.fn", snp.id=snpset.id)
# ^^ Error here.

I can't seem to find much about resolving this error. Have you seen it before, and do you have any suggestions moving forward?

Thanks!

zhengxwen commented 6 years ago

See the function snpgdsCreateGenoSet() in the SNPRelate package. I have added it to the package for this purpose.
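
For readers following along, here is a minimal sketch of how snpgdsCreateGenoSet() might be used to write the LD-pruned SNPs to a new GDS file. It is not taken from the thread: it assumes the example GDS file bundled with SNPRelate (snpgdsExampleFileName()) in place of your own data, and the output name pruned.gds and the seed are hypothetical choices. The source file is closed before the subset is created, on the assumption that the function opens it by file name.

```r
library(SNPRelate)

# Assumption: the example GDS file shipped with SNPRelate stands in for
# your own GDS file (e.g. one created from a VCF with snpgdsVCF2GDS()).
src.fn <- snpgdsExampleFileName()
genofile <- snpgdsOpen(src.fn)

# LD pruning, using the same arguments as in the thread; the result is a
# list of selected SNP IDs, one element per chromosome.
set.seed(1000)  # hypothetical seed, only to make the selection repeatable
snpset <- snpgdsLDpruning(genofile, ld.threshold=0.2, autosome.only=FALSE)
snpset.id <- unlist(snpset, use.names=FALSE)

# Close the source file first, since snpgdsCreateGenoSet() takes file names.
snpgdsClose(genofile)

# Write a new GDS file containing only the pruned SNPs.
dest.fn <- file.path(tempdir(), "pruned.gds")  # hypothetical output path
snpgdsCreateGenoSet(src.fn, dest.fn, snp.id=snpset.id)

# Check how many SNPs ended up in the new file.
pruned <- snpgdsOpen(dest.fn)
n.pruned <- length(read.gdsn(index.gdsn(pruned, "snp.id")))
snpgdsClose(pruned)
print(n.pruned)
```

The resulting file can then be tried as input to other tools that read SNP GDS files; whether SNPhylo accepts it unchanged is not confirmed in this thread.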