zhengxwen / SNPRelate

R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
http://www.bioconductor.org/packages/SNPRelate

Arbitrary chromosome numbers #15

Closed mycecilia closed 9 years ago

mycecilia commented 9 years ago

Hi,

I have data from plants with 33 chromosomes. PLINK has a feature that allows more than 26 chromosomes. Could you include that in SNPRelate? I ran snpgdsLDpruning(), and it automatically excluded all chromosomes after 22.

Thank you.

zhengxwen commented 9 years ago

How do you import the genotypes? From PLINK BED files? If so, the import function is:

snpgdsBED2GDS(bed.fn, fam.fn, bim.fn, out.gdsfn, family=FALSE, snpfirstdim=NA,
    compress.annotation="ZIP_RA.max", compress.geno="", option=NULL,
    cvt.chr=c("int", "char"), cvt.snpid=c("auto", "int"), verbose=TRUE)

The argument cvt.chr="char" allows any chromosome coding.
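
For illustration, a minimal sketch of such an import might look like the following. The file names (plants.bed, plants.fam, plants.bim, plants.gds) are hypothetical placeholders, and the check at the end assumes the chromosome codes are stored in the standard "snp.chromosome" node of the GDS file.

```r
library(gdsfmt)
library(SNPRelate)

# Hypothetical PLINK file names; replace with your own paths.
bed.fn <- "plants.bed"
fam.fn <- "plants.fam"
bim.fn <- "plants.bim"
gds.fn <- "plants.gds"

# cvt.chr="char" stores the chromosome codes as character strings,
# so codes outside the usual human range are kept as given in the BIM file.
snpgdsBED2GDS(bed.fn, fam.fn, bim.fn, out.gdsfn = gds.fn, cvt.chr = "char")

# Check which chromosome codes ended up in the GDS file.
genofile <- snpgdsOpen(gds.fn)
chrom <- read.gdsn(index.gdsn(genofile, "snp.chromosome"))
print(table(chrom))
snpgdsClose(genofile)
```

If the table lists all 33 chromosome codes, the import kept them and any later exclusion happens at the analysis step.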

smgogarten commented 9 years ago

snpgdsLDpruning has a default option autosome.only=TRUE. You can set that to FALSE or a vector to include any set of chromosomes. Also see snpgdsOption() to change the definition of autosomes - the default is 1-22 but you can set it to something else.
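
As a rough sketch of the first suggestion, assuming a GDS file named plants.gds (hypothetical) has already been created from the PLINK files, LD pruning over all chromosomes could be run like this. The ld.threshold value is only an example.

```r
library(SNPRelate)

# Hypothetical GDS file created earlier with snpgdsBED2GDS().
genofile <- snpgdsOpen("plants.gds")

# autosome.only=FALSE keeps SNPs on every chromosome instead of
# restricting the analysis to the default autosome range (1-22).
set.seed(1000)  # fix the seed so the pruning result is reproducible
snpset <- snpgdsLDpruning(genofile, autosome.only = FALSE,
                          ld.threshold = 0.2)

# The result is a list with one vector of SNP IDs per chromosome;
# flatten it to a single vector for downstream functions.
snp.id <- unlist(snpset, use.names = FALSE)
length(snp.id)

snpgdsClose(genofile)
```

The alternative is to redefine the autosome range, for example with snpgdsOption(autosome.end=33L) passed as the option argument of snpgdsBED2GDS() at conversion time. That usage is an assumption based on the function signatures, so check ?snpgdsOption before relying on it.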