zhengxwen / SNPRelate

R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
http://www.bioconductor.org/packages/SNPRelate

File format of groups file for Admixture analysis #36

Open Tman3 opened 6 years ago

Tman3 commented 6 years ago

Please advise what the proper format is for the "groups" file required for the Admixture analysis. According to the HClustering analysis, I may have 1 to 2 trees. I have a total of 19 samples in the input VCF file. I attempted to use the sample.id as a list for the groups file, but it returned an error message indicating there is a problem with the groups matrix.

zhengxwen commented 6 years ago

Show me the error message.

Tman3 commented 6 years ago

Here is the error message:

    prop <- snpgdsAdmixProp(RV, groups=groups)

    Error in solve.default(mat[-length(groups), ] - matrix(T.P, nrow = length(T.P), :
      'a' (20 x 1) must be square
    In addition: Warning message:
    In mat[-length(groups), ] - matrix(T.P, nrow = length(T.P), ncol = length(T.P), :
      Recycling array of length 1 in vector-array arithmetic is deprecated.
      Use c() or as.vector() instead.

    str(groups)
    List of 21
     $ A: chr "37"
     $ B: chr "38"
     $ C: chr "39"
     $ D: chr "40"
     $ E: chr "41"
     $ F: chr "42"
     $ G: chr "43"
     $ H: chr "44"
     $ I: chr "45"
     $ J: chr "46"
     $ K: chr "47"
     $ L: chr "48"
     $ M: chr "49"
     $ P: chr "51"
     $ Q: chr "52"
     $ R: chr "53"
     $ S: chr "54"
     $ T: chr "55"
     $ V: chr "57"
     $ N: chr "50"
     $ O: chr "56"

zhengxwen commented 6 years ago

It seems that there are too many groups in the variable groups. If you have three ancestral populations, you only need A, B and C in the variable groups.
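
For anyone landing here with the same error, here is a minimal sketch of what a three-population `groups` list could look like. The `str(groups)` output above shows one list element per sample, whereas the suggestion is one element per ancestral population, each holding the IDs of the samples that represent that population. The sample IDs are taken from the thread, but the population names and the assignment of samples to populations below are purely hypothetical and need to be replaced with your own reference samples.

```r
# Sample IDs as they appear in the str(groups) output in this thread.
sample.id <- as.character(37:57)

# HYPOTHETICAL ancestry labels, invented for illustration only.
# NA marks samples that are not reference samples for any population.
pop <- rep(NA_character_, length(sample.id))
pop[1:3] <- "A"
pop[4:6] <- "B"
pop[7:9] <- "C"

# Build one list element per ancestral population; each element is a
# character vector of sample IDs. split() drops the entries whose label is NA.
groups <- split(sample.id, pop)
str(groups)

# With SNPRelate loaded and `RV` being the eigen-analysis object used in
# the thread, the call would then be (left commented out so this sketch
# runs without the package):
# prop <- snpgdsAdmixProp(RV, groups = groups)
```

The point of the sketch is only the shape of the list: three named elements rather than 21, with samples of unknown or mixed ancestry left out of `groups`.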