xinghuq / KLFDAPC

KLFDAPC: Kernel local Fisher discriminant analysis of principal components (KLFDAPC) for large genomic data
https://xinghuq.github.io/KLFDAPC/

Error in matrix(1, N, M) : non-numeric matrix extent when running klfdapc #5

Closed MarcGose closed 2 years ago

MarcGose commented 2 years ago

Hello,

I wanted to try out klfdapc on my SNP dataset, following the tutorial on this GitHub. Everything seems to work fine until the klfdapc step, where I get the error "Error in matrix(1, N, M) : non-numeric matrix extent".

I tried assigning random labels as in the SARS-Cov-2 tutorial and this worked, so I reckon it must be something with my population codes, but I can't seem to figure out what it is. I would greatly appreciate any help with this.

This is my code:

popsex <- read.table("pop_file.info")
pop_file <- popsex$V1

samp.annot <- data.frame(pop_file)

snpgdsVCF2GDS(vcf.fn = "C:/Users/MarcG/OneDrive/Desktop/VCF/WSD_GLs.vcf", out.fn = "WSD_GLs_GDS")

(genofile <- snpgdsOpen("WSD_GLs_GDS", readonly = FALSE))

read.gdsn(index.gdsn(genofile, "sample.id"))
read.gdsn(index.gdsn(genofile, "snp.rs.id"))
read.gdsn(index.gdsn(genofile, "genotype"))
add.gdsn(genofile, "sample.annot", samp.annot)
pop_code <- read.gdsn(index.gdsn(genofile, "sample.annot"))
pop_code <- read.gdsn(index.gdsn(genofile, path="sample.annot/pop_file"))
pop_code=factor(pop_code,levels=unique(pop_code))

pcadata <- SNPRelate::snpgdsPCA(genofile, autosome.only = FALSE)

snpgdsClose(genofile)

normalize <- function(x) { return ((x - min(x)) / (max(x) - min(x))) }

pcanorm=apply(pcadata$eigenvect[,1:20], 2, normalize)

kmat <- kmatrixGauss(pcanorm,sigma=5)

klfdapc=KLFDA(kmat, pop_code, r=3, knn = 2)
Error in matrix(1, N, M) : non-numeric matrix extent

xinghuq commented 2 years ago

Hi, most likely the "pop_code" values returned by "pop_code <- read.gdsn(index.gdsn(genofile, "sample.annot"))" and "pop_code <- read.gdsn(index.gdsn(genofile, path="sample.annot/pop_file"))" are not unique pop factors. Please check these pop_code values first and see what they are. Please use your own real pop labels from "pop_code <- read.gdsn(index.gdsn(genofile, path="sample.annot/pop_file"))", and make sure that knn is smaller than the minimum number of individuals in any pop.
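The check described above can be sketched in base R. This is a minimal sketch, not part of KLFDAPC: `check_pop_labels` is a hypothetical helper and the example labels are made up. It counts the individuals per label and flags any pop whose size is not larger than `knn`. The last lines show that base R's `matrix()` produces the same message when an extent is `NULL`; whether `KLFDA()` reaches the error this way is an assumption, not something confirmed from its source.

```r
# Hypothetical helper (not part of KLFDAPC): summarise population labels
# and flag populations that are too small for the chosen knn.
check_pop_labels <- function(pop_code, knn) {
  pop_code <- droplevels(factor(pop_code))   # drop unused factor levels
  sizes <- table(pop_code)                   # individuals per population
  too_small <- names(sizes)[sizes <= knn]    # pops violating knn < pop size
  list(
    n_pops    = nlevels(pop_code),
    sizes     = sizes,
    too_small = too_small,
    ok        = !anyNA(pop_code) && length(too_small) == 0
  )
}

# Made-up labels: population "C" has a single individual.
labels <- c("A", "A", "A", "B", "B", "B", "C")
res <- check_pop_labels(labels, knn = 2)
print(res$sizes)
print(res$too_small)

# Base R illustration: a NULL extent triggers an error in matrix().
# The message is captured instead of stopping the script.
msg <- tryCatch(matrix(1, NULL, 2), error = function(e) conditionMessage(e))
print(msg)
```

Running the helper on the real `pop_code` before calling `KLFDA()` shows at a glance which pops would need more individuals or a smaller `knn`.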

Cheers,

Xinghu

MarcGose commented 2 years ago

Thanks Xinghu, that was it!

Just another short question: Given that requirement for the knn parameter, is there any way to incorporate a population that is represented by only one individual?

xinghuq commented 2 years ago

The answer is yes. One of the advantages of KLFDAPC is that it can preserve multimodal structures within pops, so you can assign a single individual to a higher-level metapop. For example, if you think it can be merged into a very close pop that is different from the other pops, label it as part of that pop when running klfdapc. This avoids removing the single individual. After you get the klfdapc features, you can then plot them using the true individual labels. This is one of the highlights of this method.
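The relabelling idea can be sketched as follows. This is a minimal sketch under assumptions: `merge_small_pops` and `merge_map` are hypothetical names, the labels are made up, and the choice of which pop is "very close" is left to the user. Only the `KLFDA()` call shown in the comment comes from this thread.

```r
# Hypothetical helper (not part of KLFDAPC): relabel selected populations
# into a user-chosen neighbouring population for model fitting only.
merge_small_pops <- function(pop_code, merge_map) {
  fit_labels <- as.character(pop_code)
  hit <- fit_labels %in% names(merge_map)                # individuals to relabel
  fit_labels[hit] <- unname(merge_map[fit_labels[hit]])  # look up the target pop
  factor(fit_labels, levels = unique(fit_labels))
}

# Made-up labels: "C" is represented by one individual.
true_labels <- c("A", "A", "A", "B", "B", "B", "C")

# Assumption for illustration: "C" is judged closest to "B".
fit_labels <- merge_small_pops(true_labels, c(C = "B"))
print(table(fit_labels))

# fit_labels would then be used for fitting, as in the call from this thread:
#   klfdapc <- KLFDA(kmat, fit_labels, r = 3, knn = 2)
# true_labels stay untouched and can be used to colour the resulting plot.
```

Keeping `true_labels` separate from `fit_labels` means the single individual still appears under its own label in the final figure.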

Cheers,

Xinghu

MarcGose commented 2 years ago

Thank you so much for your help Xinghu! Excited to play more with this method soon.

Cheers,

Marc
