earosenthal closed this issue 2 years ago
I was not able to reproduce either of those errors. Can you try again using current versions of R, GENESIS, and associated packages (particularly data.table and Matrix)?
I have the updated versions and I am still running into the same problem. I am currently running on a shared Linux machine. I will try on my local computer and see if I run into similar issues.
I've tested it using RStudio with R 4.1.0, GENESIS_2.24.0, Matrix_1.4-0, and data.table_1.14.2, and I get the same results; see below. Any suggestions on what else might be going on?
Here is the code I am using and the different output I get:
#setup
library(data.table)
library(Matrix)
library(GENESIS)
sessionInfo()
id1 <- id2 <- seq(1:30)
id1 <- c(id1,29)
id2 <- c(id2,30)
kinship <- c(rep(1,30),0.232122593288146)
Trial 1
kin.dat <- as.data.frame(cbind(id1,id2,kinship))
colnames(kin.dat) <- c("ID1","ID2","value")
kin.mat.gen.sparse <- makeSparseMatrix(kin.dat,thresh=NULL)
results in
Using 30 samples provided
Identifying clusters of relatives...
2 relatives in 1 clusters; largest cluster = 2
Creating block matrices for clusters...
Error in bmerge(i, x, leftcols, rightcols, roll, rollends, nomatch, mult, :
Incompatible join types: x.ID1 (double) and i.ID1 (character)
Trial 2, specify that IDs are characters
kin.dat <- as.data.frame(cbind(as.character(id1),as.character(id2),kinship))
colnames(kin.dat) <- c("ID1","ID2","value")
kin.mat.gen.sparse <- makeSparseMatrix(kin.dat,thresh=NULL)
results in
Using 30 samples provided
Identifying clusters of relatives...
2 relatives in 1 clusters; largest cluster = 2
Creating block matrices for clusters...
Error in submat + t(submat) : non-numeric argument to binary operator
Trial 3, specify that IDs are characters and values are numeric
kin.dat <- as.data.frame(cbind(as.character(id1),as.character(id2),
as.numeric(kinship)))
colnames(kin.dat) <- c("ID1","ID2","value")
kin.mat.gen.sparse <- makeSparseMatrix(kin.dat,thresh=NULL)
results in
Using 30 samples provided
Identifying clusters of relatives...
2 relatives in 1 clusters; largest cluster = 2
Creating block matrices for clusters...
Error in submat + t(submat) : non-numeric argument to binary operator
I think I solved it. Instead of supplying a data.frame to the function, I can supply a data.table, and make sure the columns are of the correct types:
kin.dt <- data.table(as.data.frame(cbind(as.character(id1),
as.character(id2),
as.numeric(kinship))))
setnames(kin.dt,c("ID1","ID2","value"))
kin.dt[,value:=as.numeric(value)]
kin.mat.gen.sparse <- makeSparseMatrix(kin.dt,thresh=NULL)
Thanks for the reproducible example! In your Trials 2 and 3, the error occurs because cbind creates a matrix, and a matrix holds a single data type: when character and numeric arguments are mixed, everything is coerced to character. So when you convert it to a data.frame, value ends up being a character vector:
> kin.mat <- cbind(as.character(id1),as.character(id2), as.numeric(kinship))
> class(kin.mat)
[1] "matrix" "array"
> mode(kin.mat)
[1] "character"
> kin.dat <- as.data.frame(cbind(as.character(id1),as.character(id2), as.numeric(kinship)))
> colnames(kin.dat) <- c("ID1","ID2","value")
> lapply(kin.dat, class)
$ID1
[1] "character"
$ID2
[1] "character"
$value
[1] "character"
This should work:
kin.dat <- data.frame(ID1=as.character(id1), ID2=as.character(id2), value=kinship)
makeSparseMatrix(kin.dat)
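To compare the two constructions side by side, here is a small base-R sketch using the same toy vectors as in the reproducible example. It only inspects column classes and does not call makeSparseMatrix; the names via_cbind and via_df are made up for illustration.

```r
# Same toy inputs as in the reproducible example
id1 <- c(1:30, 29)
id2 <- c(1:30, 30)
kinship <- c(rep(1, 30), 0.232122593288146)

# cbind() builds a matrix first; mixing character and numeric
# arguments coerces every cell to character
via_cbind <- as.data.frame(cbind(as.character(id1), as.character(id2), kinship))
colnames(via_cbind) <- c("ID1", "ID2", "value")

# data.frame() stores each column separately, so each keeps its own type
via_df <- data.frame(ID1 = as.character(id1),
                     ID2 = as.character(id2),
                     value = kinship,
                     stringsAsFactors = FALSE)

# Compare the column classes of the two results
print(sapply(via_cbind, class))
print(sapply(via_df, class))
```

The value column is numeric only in the second result, which is the form the call above expects.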
Trial 1 is in fact a bug, and I think the fix is for the code to coerce ID1 and ID2 to character if they are supplied as numeric.
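Until the package does that coercion itself, it can be done on the caller's side. The following is only a sketch: coerce_kin_types is a hypothetical helper name, not part of GENESIS, and it assumes the input has the columns ID1, ID2 and value described in this thread.

```r
# Hypothetical helper (not part of GENESIS): coerce the ID columns to
# character and the value column to numeric before building the matrix.
coerce_kin_types <- function(kin) {
  stopifnot(all(c("ID1", "ID2", "value") %in% names(kin)))
  kin$ID1 <- as.character(kin$ID1)
  kin$ID2 <- as.character(kin$ID2)
  # Convert a factor by its labels, not by its integer level codes
  if (is.factor(kin$value)) kin$value <- as.character(kin$value)
  kin$value <- as.numeric(kin$value)
  kin
}

# Numeric IDs, as in Trial 1 (three rows only, for illustration)
kin.dat <- data.frame(ID1 = c(1, 2, 29),
                      ID2 = c(1, 2, 30),
                      value = c(1, 1, 0.232122593288146))
kin.fixed <- coerce_kin_types(kin.dat)
print(sapply(kin.fixed, class))

# With GENESIS loaded, the coerced table would then be passed on:
# kin.mat.gen.sparse <- makeSparseMatrix(kin.fixed, thresh = NULL)
```

Based on the suggestion above, passing character IDs this way should avoid the join-type error from Trial 1, though that has not been verified here against every GENESIS version.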
I'm using R version 4.0.4 with GENESIS_2.20.1. I am trying to make sure I can make a sparse kinship matrix for ~87K participants. I am starting with 30 participants, where only two are actually related. I am using the call
makeSparseMatrix(kin.dat,thresh=NULL)
where kin.dat is a data.frame containing the columns ID1, ID2 and value. Below are two of the things I tried and my diagnosis. My diagnosis may be incorrect, but I am hoping we can resolve the issue. I include a sample of what the input looks like at the bottom. In addition, I get the following warning when I load the GENESIS library, which might be pertinent:
Error in bmerge(i, x, leftcols, rightcols, roll, rollends, nomatch, mult, :
Incompatible join types: x.ID1 (double) and i.ID1 (character)
Calls: makeSparseMatrix ... .makeSparseMatrix_df -> [ -> [.data.table -> bmerge
Execution halted
ID1 <- as.character(ID1)
ID2 <- as.character(ID2)
values <- as.numeric(value)