Closed annennenne closed 5 years ago
Regarding the complete-data case in the MWE (btw, thanks for providing one!): (S)EM is not supposed to be used with complete data. I'll add some checks.
As for the missing values case: there is indeed a bug, related to how continuous variables are treated. I will try to fix it in the coming weeks, as soon as I have the chance, but I can't make promises on when it will actually be ready.
In the meantime, one workaround that might work is to discretize the dataset in advance; sorry for that.
Thanks for the quick reply!
I thought it would just use MMHC when there is no missing information, and I think this would be a nice default option.
I tried discretizing the data (minimal example below), but now I get a new error message:
Error in cliques[[parents.list[clique]]] :
attempt to select less than one element in get1index
Here's my example code:
# make discretized data
n <- 100
set.seed(123)
edata <- data.frame(Z = rnorm(n, mean = 10))
edata$X1 <- 0.5 * edata$Z + rnorm(n, mean = 0)
edata$X3 <- rnorm(n, mean = 15)
edata$X2 <- edata$X3 + rnorm(n, mean = 5)
edata$Y <- edata$X1 + edata$X2 + edata$X3 - edata$Z + rnorm(n, mean = 10)
edata_d <- as.data.frame(sapply(edata, function(x) as.numeric(cut(x, breaks = 6))))
edata_d_wm <- edata_d
set.seed(1234)
edata_d_wm$X1[sample(1:n, 10)] <- NA
edata_d_wm$X2[sample(1:n, 5)] <- NA
edata_d_wm$X3[sample(1:n, 20)] <- NA
#example of sem error (with missing information, only discrete variables)
bn_edata_d_wm <- BNDataset(edata_d_wm, discreteness = rep(TRUE, 5),
variables = names(edata_d_wm),
node.sizes = rep(6,5))
net_sem_edata_d_wm <- learn.network(bn_edata_d_wm, algo = "sem")
Yes, one round of MMHC is how it should be working in this case; the issue now is that it continues with the rest of the method, and that's the part not working.
The new error seems quite serious, and I don't yet have any idea what causes it. I will investigate.
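Until such a check exists in the package, the fallback discussed above (plain MMHC when nothing is missing, SEM otherwise) could be approximated on the user side. This is only a rough sketch: `choose_algo` is a hypothetical helper, not part of bnstruct, and it assumes that `"mmhc"` and `"sem"` are the values to pass as `algo`.

```r
# Hypothetical helper (NOT part of bnstruct): pick the structure learning
# algorithm depending on whether the raw data frame has any missing values.
choose_algo <- function(df) {
  if (anyNA(df)) "sem" else "mmhc"
}

# Small illustrative data frames, one complete and one with a missing value.
complete_df <- data.frame(A = c(1, 2, 1, 2), B = c(2, 2, 1, 1))
missing_df <- complete_df
missing_df$A[2] <- NA

choose_algo(complete_df)  # "mmhc"
choose_algo(missing_df)   # "sem"

# Possible usage with the objects from the example in this thread
# (left as a comment because it needs bnstruct and the data above):
# net <- learn.network(bn_edata_d_wm, algo = choose_algo(edata_d_wm))
```

This only selects the algorithm name; it does not work around the error inside SEM itself.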
Thanks!
Should be fixed now, fingers crossed. Thanks for finding this.
I am having problems with the sem option for the learn.network() function. It produces a fatal error: and a lot (>= 50) of repetitions of this warning:
This happens both on data with and without missing information. I have provided a minimal example that produces the error below. Am I specifying something wrongly or is there a bug? Thanks in advance!