modal-inria / MixtComp

Model-based clustering package for mixed data

Error with only intervals for gaussian data #3

Open Quentin62 opened 3 years ago

Quentin62 commented 3 years ago

data.zip

library(RMixtComp)

dat = readRDS("bugdata.rds")
algo <- createAlgo(nInitPerClass = 1000)
model = list(molybdène = "Gaussian")

res <- mixtCompLearn(dat, model, algo, nClass = 1:3, nRun = 2, criterion = "ICL")

This generates an R error:

 Error in res[[indMax]] : 
  attempt to select less than one element in get1index 
3. rmcMultiRun(algo, dataList, model, list(), nRun, nCore, verbose) at MIXTCOMP_mixtCompLearn.R#396
2. classicLearn(data, model, algo, nClass, criterion, nRun, nCore, 
       verbose, mode) at MIXTCOMP_mixtCompLearn.R#283
1. mixtCompLearn(datMC[datMC$Racine == "ELE100", ], model, algo, 
       nClass = 1:10, nRun = 10, criterion = "ICL")

The error comes from this code in RMixtCompIO:

  logLikelihood <- sapply(res, function(x) {ifelse(is.null(x$warnLog), x$mixture$lnObservedLikelihood, -Inf)})

  indMax <- which.max(logLikelihood)

  return(res[[indMax]])

If every run has a non-NULL warnLog, logLikelihood is a vector of -Inf, and that alone should not make which.max fail (it returns 1). So the failure must come from somewhere else: indMax ends up empty.
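A minimal base-R check of that reasoning, independent of MixtComp. The vectors and the `res` list are hypothetical stand-ins, not real package output. It shows that `which.max` copes with an all -Inf vector but returns a zero-length index when every entry is NaN, and that indexing a list with that empty index using `[[` raises an error:

```r
# Hypothetical stand-ins for the per-run log-likelihoods (not MixtComp output).
all_minus_inf <- c(-Inf, -Inf)
all_nan <- c(NaN, NaN)

idx_inf <- which.max(all_minus_inf)  # index of the first maximum
idx_nan <- which.max(all_nan)        # NA/NaN entries are discarded first

# Hypothetical list of run results.
res <- list("run1", "run2")
picked <- res[[idx_inf]]             # works: a valid index is available

# A zero-length index cannot select an element, so `[[` throws.
err <- tryCatch(res[[idx_nan]], error = function(e) conditionMessage(e))
print(idx_nan)
print(err)
```

So an all -Inf vector is harmless here, while a vector made only of NaN values would be enough to trigger the reported error.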

Quentin62 commented 3 years ago

The error occurs when nClass = 1. The parameters are estimated, but the output criteria are:

$lnObservedLikelihood
[1] NaN
$lnCompletedLikelihood
[1] -Inf
$BIC
[1] NaN
$ICL
[1] -Inf

The problem comes from individuals 43 and 102, whose observed log-likelihood is -Inf.

The parameters are:

k: 1, mean 0.05851658 
k: 1, sd   0.32023988

and the individuals 43 and 102 are "[2.99:3.01]" "[3.99:4.01]"

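Both intervals lie far in the upper tail of this Gaussian (more than 9 standard deviations above the mean), so the computed probability is 0 and the log-likelihood is -Inf.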
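The underflow can be reproduced in base R with the parameters and intervals quoted above. This is only a sketch: it assumes the interval probability is obtained as a difference of lower-tail CDF values, which may not be how MixtComp computes it internally. For comparison, the same probability taken from the upper tails is tiny but not zero.

```r
# Estimated parameters and the two intervals reported in this issue.
mu <- 0.05851658
sigma <- 0.32023988
lower <- c(2.99, 3.99)
upper <- c(3.01, 4.01)

# How many standard deviations above the mean each interval starts.
z_lower <- (lower - mu) / sigma

# Assumed computation: difference of lower-tail CDFs.
# Both CDF values round to exactly 1 in double precision.
p_naive <- pnorm(upper, mu, sigma) - pnorm(lower, mu, sigma)
loglik_naive <- log(p_naive)

# Same probability from the upper tails, which keeps the small values.
p_tail <- pnorm(lower, mu, sigma, lower.tail = FALSE) -
  pnorm(upper, mu, sigma, lower.tail = FALSE)

print(z_lower)
print(p_naive)       # 0 0
print(loglik_naive)  # -Inf -Inf
print(p_tail)
```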

Quentin62 commented 3 years ago

Idea: add an epsilon in the probability computation function for some models (the ones that can't have a true 0: Gaussian, Weibull...).
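A rough sketch of what such an epsilon could look like for Gaussian intervals, written in plain R. `interval_log_prob` and its `eps` default are hypothetical and not part of MixtComp; the actual fix would belong in the package's own probability code.

```r
# Hypothetical helper (not MixtComp API): log-probability of an interval
# under a Gaussian, floored at eps so the log never reaches -Inf.
interval_log_prob <- function(lower, upper, mean, sd,
                              eps = .Machine$double.xmin) {
  p <- pnorm(upper, mean, sd) - pnorm(lower, mean, sd)
  log(pmax(p, eps))  # pmax applies the floor element-wise
}

# The two problematic individuals from this issue.
ll <- interval_log_prob(c(2.99, 3.99), c(3.01, 4.01),
                        mean = 0.05851658, sd = 0.32023988)
total <- sum(ll)

# An ordinary interval is unaffected by the floor.
ll_central <- interval_log_prob(-1, 1, mean = 0, sd = 1)

print(ll)
print(total)
print(exp(ll_central))
```

With a floor like this the observed log-likelihood stays finite, at the price of a slightly biased value for extreme observations. The choice of `eps` is an assumption and would need to be validated for each model.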