mpt-network / MPTmultiverse

An R package for the comparison of MPT analysis approaches
Should we allow non-integer data? #11

Open singmann opened 6 years ago

singmann commented 6 years ago

Currently fit_mpt contains the following check (lines 74-82):

  # Check whether frequencies are integer ----
  not_integer <- unlist(lapply(X = data[, freq_cols], FUN = function(x) {
    any(as.integer(x) != x)
  }))

  if (any(not_integer)) {
    stop("Variable \"", paste(freq_cols[not_integer], collapse = ", "), "\" contains non-integer values.")
  }

I know that data should usually be integers, but at least MPTinR can work with non-integer values as well. For example, such values can occur if one wants to fit predicted probabilities or when using some correction for zero-cells. As long as TreeBUGS (@danheck) does not have a problem with non-integer data, I suggest downgrading the stop to a warning.
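A minimal sketch of what the downgraded check could look like, assuming the same kind of data and freq_cols objects as in the snippet above. The helper name check_integer_freqs is hypothetical and not part of the package. The sketch compares against round(x) instead of as.integer(x), since as.integer() returns NA for values beyond the integer range, and ignoring NAs via na.rm = TRUE is an assumption of this sketch.

```r
# Hypothetical helper (not part of MPTmultiverse): warn instead of stop
# when frequency columns contain non-integer values.
check_integer_freqs <- function(data, freq_cols) {
  # One logical per frequency column: TRUE if any value is not a whole number.
  not_integer <- vapply(
    data[, freq_cols, drop = FALSE],
    FUN = function(x) any(x != round(x), na.rm = TRUE),
    FUN.VALUE = logical(1)
  )
  if (any(not_integer)) {
    warning(
      "Variable \"", paste(freq_cols[not_integer], collapse = ", "),
      "\" contains non-integer values.",
      call. = FALSE
    )
  }
  invisible(not_integer)
}

# Toy data: column "b" contains a non-integer value
# (e.g., after a correction for zero-cells).
d <- data.frame(a = c(10, 5), b = c(3.5, 7))
res <- check_integer_freqs(d, c("a", "b"))  # issues a warning, does not stop
```

With this version, fitting would continue after the warning, so any method that cannot handle non-integer data would still fail later on its own.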

danheck commented 6 years ago

In TreeBUGS, the C++ samplers (simpleMPT, betaMPTcpp) use custom Gibbs samplers with conjugate posteriors and thus allow for non-integer data.

However, traitMPT and betaMPT require calling JAGS, which complains when one supplies non-integer data: "Failed check for discrete-valued parameters in distribution dmulti"

Since these warnings and errors are produced in a very low-level part of the code, I would prefer to stick with the present check.

singmann commented 6 years ago

Hmm, I mean we could still make this check conditional on using either of those methods, but I think this is not super urgent. Nevertheless, we can keep the issue open.
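A rough sketch of such a conditional check, under stated assumptions: the function name check_freqs_for_methods, the method argument, and the labels in jags_methods are all hypothetical. The labels simply reuse the TreeBUGS function names mentioned above and may not match the method names the package actually uses.

```r
# Hypothetical sketch (not part of MPTmultiverse): stop only if a JAGS-based
# method is requested, otherwise just warn.
# Assumption: `method` is a character vector of labels; the labels below reuse
# the TreeBUGS function names from this thread purely for illustration.
jags_methods <- c("traitMPT", "betaMPT")

check_freqs_for_methods <- function(data, freq_cols, method) {
  # One logical per frequency column: TRUE if any value is not a whole number.
  not_integer <- vapply(
    data[, freq_cols, drop = FALSE],
    FUN = function(x) any(x != round(x), na.rm = TRUE),
    FUN.VALUE = logical(1)
  )
  if (!any(not_integer)) {
    return(invisible(TRUE))
  }
  msg <- paste0(
    "Variable \"", paste(freq_cols[not_integer], collapse = ", "),
    "\" contains non-integer values."
  )
  if (any(method %in% jags_methods)) {
    # JAGS rejects non-integer data for dmulti, so fail early here.
    stop(msg, " JAGS-based methods require integer frequencies.", call. = FALSE)
  }
  warning(msg, call. = FALSE)
  invisible(FALSE)
}

d <- data.frame(a = c(10, 5), b = c(3.5, 7))
ok <- check_freqs_for_methods(d, c("a", "b"), method = "simpleMPT")  # warning only
# check_freqs_for_methods(d, c("a", "b"), method = "traitMPT")  # would stop with an error
```

The early stop keeps the error message readable for the JAGS-based methods, while the methods that tolerate non-integer data only get a warning.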