JenniNiku / gllvm

Generalized Linear Latent Variable Models
https://jenniniku.github.io/gllvm/

Fourth corner model: "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric" #27

Open wilkesma opened 3 years ago

wilkesma commented 3 years ago

Thank you for a useful package. It promises to offer the gains in efficiency we've been looking for.

I have successfully fitted a probit model without traits, but when I try the 4th corner model I get "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric"

My traits (body length in 3 bins) are binary and I have ensured that they are stored in a dataframe as numeric variables.

traits <- as.data.frame(apply(traits, 2, function(x) as.numeric(x)))
X.data <- data.frame(basin=as.character(meta$basin), time=as.numeric(meta$days))
mod <- gllvm(y=taxa, X=X.data, formula= ~basin*time, family=binomial(link = "probit")) #Works fine
mod.tr <- gllvm(y=taxa, X=X.data, TR=traits,formula= y ~ (basin*time) + (basin*time):(len.sma + len.med + len.lar), family=binomial(link = "probit")) #Produces error

Any advice on what might be causing the issue?
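
One check that does not need the package: base R's colMeans() raises this exact message when it is handed a data frame containing a non-numeric column, and X.data above stores basin as character. Whether gllvm reaches that code path internally is an assumption, not something confirmed in this thread. A minimal sketch, with X.demo as a hypothetical stand-in for X.data:

```r
# Hypothetical stand-in for X.data: one character column, one numeric column.
X.demo <- data.frame(basin = c("a", "b", "a", "b"),
                     time  = c(1, 2, 3, 4),
                     stringsAsFactors = FALSE)

# List the columns that are not stored as numeric.
non_numeric <- names(X.demo)[!vapply(X.demo, is.numeric, logical(1))]
print(non_numeric)

# colMeans() coerces the data frame to a character matrix and then fails;
# capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    colMeans(X.demo, na.rm = TRUE)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Restricting the call to the numeric columns succeeds.
numeric_cols <- setdiff(names(X.demo), non_numeric)
print(colMeans(X.demo[, numeric_cols, drop = FALSE], na.rm = TRUE))
```

Running the same vapply() check on the real X.data and traits objects would show quickly whether a character or factor column is present.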

BertvanderVeen commented 3 years ago

Great that you're trying to use the package! Would you mind providing a reproducible example, possibly with some simulated data? Without a reproducible example it's more difficult to find out what's going on.
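
A possible starting point for such an example is sketched below in base R. Everything in it is an assumption made for illustration: the object names (Y.sim, X.sim, TR.sim), the dimensions, and the effect sizes are hypothetical, and only the column names are borrowed from the original post. It simulates presence/absence data under a probit link, with the three body-length bins coded as 0/1 indicators.

```r
set.seed(1)
n <- 40  # sites (hypothetical)
m <- 8   # species (hypothetical)

# Site-level predictors, named after the columns in the original post.
X.sim <- data.frame(basin = factor(sample(c("A", "B"), n, replace = TRUE)),
                    time  = runif(n, min = 0, max = 100))

# Species-level traits: each species falls in exactly one body-length bin.
size_bin <- sample(1:3, m, replace = TRUE)
TR.sim <- data.frame(len.sma = as.numeric(size_bin == 1),
                     len.med = as.numeric(size_bin == 2),
                     len.lar = as.numeric(size_bin == 3))

# Linear predictor: species-specific intercepts plus species-specific time slopes.
intercepts <- matrix(rnorm(m), nrow = n, ncol = m, byrow = TRUE)
slopes     <- outer(as.numeric(scale(X.sim$time)), rnorm(m, sd = 0.5))
eta        <- intercepts + slopes

# Probit link: presence probability is pnorm(eta).
Y.sim <- matrix(rbinom(n * m, size = 1, prob = pnorm(eta)), nrow = n, ncol = m)
colnames(Y.sim) <- paste0("sp", seq_len(m))
rownames(TR.sim) <- colnames(Y.sim)

# Inspect the storage types before passing anything to a model.
str(X.sim)
str(TR.sim)

# The objects could then be passed to the call from the original post, e.g.
# gllvm(y = Y.sim, X = X.sim, TR = TR.sim, formula = ..., family = binomial(link = "probit"))
```

If the error does not appear with simulated data of this shape, changing one property at a time (character instead of factor columns, NAs in the responses, all-zero species) may help narrow down which feature of the real data triggers it.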

gerverska commented 5 months ago

Hi all,

I've actually seen this issue in my own binomial('probit') adventures, but I've had trouble recreating it with the spider data. I get the same error with the CRAN version, and on the dev version I get an elf_dynamic_array_reader.h(64) tag not found error that crashes R.

Interestingly, when fitting negative binomial models to count response data (allowing NAs in the responses), I would get the same Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric with the CRAN version. Switching to the dev version completely fixed the issue here, though.

I honestly think some of us might just need to share obscured versions of our datasets to really figure out what's going on, as it's not apparent to me what properties of my data could be simulated to reproduce the error.

BertvanderVeen commented 5 months ago

Does changing starting.val to "zero" solve your issues?

gerverska commented 5 months ago

Sorry for the delay, Bert! For more detail, this model in question doesn't have any latent variables--does starting.val still affect non-LV models?

BertvanderVeen commented 5 months ago

Yes.

gerverska commented 4 months ago

Hi Bert--here's my delayed update. In short, changing the starting values doesn't seem to have an effect. I've tried both 'zero' and 'random', and both still give elf_dynamic_array_reader.h(64) tag not found errors that lead R to abort.

Here's the model I tried:

test <- gllvm(Y,
                    X,
                    TR,
                    formula = ~ cat_6 + cat_2 + log(num),
                    family = binomial('probit'),
                    num.lv = 0,
                    gradient.check = T,
                    control = list(reltol = 1e-16),
                    control.start = list(starting.val = 'zero',
                                         n.init = 5))

Where cat_6 is a six-level categorical variable, cat_2 has two levels, and num is a numeric variable. TR is a spoofed matrix as suggested by @tanharri in #109 to get the community-wide response--perhaps something's breaking down on this end? The same model works fine when the spoofed TR is removed.

As mentioned previously, my negative binomial model can handle the spoofed matrix approach just fine (same formula, minus the numeric variable), but the dev version is needed for this.

BertvanderVeen commented 4 months ago

Note that using n.init together with starting.val = "zero" is pointless, since every run starts from the same values and so gives the same result on every fit.

Sorry, I cannot draw any conclusions from this. My suspicion is that the problem is highly dependent on the dataset in question. Even when I take the same approach with a different dataset that has a two-level categorical variable and a numeric variable, I still cannot reproduce the issue. So it will require a fully reproducible example, with simulations or a dataset, for me to look into this.
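
The n.init point above can be illustrated without gllvm. The sketch below uses base R's optim() on a toy objective; the function nll and the helper fit_from are hypothetical stand-ins, not anything from the package. It only shows the general idea that a deterministic optimizer restarted from an identical starting point returns an identical optimum, so repeated runs add nothing unless the starts differ.

```r
# Toy objective standing in for a negative log-likelihood (hypothetical).
nll <- function(par) sum((par - c(1, -2))^2) + sum(sin(3 * par))

# Helper: run one deterministic BFGS fit from a given starting point.
fit_from <- function(start) optim(start, nll, method = "BFGS")

# "Zero" starts: all five runs begin at the same point.
zero_fits <- replicate(5, fit_from(c(0, 0))$value)

# Random starts: each run begins at a different point.
set.seed(42)
random_fits <- replicate(5, fit_from(rnorm(2, sd = 3))$value)

print(zero_fits)    # five identical objective values
print(random_fits)  # may differ between runs if they reach different local optima
```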