Closed eepackard closed 3 years ago
Hi @eepackard -- welcome to breakaway and GitHub! Let's start by figuring out what's going wrong with running betta
. Could you please show me the output of the following?
table(betta.df.min$Severity)
table(betta.df.min$Sample_Date)
It seems that your model is overparametrized - I think you have as many datapoints as covariates in your model. This isn't a issue with betta
, but we can do our best to help troubleshoot. eg., You may have meant to have date/severity be a continuous variable, and they may be categorical with a different category for each observation. I recommend that you write down the model that you want to fit, and making sure that the model that you want to fit aligns with the one that you are actually giving to betta
Thank you for such a quick response!
Here are the following outputs:
> table(betta.df.min$Severity)
lp-severe med-severe medium severe surface unburnt
8 12 12 16 8 16
> table(betta.df.min$Sample_Date)
june may oct sept
20 20 16 16
I am trying to fit a model with estimated richness as the response variable and severity and sample date as fixed factors with an interaction. Maybe I am not correctly inputing my data for what I am intending to achieve?
Interesting -- I'm surprised that you're getting an error. Does this work?
bm <- breakaway::betta(chats = betta.df.min$Estimate, ses = betta.df.min$Error, X = model.matrix(~Severity - 1, data = betta.df.min))
and similarly for X = model.matrix(~Severity*Sample_Date - 1, data = betta.df.min)
?
I tried that first line of code you suggested after reading the discussion from #85 but still got a similar results.
> bm <- breakaway::betta(chats = betta.df.min$Estimate,
+ ses = betta.df.min$Error,
+ X = model.matrix(~Severity-1, data = betta.df.min))
Error in solve.default(R) :
Lapack routine dgesv: system is exactly singular: U[24,24] = 0
Likewise when I try with X = model.matrix(~Severity*Sample_Date - 1, data = betta.df.min)
. It just has a different "reciprocal condition number" in the error message
> bm <- breakaway::betta(chats = betta.df.min$Estimate,
+ ses = betta.df.min$Error,
+ X = model.matrix(~Severity*Sample_Date-1, data = betta.df.min))
Error in solve.default(t(X_effective) %*% X_effective) :
system is computationally singular: reciprocal condition number = 1.23894e-19
I had an epiphany -- this should work, I think
my_x <- model.matrix(~Severity, data = betta.df.min)[, -1]
bm <- breakaway::betta(chats = betta.df.min$Estimate, ses = betta.df.min$Error, X = my_x)
betta
will add an intercept by default, so you can drop it from your covariate matrix.
Sorry for the trouble and fingers crossed this fixes it!
Sadly, still not working
my.x<-model.matrix(~Severity*Sample_Date,data = betta.df.min)[,-1]
> bm <- breakaway::betta(chats = betta.df.min$Estimate,
+ ses = betta.df.min$Error,
+ X = my.x)
Error in solve.default(t(X_effective) %*% X_effective) :
system is computationally singular: reciprocal condition number = 1.64234e-19
my.x<-model.matrix(~Severity,data = betta.df.min)[,-1]
> bm <- breakaway::betta(chats = betta.df.min$Estimate,
+ ses = betta.df.min$Error,
+ X = my.x)
Error in solve.default(R) :
Lapack routine dgesv: system is exactly singular: U[24,24] = 0
This is what the matrix looks like for my.x<-model.matrix(~Severity,data = betta.df.min)[,-1]
. So I suppose it is not the intercept causing the issue?
> head(my.x)
Severitymed-severe Severitymedium Severitysevere Severitysurface Severityunburnt
M10E 0 0 1 0 0
M10G 0 0 1 0 0
M11E 0 0 1 0 0
M11G 0 0 1 0 0
M12E 1 0 0 0 0
M12G 1 0 0 0 0
and also a sample of what my data frame betta.df.min
looks like:
> head(betta.df.min)
Estimate Error Severity Sample_Date
M10E 64.46422 0.2725754 severe may
M10G 60.34763 0.2255343 severe june
M11E 63.82257 0.6754035 severe may
M11G 24.23992 0.4901958 severe june
M12E 81.63864 0.3182650 med-severe may
M12G 108.06674 73.0194010 med-severe june
> str(betta.df.min)
'data.frame': 72 obs. of 4 variables:
$ Estimate : num 64.5 60.3 63.8 24.2 81.6 ...
$ Error : num 0.273 0.226 0.675 0.49 0.318 ...
$ Severity : chr "severe" "severe" "severe" "severe" ...
$ Sample_Date: chr "may" "june" "may" "june" ...
Thank you for patience and help!
This is really strange - most likely your design matrix is not full rank but there's nothing suggesting this as the issue from the other diagnostics. If you could email me the file genereated by saveRDS(betta.df.min, "breakaway-issue-108.RDS")
I will play around with it on my machine and get back to you.
My attempt at recreating the issue wasn't successful (below), so unfortunately I don't think I can troubleshoot further without having the data myself 😞
n <- 20*2+16*2
est <- rnorm(n, 100)
sd <- rnorm(n, 10)
Severity <- rep(c("lp-severe", "med-severe", "medium", "severe", "surface", "unburnt"), n/6)
date <- rep(c("may", "june", "oct", "sept"), each = n/4)
my.x <- model.matrix(~Severity*date)
breakaway::betta(chats = est,
ses = sd,
X = my.x)
Fantastic, thanks for emailing me the needed data, @eepackard. Indeed I can reproduce the error.
issue <- readRDS("[emailed-file]")
library(breakaway)
library(magrittr)
library(tidyverse)
issue %>%
group_by(Severity, Sample_Date) %>%
summarise(n = n()) %>%
arrange(desc(n)) %>%
print(n=Inf)
We have 6 different severities (issue$Severity %>% unique %>% length
) and 4 different sample dates,
so 6 x 4 = 24 levels in a full interactive model. However, we only have 22 of these levels represented
(looks like no lp-severe
's in October or September). As a result, there's no data for betta
to estimate for these combination of levels, which is specified in a fully crossed model.
So, some of your options include fitting an additive model (no interaction between Severity and Sample Date), or fitting a reduced interactive model -- basically just estimating parameters for all combinations of levels that exist in the dataset (since there's at least 2 observations per level for all levels that are present).
If the purpose is to investigate the effect of severity (and sample date is thought of as a blocking variable such as a batch effect), I would probably recommend the additive model (though thorough consulting is unfortunately beyond the scope of this forum).
I'm happy to advise on how to do the latter if that's what you want -- let me know :)
Hi @adw96,
Thank you for looking into this issue further. I understand why that wouldn't work for the crossed model.
I still had issues running betta with an additive model. I get the same error message when using an additive model or even just running one variable separately.
my.x<-model.matrix(~Severity+Sample_Date,data = betta.df.min)
head(my.x)
(Intercept) SeverityLow SeverityMedium SeverityMedium-Severe SeveritySevere SeverityLP-Severe
M10E 1 0 0 0 1 0
M10G 1 0 0 0 1 0
M11E 1 0 0 0 1 0
M11G 1 0 0 0 1 0
M12E 1 0 0 1 0 0
M12G 1 0 0 1 0 0
Sample_DateOctober Sample_DateMay Sample_DateJune
M10E 0 1 0
M10G 0 0 1
M11E 0 1 0
M11G 0 0 1
M12E 0 1 0
M12G 0 0 1
However, I think the later might be a better option for my question anyways. Could you please advise me on how to do the later? I would be very interested to know.
Thank you again.
Erica :)
Ah, I see that you will also have to remove a sample that had zero error in it. Here's a fix for the additive model:
new_x <- issue %>%
select(Severity, Sample_Date) %>%
tibble %>%
mutate(across(everything(), as.character)) %>%
model.matrix(~ Severity + Sample_Date, data = .)
zero_error <- which(issue$Error == 0)
betta(chats=issue$Estimate[-zero_error], ses = issue$Error[-zero_error],
X = new_x[-zero_error, ]) %$%
table
and for the interactive model with missing levels
drop_missing_levels <- issue %>%
select(Severity, Sample_Date) %>%
mutate(combos = paste(Severity, Sample_Date, sep="--")) %>%
select(combos) %>%
model.matrix(~combos - 1, data=.)
colnames(drop_missing_levels) <- str_remove(colnames(drop_missing_levels), "combos")
betta(chats = issue$Estimate[-zero_error],
ses = issue$Error[-zero_error],
X = drop_missing_levels[-zero_error, ]) %$%
table
Note to self: We should have a warning for a zero error or non-full-rank design matrix!
Hello,
I am receiving an error when testing my hypothesis with betta(). A similar issue was discussed in #85 but I wasn't clear about how it was resolved.
Here is what I was trying to test:
bm <- breakaway::betta(chats = betta.df.min$Estimate, ses = betta.df.min$Error, X = model.matrix(~Severity*Sample_Date, data = betta.df.min)) Error in solve.default(t(X_effective) %*% X_effective) : system is computationally singular: reciprocal condition number = 9.07297e-20
My model.matrix is quite large as Severity is a factor with 6 levels and Sample_Date is a factor with 4 levels. Is it just that my model is overparameterized? I tried to run with only Severity but it provides a different error instead:bm <- breakaway::betta(chats = betta.df.min$Estimate, ses = betta.df.min$Error, X = model.matrix(~Severity, data = betta.df.min)) Error in solve.default(R) : Lapack routine dgesv: system is exactly singular: U[24,24] = 0
Will the function still work if the assumption of normality is violated? And if normality is violated is there an appropriate way to transform the estimated richness and its uncertainty?
Any help would be appreciated. This is my first time utilizing GitHub to discuss a coding issue. :) Please let me know if there is additional information needed for troubleshooting.
Session info