ANOVA issues (jasp-stats/jasp-issues #725)

jdebeer2005 commented 4 years ago

JASP version: 0.12.2
OS name and version: Win 10
Analysis: Descriptives, Descriptives plots, Assumption Checks (Q-Q Plot), Post Hoc Tests (Standard)

This analysis terminated unexpectedly.

Error in ref_grid(object, ...): Can't handle an object of class 'NULL'
Use help('models', package = 'emmeans') for information on supported models.

Stack trace

withCallingHandlers(expr = analysis(jaspResults = jaspResults, dataset = dataset, options = options), error = .addStackTrace)

analysis(jaspResults = jaspResults, dataset = dataset, options = options)

Ancova(jaspResults, dataset = NULL, options)

.anovaPostHocTableCollection(anovaContainer, dataset, options, ready)

.anovaPostHocTable(postHocContainer, dataset, options, anovaContainer[['model']]$object)

emmeans::lsmeans(model, postHocVariablesListV)

.emwrap(emmeans, subst = 'ls', ...)

emmfcn(...)

ref_grid(object, ...)

stop(data)

To receive assistance with this problem, please report the message above at: https://jasp-stats.org/bug-reports

Bug description: Cannot run a two-factor ANOVA.

Steps to reproduce:

1. Run ANOVA.
2. Set the dependent variable to precooker loss and the independent variables to Precooker and Species.
3. See error.

The same analysis works on other files.

jdebeer2005@gmail.com

Eurofish - 2020 -Mar.zip

jdebeer2005 commented 4 years ago

I know the answer. I had 3 species of fish and 7 precookers, with about 10,000 lines of data for each data set. I ran a contingency table of precookers versus fish, and one cell had no data at all. I took some other data and modified it so that I had data for that cell, and the analysis worked. So JASP apparently cannot handle missing data in a computed (?) cell.
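The cross-tabulation check described in this comment can be sketched in a few lines. This is only an illustrative Python sketch, not JASP code: the function name `empty_cells`, the column names, and the toy records are all hypothetical and stand in for the real data file.

```python
from collections import Counter
from itertools import product


def empty_cells(rows, factor_a, factor_b):
    """Return every (level_a, level_b) combination with zero observations.

    `rows` is an iterable of dicts, one dict per observation.
    """
    rows = list(rows)
    counts = Counter((r[factor_a], r[factor_b]) for r in rows)
    levels_a = sorted({r[factor_a] for r in rows})
    levels_b = sorted({r[factor_b] for r in rows})
    # Counter returns 0 for combinations that never occur in the data.
    return [cell for cell in product(levels_a, levels_b) if counts[cell] == 0]


# Hypothetical toy data: 2 precookers x 2 species, one combination never observed.
data = [
    {"Precooker": "P1", "Species": "S1", "loss": 20.1},
    {"Precooker": "P1", "Species": "S2", "loss": 21.4},
    {"Precooker": "P2", "Species": "S1", "loss": 19.8},
    {"Precooker": "P2", "Species": "S1", "loss": 20.5},
]

missing = empty_cells(data, "Precooker", "Species")
print(missing)
```

Any combination returned by such a check is a design cell with no data, which is the situation that triggered the error here.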

Kucharssim commented 4 years ago

Hi @jdebeer2005,

thank you for your report and enclosed file that reproduces your issue.

Indeed, what you are dealing with is a 3x7 ANOVA with one of the 21 design cells having zero observations. I am afraid you cannot use this analysis for this data, as ANOVA cannot deal with such a design - this is not a bug in our software but a problem of not having unique solutions for the underlying calculations.
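To illustrate the "no unique solutions" point, here is a small sketch under stated assumptions: a hypothetical 2x2 design with treatment (dummy) coding in which one cell is empty. The helper names `matrix_rank` and `design_row` are made up for this example and are not part of JASP or emmeans. With an empty cell, the interaction column is all zeros, so the design matrix has fewer independent columns than the model has parameters.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a non-zero pivot at or below the current rank row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_row(a, b):
    """Treatment-coded row: intercept, A = a2, B = b2, A:B interaction."""
    return [1, a, b, a * b]


# Observed cells (levels coded 0/1); the (1, 1) cell has no observations.
observed = [(0, 0), (0, 0), (0, 1), (1, 0), (1, 0)]
X = [design_row(a, b) for a, b in observed]

n_params = len(X[0])
rank = matrix_rank(X)
print(n_params, rank)
```

When the rank is smaller than the number of parameters, infinitely many coefficient vectors fit the data equally well, so the interaction effect cannot be estimated.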

However, it is clear that JASP should report an informative error message that reveals this problem immediately, instead of making you waste time tracking down the issue through contingency tables.

@JohnnyDoorn I can add a more informative error message, but I am not sure where to catch this most efficiently. Let me know if you are willing to do it yourself, as it will probably take you less time to figure it out :)

jdebeer2005 commented 4 years ago

Thank you sir. Very helpful.

So the next question: will this data set work with a General Linear Model? That would essentially give me the same answer.

Thanks so much.

JDB

From: Simon Kucharsky notifications@github.com Sent: Monday, May 4, 2020 3:36 AM To: jasp-stats/jasp-issues jasp-issues@noreply.github.com Cc: jdebeer2005 jdebeer2005@gmail.com; Mention mention@noreply.github.com Subject: Re: [jasp-stats/jasp-issues] ANOVA issues (#725)


tomtomme commented 10 months ago

@Kucharssim & @JohnnyDoorn Loading that data file took 5 minutes... but I can still reproduce the uninformative error message with JASP 0.19 beta.

@jdebeer2005 Using the GLM (the linear regression module) worked fine in my short test. The precooker 2V variable with n=0 is then shown with p=1 as expected. Also adding the interaction between precooker and species works fine.

JohnnyDoorn commented 8 months ago

@tomtomme can you share your JASP file with the error? For me the original JASP file (also after reloading) works fine.

tomtomme commented 8 months ago

I used the original file and entered the variables as described in the first post. But now, with the newest 0.19 beta, there is an informative error message! So I consider this fixed :D

@jdebeer2005 as always, please reopen if not fixed on your side!

JohnnyDoorn commented 8 months ago

I see now - the error is too strict though, and the analysis just needs to be rerun with a different type of SS (sum of squares) because of 1 empty cell in the design. I'm looking into making it more lenient now.

JohnnyDoorn commented 8 months ago

FYI @jdebeer2005 - The issue here is an empty cell in the interaction (2V - BE), so removing the interaction effect, or adding an observation for that cell, is a workaround to the issue. (and sorry for the big delay, this probably comes way too late..)
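A rough sketch of why dropping the interaction can work: a full 3x7 factorial model needs all 21 cell means, while a main-effects-only model has far fewer parameters and, as far as I know, remains estimable as long as the two-way layout is connected (every level can be reached from every other through non-empty cells). The code below is an assumption-laden illustration, not JASP's logic; `is_connected` and the level labels `P1`..`P7` and `S1`..`S3` are hypothetical.

```python
from itertools import product


def is_connected(cells):
    """Check whether a two-way layout is connected.

    `cells` is an iterable of (row_level, col_level) pairs that contain at
    least one observation. Two levels are linked when they share such a cell.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in cells:
        root_a, root_b = find(("row", a)), find(("col", b))
        if root_a != root_b:
            parent[root_a] = root_b

    roots = {find(node) for node in list(parent)}
    return len(roots) == 1


# Hypothetical labels: 7 precookers x 3 species with a single empty cell.
precookers = [f"P{i}" for i in range(1, 8)]
species = ["S1", "S2", "S3"]
non_empty = [c for c in product(precookers, species) if c != ("P2", "S3")]

n_interaction_params = len(precookers) * len(species)
n_main_effect_params = 1 + (len(precookers) - 1) + (len(species) - 1)
connected = is_connected(non_empty)
print(len(non_empty), n_interaction_params, n_main_effect_params, connected)
```

In this toy layout only 20 of the 21 cells are observed, so the full interaction model is short by one cell, whereas the main-effects model needs just 9 parameters and the layout stays connected.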

jdebeer2005 commented 8 months ago

Thanks very much


On Feb 20, 2024, at 7:02 AM, Johnny van Doorn @.***> wrote:


tomtomme commented 6 months ago

Tested with the current 0.19 beta. Still an issue when the interaction term is not removed:

This analysis terminated unexpectedly.

Error in (function (object, at, cov.reduce = mean, cov.keep = get_emm_option('cov.keep'), : Can't handle an object of class  “NULL”
Use help('models', package = 'emmeans') for information on supported models.

Stack trace
analysis(jaspResults = jaspResults, dataset = dataset, options = options)

AncovaInternal(jaspResults, dataset = NULL, options)

.anovaPostHocTableCollection(anovaContainer, dataset, options, ready)

.anovaPostHocTable(postHocContainer, dataset, options, anovaContainer[['model']]$object)

emmeans::lsmeans(model, postHocVariablesListV)

.emwrap(emmeans, subst = 'ls', ...)

emmfcn(...)

do.call(ref_grid, args)

(function (object, at, cov.reduce = mean, cov.keep = get_emm_option('cov.keep'), mult.names, mult.levs, options = get_emm_option('ref_grid'), data, df, type, regrid, nesting, offset, sigma, counterfactuals, wt.counter, avg.counter = TRUE, nuisance = character(0), non.nuisance, wt.nuis = 'equal', rg.limit = get_emm_option('rg.limit'), ...)
{
    .foo = function(t, tr, tra, tran, transform = NULL, ...) transform
    .bar = .foo(...)
    if (!is.null(.bar)) {
        regrid = .bar
        message("In 'ref_grid()', use 'regrid = ...' rather than 'transform = ...' ", 'to avoid this message.')
    }
    if (!missing(df)) {
        if (is.null(options))
            options = list()
        options$df = df
    }
    if (missing(data)) {
        data = try(recover_data(object, data = NULL, ...))
        if (inherits(data, 'try-error'))
            stop("Perhaps a 'data' or 'params' argument is needed")
    }
    else if (is.null(options$delts))
        data = recover_data(object, data = as.data.frame(data), ...)
    if (is.character(data))
        stop(data)
    if (!is.null(options$just.data))
        return(data)
    trms = attr(data, 'terms')
    coerced = .find.coerced(trms, data)
    sort.unique = function(x) sort(unique(x))
    if (is.null(cov.keep))
        cov.keep = character(0)
    cov.thresh = max(c(1, suppressWarnings(as.integer(cov.keep))), na.rm = TRUE)
    if (is.logical(cov.reduce)) {
        if (!cov.reduce)
            cov.thresh = 99
        cov.reduce = mean
    }
    dep.x = list()
    fix.cr = function(cvr) {
        if (inherits(cvr, 'formula')) {
            if (length(cvr) < 3)
                stop("Formulas in 'cov.reduce' must be two-sided")
            lhs = .all.vars(cvr)[1]
            dep.x[[lhs]] <<- cvr
            cvr = mean
        }
        else if (!inherits(cvr, c('function', 'list')))
            stop("Invalid 'cov.reduce' argument")
        cvr
    }
    if (is.list(cov.reduce))
        cov.reduce = lapply(cov.reduce, fix.cr)
    else cov.reduce = fix.cr(cov.reduce)
    if (!missing(at))
        for (xnm in names(at)) dep.x[[xnm]] = NULL
    cr = function(x, nm) {
        if (is.function(cov.reduce))
            cov.reduce(x)
        else if (hasName(cov.reduce, nm))
            cov.reduce[[nm]](x)
        else mean(x)
    }
    ref.levels = matlevs = xlev = chrlev = list()
    for (nm in attr(data, 'responses')) {
        y = data[[nm]]
        if (is.matrix(y))
            matlevs[[nm]] = apply(y, 2, mean)
        else ref.levels[[nm]] = mean(y)
    }
    for (nm in attr(data, 'predictors')) {
        x = data[[nm]]
        if (is.matrix(x) && ncol(x) == 1)
            x = as.numeric(x)
        if (is.factor(x) && !(nm %in% coerced$covariates))
            xlev[[nm]] = levels(factor(x))
        else if (is.character(x))
            xlev[[nm]] = sort(unique(x))
        if (!(nm %in% coerced$factors) && !missing(at) && (hasName(at, nm)))
            ref.levels[[nm]] = at[[nm]]
        else if (is.factor(x) && !(nm %in% coerced$covariates))
            ref.levels[[nm]] = levels(factor(x))
        else if (is.character(x) || is.logical(x))
            ref.levels[[nm]] = chrlev[[nm]] = sort.unique(x)
        else if (is.matrix(x)) {
            matlevs[[nm]] = apply(x, 2, cr, nm)
            if (is.matrix(matlevs[[nm]]))
                matlevs[[nm]] = apply(matlevs[[nm]], 2, mean)
        }
        else {
            if (nm %in% coerced$factors)
                ref.levels[[nm]] = sort.unique(x)
            else {
                if ((length(uval <- sort.unique(x)) > cov.thresh) && !(nm %in% cov.keep))
                  ref.levels[[nm]] = cr(as.numeric(x), nm)
                else {
                  ref.levels[[nm]] = uval
                  cov.keep = c(cov.keep, nm)
                }
            }
        }
    }
    if (!missing(non.nuisance))
        nuisance = setdiff(names(ref.levels), non.nuisance)
    if (no.nuis <- (length(nuisance) == 0)) {
        if (!missing(counterfactuals)) {
            cfac = intersect(counterfactuals, names(ref.levels))
            ref.levels = ref.levels[cfac]
            ref.levels$.obs.no. = seq_len(nrow(data))
            .check.grid(ref.levels, rg.limit)
            grid = .setup.cf(ref.levels, data)
        }
        else {
            .check.grid(ref.levels, rg.limit)
            grid = do.call(expand.grid, ref.levels)
        }
    }
    else {
        nuis.info = .setup.nuis(nuisance, ref.levels, trms, rg.limit)
        grid = nuis.info$grid
    }
    if (!is.null(delts <- options$delts)) {
        var = options$var
        n.orig = nrow(grid)
        grid = grid[rep(seq_len(n.orig), length(delts)), , drop = FALSE]
        options$var = options$delts = NULL
    }
    for (nm in names(matlevs)) {
        tmp = matrix(rep(matlevs[[nm]], each = nrow(grid)), nrow = nrow(grid))
        dimnames(tmp) = list(NULL, names(matlevs[[nm]]))
        grid[[nm]] = tmp
    }
    for (xnm in names(dep.x)) {
        if ((xnm %in% c('ext', 'extern', 'external')) && !(xnm %in% names(grid))) {
            fun = get(as.character(dep.x[[xnm]][[3]]), inherits = TRUE)
            rslts = fun(grid)
            for (nm in intersect(names(rslts), names(grid))) {
                grid[[nm]] = rslts[[nm]]
                ref.levels[[nm]] = NULL
            }
        }
        else if (!all(.all.vars(dep.x[[xnm]]) %in% names(grid)))
            stop("Formulas in 'cov.reduce' must predict covariates actually in the model")
        else {
            xmod = lm(dep.x[[xnm]], data = data)
            grid[[xnm]] = predict(xmod, newdata = grid)
            ref.levels[[xnm]] = NULL
        }
    }
    if (!is.null(delts))
        grid[[var]] = grid[[var]] + rep(delts, each = n.orig)
    if (!is.null(attr(data, 'pass.it.on')))
        attr(object, 'data') = data
    xl = xlev
    modnm = rownames(attr(trms, 'factors'))
    chk = sapply(modnm, function(mn) mn %in% names(xl))
    for (i in which(!chk)) {
        fn = all.vars(reformulate(modnm[i]))
        if (length(fn) == 1)
            names(xl)[names(xl) == fn] = modnm[i]
    }
    basis = emm_basis(object, trms, xl, grid, misc = attr(data, 'misc'), options = options, ...)
    environment(basis$dffun) = baseenv()
    if (length(basis$bhat) != ncol(basis$X))
        stop('Something went wrong:\n', ' Non-conformable elements in reference grid.', call. = TRUE)
    collapse = NULL
    if (!missing(counterfactuals)) {
        grid = do.call(expand.grid, ref.levels)
        if (missing(regrid))
            regrid = 'response'
        if (avg.counter)
            collapse = '.obs.no.'
    }
    if (!no.nuis) {
        basis = .basis.nuis(basis, nuis.info, wt.nuis, ref.levels, data, grid, ref.levels)
        grid = basis$grid
        nuisance = ref.levels[nuis.info$nuis]
        ref.levels = basis$ref.levels
    }
    misc = basis$misc
    frm = try(formula(eval(attr(data, 'call')[[2]])), silent = TRUE)
    if (inherits(frm, 'formula')) {
        lhs = if (length(frm) == 2)
            NULL
        else frm[-3]
        tran = setdiff(.all.vars(lhs, functions = TRUE), c(.all.vars(lhs), '~', 'cbind', '+', '-', '*', '/', '^', '%%', '%/%'))
        if (length(tran) > 0) {
            if (tran[1] %in% c('scale', 'center', 'centre', 'standardize', 'standardise')) {
                pv = try(attr(terms(object), 'predvars'), silent = TRUE)
                if (!inherits(pv, 'try-error') && !is.null(pv)) {
                  scal = which(sapply(c(sapply(pv, as.character), 'foo'), function(x) x[1]) == tran[1])
                  if (length(scal) > 0) {
                    pv = pv[[scal[1]]]
                    ctr = ifelse(is.null(pv$center), 0, ifelse(pv$center, pv$center, 0))
                    scl = ifelse(is.null(pv$scale), 1, ifelse(pv$scale, pv$scale, 1))
                    tran = make.tran('scale', y = 0, center = ctr, scale = scl)
                  }
                }
                if (is.character(tran)) {
                  tran = NULL
                  message("NOTE: Unable to recover scale() parameters. See '? make.tran'")
                }
            }
            else if (tran[1] == 'linkfun')
                tran = as.list(environment(trms))[c('linkfun', 'linkinv', 'mu.eta', 'valideta', 'name')]
            else {
                if (tran[1] == 'I')
                  tran = 'identity'
                tran = paste(tran, collapse = '.')
                const.warn = 'There are unevaluated constants in the response formula\nAuto-detection of the response transformation may be incorrect'
                tst = strsplit(strsplit(as.character(lhs[2]), '\\(')[[1]][1], '\\*')[[1]]
                if (length(tst) > 1) {
                  mul = try(eval(parse(text = tst[1])), silent = TRUE)
                  if (!inherits(mul, 'try-error')) {
                    misc$tran.mult = mul
                    tran = gsub('\\*\\.', '', tran)
                  }
                  else warning(const.warn)
                }
                tst = strsplit(as.character(lhs[2]), '\\(|\\)|\\+')[[1]]
                if (length(tst) > 2) {
                  const = try(eval(parse(text = tst[3])), silent = TRUE)
                  if (!inherits(const, 'try-error') && (length(tst) == 3))
                    misc$tran.offset = const
                  else warning(const.warn)
                }
            }
            if (is.null(misc[['tran']]))
                misc$tran = tran
            else misc$tran2 = tran
            misc$inv.lbl = 'response'
        }
    }
    multresp = character(0)
    ylevs = misc$ylevs
    if (!is.null(ylevs)) {
        if (missing(mult.levs))
            mult.levs = ylevs
        if (!missing(mult.names)) {
            k = seq_len(min(length(ylevs), length(mult.names)))
            names(mult.levs)[k] = mult.names[k]
        }
        if (length(ylevs) > 1)
            ylevs = list(seq_len(prod(sapply(mult.levs, length))))
        k = prod(sapply(mult.levs, length))
        if (k != length(ylevs[[1]]))
            stop("supplied 'mult.levs' is of different length ", 'than that of multivariate response')
        for (nm in names(mult.levs)) ref.levels[[nm]] = mult.levs[[nm]]
        multresp = names(mult.levs)
        MF = do.call('expand.grid', mult.levs)
        grid = merge(grid, MF)
    }
    for (nm in names(matlevs)) grid[[nm]] = matrix(rep(matlevs[[nm]], each = nrow(grid)), nrow = nrow(grid))
    problems = if (!missing(at))
        intersect(c(multresp, coerced$factors), names(at))
    else character(0)
    if (length(problems) > 0) {
        incl.flags = rep(TRUE, nrow(grid))
        for (nm in problems) {
            if (is.numeric(ref.levels[[nm]])) {
                dig = 3 - log10(max(abs(ref.levels[[nm]])))
                at[[nm]] = round(at[[nm]], digits = dig)
                ref.levels[[nm]] = round(ref.levels[[nm]], digits = dig)
                grid[[nm]] = round(grid[[nm]], digits = dig)
            }
            at[[nm]] = ref.levels[[nm]] = at[[nm]][at[[nm]] %in% ref.levels[[nm]]]
            rows = numeric(0)
            for (x in at[[nm]]) rows = c(rows, which(grid[[nm]] == x))
            grid = grid[rows, , drop = FALSE]
            grid[[nm]] = factor(grid[[nm]], levels = at[[nm]])
            basis$X = basis$X[rows, , drop = FALSE]
        }
    }
    om = ifelse(is.null(misc$offset.mult), 1, misc$offset.mult)
    oval = 0
    if (!missing(offset)) {
        if (offset[1] != 0)
            oval = offset[1]
    }
    else {
        if ('.static.offset.' %in% names(grid)) {
            oval = om * grid[['.static.offset.']]
        }
        if (!is.null(attr(trms, 'offset'))) {
            if (any(om != 0))
                oval = om * (oval + .get.offset(trms, grid))
        }
        if (any(oval != 0))
            grid[['.offset.']] = oval
    }
    if (!hasName(data, '(weights)'))
        data[['(weights)']] = 1
    cov.keep = intersect(unique(cov.keep), names(ref.levels))
    nms = union(union(union(names(xlev), names(chrlev)), coerced$factors), cov.keep)
    nms = intersect(nms, names(grid))
    if (length(nms) == 0)
        wgt = rep(1, nrow(grid))
    else {
        id = .my.id(data[, nms, drop = FALSE])
        uid = !duplicated(id)
        key = do.call(paste, unname(data[uid, nms, drop = FALSE]))
        key = key[order(id[uid])]
        tgt = do.call(paste, unname(grid[, nms, drop = FALSE]))
        wgt = rep(0, nrow(grid))
        for (i in seq_along(key)) wgt[tgt == key[i]] = sum(data[['(weights)']][id == i])
    }
    grid[['.wgt.']] = wgt
    model.info = list(call = attr(data, 'call'), terms = trms, xlev = xlev)
    if (!is.null(mm <- basis$model.matrix)) {
        attr(mm, 'factors') = .smpFT(trms)
        model.info$model.matrix = mm
    }
    nst = .find_nests(grid, trms, coerced$orig, ref.levels)
    if (length(nst) > 0)
        model.info$nesting = nst
    misc$is.new.rg = TRUE
    misc$ylevs = NULL
    misc$estName = 'prediction'
    misc$estType = 'prediction'
    misc$infer = c(FALSE, FALSE)
    misc$level = 0.95
    misc$adjust = 'none'
    misc$famSize = nrow(grid)
    if (is.null(misc$avgd.over))
        misc$avgd.over = character(0)
    if (is.null(misc$sigma) && missing(sigma)) {
        sigma = suppressWarnings(try(stats::sigma(object), silent = TRUE))
        if (inherits(sigma, 'try-error'))
            sigma = NULL
        misc$sigma = sigma
    }
    if (is.null(misc$sigma) || (length(misc$sigma) == 0) || !is.na(misc$sigma[1]))
        misc$sigma = sigma
    post.beta = basis$post.beta
    if (is.null(post.beta))
        post.beta = matrix(NA)
    predictors = intersect(attr(data, 'predictors'), names(grid))
    if (!missing(counterfactuals))
        predictors = c(predictors, '.obs.no.')
    simp.tbl = environment(trms)$.simplify.names.
    if (!is.null(simp.tbl)) {
        names(grid) = .simplify.names(names(grid), simp.tbl)
        predictors = .simplify.names(predictors, simp.tbl)
        names(ref.levels) = .simplify.names(names(ref.levels), simp.tbl)
        if (!is.null(post.beta))
            names(post.beta) = .simplify.names(names(post.beta), simp.tbl)
        if (!is.null(model.info$nesting)) {
            model.info$nesting = lapply(model.info$nesting, .simplify.names, simp.tbl)
            names(model.info$nesting) = .simplify.names(names(model.info$nesting), simp.tbl)
        }
        environment(trms)$.simplify.names. = NULL
    }
    result = new('emmGrid', model.info = model.info, roles = list(predictors = predictors, responses = attr(data, 'responses'), multresp = multresp, nuisance = nuisance), grid = grid, levels = ref.levels, matlevs = matlevs, linfct = basis$X, bhat = basis$bhat, nbasis = basis$nbasis, V = basis$V, dffun = basis$dffun, dfargs = basis$dfargs, misc = misc, post.beta = post.beta)
    if (!missing(type)) {
        if (is.null(options))
            options = list()
        options$predict.type = type
    }
    if (!missing(nesting)) {
        result@model.info$nesting = lst = .parse_nest(nesting)
        if (!is.null(lst)) {
            nms = union(names(lst), unlist(lst))
            if (!all(nms %in% names(result@grid)))
                stop("Nonexistent variables specified in 'nesting'")
            result@misc$display = .find.nonempty.nests(result, nms)
        }
    }
    else if (!is.null(nst <- result@model.info$nesting)) {
        result@misc$display = .find.nonempty.nests(result)
        if (get_emm_option('msg.nesting'))
            message('NOTE: A nesting structure was detected in the ', 'fitted model:\n    ', .fmt.nest(nst))
    }
    result = .update.options(result, options, ...)
    if (!is.null(hook <- misc$postGridHook)) {
        if (is.character(hook))
            hook = get(hook)
        result@misc$postGridHook = NULL
        result = hook(result, ...)
    }
    if (!missing(regrid)) {
        if (missing(wt.counter))
            wt.counter = 1
        result = regrid(result, transform = regrid, sigma = sigma, .collapse = collapse, wt.counter = wt.counter, ...)
        if (!is.null(collapse))
            result@misc$avgd.over = collapse
    }
    .save.ref_grid(result)
    result
})(object = NULL, wt.nuis = 'equal')

stop(data)

To receive assistance with this problem, please report the message above at: https://jasp-stats.org/bug-reports

jasp-stats / jasp-issues

ANOVA issues #725
