Open PharmCat opened 2 years ago
Is it possible to make:
    struct ContrastsMatrix{C <: AbstractContrasts, T, U, M}
        matrix::M
        termnames::Vector{U}
        levels::Vector{T}
        contrasts::C
        invindex::Dict{T,Int}
        function ContrastsMatrix(matrix::M,
                                 termnames::Vector{U},
                                 levels::Vector{T},
                                 contrasts::C) where {U,T,C <: AbstractContrasts} where M <: AbstractMatrix
            allunique(levels) || throw(ArgumentError("levels must be all unique, got $(levels)"))
            invindex = Dict{T,Int}(x=>i for (i,x) in enumerate(levels))
            new{C,T,U,M}(matrix, termnames, levels, contrasts, invindex)
        end
    end
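As a minimal, self-contained sketch of what this parameterization would allow: the code below is not StatsModels code. `AbstractContrastsSketch`, `DummySketch`, and `ContrastsMatrixSketch` are hypothetical stand-ins so that the example runs with only the standard library. Under that assumption, the matrix field can hold a `BitMatrix` or a sparse matrix without any conversion to `Matrix{Float64}`:

```julia
using SparseArrays  # standard library

# Hypothetical stand-ins for StatsModels.AbstractContrasts and a coding type.
abstract type AbstractContrastsSketch end
struct DummySketch <: AbstractContrastsSketch end

# Same layout as the proposal above, with the matrix type M as a type parameter.
struct ContrastsMatrixSketch{C<:AbstractContrastsSketch,T,U,M<:AbstractMatrix}
    matrix::M
    termnames::Vector{U}
    levels::Vector{T}
    contrasts::C
    invindex::Dict{T,Int}
    function ContrastsMatrixSketch(matrix::M, termnames::Vector{U}, levels::Vector{T},
                                   contrasts::C) where {M<:AbstractMatrix,U,T,C<:AbstractContrastsSketch}
        allunique(levels) || throw(ArgumentError("levels must be all unique, got $(levels)"))
        invindex = Dict{T,Int}(x => i for (i, x) in enumerate(levels))
        return new{C,T,U,M}(matrix, termnames, levels, contrasts, invindex)
    end
end

levels = ["a", "b", "c"]

# Dummy coding with "a" as the reference level:
# rows are levels, columns are the non-reference levels.
dense_bits = falses(3, 2)          # BitMatrix
dense_bits[2, 1] = true
dense_bits[3, 2] = true
sparse_bools = sparse([2, 3], [1, 2], [true, true], 3, 2)  # SparseMatrixCSC{Bool,Int}

cm_bits = ContrastsMatrixSketch(dense_bits, levels[2:end], levels, DummySketch())
cm_sparse = ContrastsMatrixSketch(sparse_bools, levels[2:end], levels, DummySketch())
```

Both objects keep the storage type they were given, so any downstream code that only indexes into `matrix` would work unchanged; code that assumes `Float64` entries would need to be checked.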
@PharmCat how many contrast levels do you have? If this is for the grouping variable in MixedModels.jl, then there is the Grouping() pseudo-contrast, which avoids creating an actual matrix.
@palday
Hello! It can be more than 10^5. Actually, I'm working on Metida.jl, which helps me with some tasks where MixedModels.jl can't be used. I know that this problem is solved in MixedModels, and Metida has a workaround too. I have seen Grouping in MixedModels.jl; maybe the Grouping code should be moved to StatsModels.jl and documented there (perhaps together with some other code from MixedModels, such as the use of "/" in terms). Also, I don't know why the matrix field of ContrastsMatrix is fixed as Matrix{Float64} and why it can't be more flexible.
Also, I can't find any roadmap for StatsModels. I think StatsModels is a core package of the JuliaStats ecosystem, but there is no information about its development plan toward version 1.0.
The nesting syntax / is implemented in RegressionFormulae.jl.
The implementation of Grouping() is quite simple: https://github.com/JuliaStats/MixedModels.jl/blob/621f88b1f594ea0827d9ac7e8628113dd2121bef/src/grouping.jl#L2-L34
Depending on the exact structure of your model, you might be able to skip using the full formula infrastructure and instead call a custom modelcols method directly -- this is how random effects and associated sparse matrices are constructed in MixedModels.
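To illustrate the idea of bypassing a contrast matrix altogether, here is a minimal sketch that builds a sparse indicator matrix straight from a grouping vector. It uses only the standard library; `sparse_indicator` is a hypothetical helper, not a function from StatsModels or MixedModels, and it does not reproduce how MixedModels actually constructs its random-effects matrices.

```julia
using SparseArrays  # standard library

# Hypothetical helper: build an n_obs × n_levels sparse indicator matrix
# directly from a grouping vector, without a levels × levels contrast matrix.
function sparse_indicator(group::AbstractVector)
    lvls = unique(group)                              # levels in order of first appearance
    index = Dict(l => i for (i, l) in enumerate(lvls))
    rows = collect(1:length(group))                   # one nonzero per observation
    cols = [index[g] for g in group]                  # column = level index of that observation
    Z = sparse(rows, cols, fill(true, length(group)), length(group), length(lvls))
    return Z, lvls
end

group = ["s1", "s2", "s1", "s3", "s2"]
Z, lvls = sparse_indicator(group)
```

The storage cost here grows with the number of observations rather than with the square of the number of levels, which is the property that matters when there are 10^5 or more levels.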
> The implementation of Grouping() is quite simple:
Yep, but this means that I would have to copy this code or include MixedModels as a dependency. Maybe this functionality could be placed in StatsModels?
There's nothing wrong with copying this code, but maybe @kleinschmidt has thoughts on whether it makes more general sense to include this in StatsModels?
Why is the matrix field of struct ContrastsMatrix a Matrix{Float64}? In many cases with DummyCoding() or FullDummyCoding() it could be a BitMatrix or a SparseMatrixCSC{Bool, Int64}. For big datasets I try to do something like this:
But I run out of memory because ContrastsMatrix tries to convert this to Matrix{Float64}.
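A rough back-of-the-envelope check shows why the dense conversion fails at this scale. The sketch below assumes a square levels × levels matrix (as a full dummy coding would produce) and uses the figure of 10^5 levels mentioned in this thread; the variable names are illustrative only.

```julia
using SparseArrays  # standard library

k = 10^5  # number of levels mentioned in the thread

# Storage for a k × k matrix, counting entries only:
dense_float_bytes = 8 * k^2   # Matrix{Float64}: 8 bytes per entry, about 80 GB
bit_bytes = k^2 ÷ 8           # BitMatrix: 1 bit per entry, about 1.25 GB

# An identity-like (one nonzero per level) coding stored sparsely
# is small enough to actually construct.
S = sparse(1:k, 1:k, fill(true, k), k, k)
sparse_bytes = Base.summarysize(S)

println("Matrix{Float64}: ", dense_float_bytes, " bytes")
println("BitMatrix:       ", bit_bytes, " bytes")
println("Sparse Bool:     ", sparse_bytes, " bytes")
```

Even a `BitMatrix` needs more than a gigabyte here, so for this many levels a sparse representation (or avoiding the matrix entirely) is the only option that stays small.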