JuliaStats / GLM.jl

Generalized linear models in Julia

Version 2.0 Breaking Changes #500

Open palday opened 1 year ago

palday commented 1 year ago

I've noticed that some packages rely on TableRegressionModel to support GLM: https://github.com/jmboehm/RegressionTables.jl/issues/128, https://github.com/yufongpeng/AnovaBase.jl/issues/52 and https://github.com/yufongpeng/AnovaGLM.jl/issues/6. Even if they adapt to support the new approach, we'd better bump version to 2.0 to avoid any breakage. That can also be the occasion to drop some long-deprecated API. We should check whether we would like to make any other breaking changes. (A few other packages use TableRegressionModel for their own models, it would be good that they also stop using it but there's no hurry.)

Originally posted by @nalimilan in https://github.com/JuliaStats/GLM.jl/issues/339#issuecomment-1242947968

Here's a quick list of potential issues that we might want to address as part of a push towards 2.0. Several are relatively straightforward, some could potentially be solved via more extensive documentation, and some will require Decisions to be made (e.g. all the stuff with weights).

There are several other issues I would like to see addressed sooner rather than later, but all are technically nonbreaking, at least under ColPrac guidelines (e.g., changes to the show methods, as raised in #461 and #469).

mousum-github commented 1 year ago

Right now, we are working on GLM with QR decomposition in two steps:

  1. LM with QR
  2. GLM with QR

The target is to complete this by the end of this calendar year.

Hope this will solve some issues related to the PosDefException as mentioned above.
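
For readers unfamiliar with why QR helps here, the following is a minimal sketch using only Julia's LinearAlgebra standard library. It is not the GLM.jl implementation; the matrix `X`, the response `y`, and the choice of a column-pivoted QR are assumptions made for illustration. With exactly collinear columns, `X'X` is singular, so a Cholesky factorization of it can fail (the situation behind a PosDefException), whereas a QR factorization of `X` itself still yields a least-squares solution.

```julia
using LinearAlgebra

# Hypothetical design matrix: the third column is the sum of the first two,
# so X'X is singular.
X = [1.0 1.0 2.0;
     1.0 2.0 3.0;
     1.0 3.0 4.0;
     1.0 4.0 5.0]
y = [1.0, 2.0, 3.0, 4.0]

# Cholesky on the normal equations. With check=false no exception is thrown;
# issuccess reports whether the factorization went through.
chol = cholesky(Symmetric(X' * X); check=false)
println("Cholesky succeeded: ", issuccess(chol))

# Column-pivoted QR on X itself (ColumnNorm requires Julia 1.7 or later).
F = qr(X, ColumnNorm())
β = F \ y                      # a least-squares solution despite collinearity
println("rank(X) = ", rank(X))
println("residual norm = ", norm(X * β - y))
```

Because `y` equals the second column of `X` in this toy example, the fit is exact up to rounding. The coefficients themselves are not unique when columns are collinear, so only the fitted values should be interpreted.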

mousum-github commented 1 year ago

I would also like to have multiple dependent variables and quasi-likelihood in GLM 2.0.

nalimilan commented 1 year ago

Nice to hear you're working on QR! I think we can wait until you finish that before tagging 2.0. On the other hand, multiple dependent variables and quasi-likelihood do not change current behavior, so they can be added later (and we have to discuss whether they should live in this package or in a separate one).

nalimilan commented 1 year ago

I don't think we should do anything about https://github.com/JuliaStats/GLM.jl/issues/259. Anyway https://github.com/JuliaStats/GLM.jl/pull/487 will change nobs to return an integer, as now the presence of weights is part of the type so there's no type instability. People can use size(modelmatrix(m), 1) to find out the number of rows in the matrix if they need that information.
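
As a small illustration of the suggestion above, the sketch below fits an unweighted linear model and reads the row count from the model matrix. The data frame and column names are made up for the example, and the exact return type of `nobs` depends on the GLM.jl version installed, so no particular type is assumed here.

```julia
using GLM, DataFrames

# Hypothetical toy data.
df = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0],
               y = [1.1, 1.9, 3.2, 3.9, 5.1])

m = lm(@formula(y ~ x), df)

# Number of rows actually used to fit the model.
n_rows = size(modelmatrix(m), 1)

# Number of observations as reported by the model; for an unweighted fit
# this should match n_rows in value, whatever its type.
n_obs = nobs(m)

println("rows in model matrix: ", n_rows)
println("nobs: ", n_obs, " (", typeof(n_obs), ")")
```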

#483, https://github.com/JuliaStats/GLM.jl/issues/255 and https://github.com/JuliaStats/GLM.jl/issues/240 would be good to have, but not breaking AFAICT.

jariji commented 1 year ago

I hope 2.0 fixes https://github.com/JuliaStats/GLM.jl/issues/496 and throws an error on missing values to protect users from making analytical errors by accident. lm(...; skipmissing=true) seems fine to me.
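
Until something like this exists in GLM.jl, the requested behavior can be approximated in user code. The sketch below is only an assumption about how it might look: `strict_lm` is a hypothetical helper, not part of GLM.jl, and its `skipmissing` keyword simply mirrors the proposal above. For simplicity it checks every column of the data frame, not just the ones named in the formula.

```julia
using GLM, DataFrames

# Hypothetical helper (not part of GLM.jl): refuse to fit when the data
# contain missing values unless the caller opts in explicitly.
function strict_lm(f, data::DataFrame; skipmissing::Bool=false)
    complete = completecases(data)      # true for rows without any missing
    if !all(complete)
        skipmissing || throw(ArgumentError(
            "data contain missing values; pass skipmissing=true to drop those rows"))
        data = data[complete, :]
    end
    return lm(f, data)
end

# Hypothetical toy data with one missing value in each column.
df = DataFrame(x = [1.0, 2.0, missing, 4.0, 5.0],
               y = [1.2, 1.9, 3.1, missing, 5.2])

f = @formula(y ~ x)

# Default: error instead of silently dropping rows.
threw = try
    strict_lm(f, df)
    false
catch err
    err isa ArgumentError
end

# Explicit opt-in: rows with missing values are dropped before fitting.
m = strict_lm(f, df; skipmissing=true)
println("threw by default: ", threw, ", rows used: ", size(modelmatrix(m), 1))
```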