JuliaStats / StatsModels.jl

Specifying, fitting, and evaluating statistical models in Julia

Feature Request: Poly #84

Open mileslucas opened 5 years ago

mileslucas commented 5 years ago

Hello, I'd like to request a feature for the modelling framework that is similar to R's poly. I think it would be very useful for linear modeling as well as in other places.

This is related to both #29 and #21, but I haven't seen any recent updates on this specific feature. Are there people actively working on it? I don't know R's poly well, so I wouldn't be a great person to tackle the problem, but I'm interested in using it and I'm sure many others are too.

kleinschmidt commented 5 years ago

You'll be interested to know that something that (roughly) approximates that is included as a test case for adding features to the formula language in #71: https://github.com/JuliaStats/StatsModels.jl/pull/71/files/39b413a0b7e70660d0aa8ac88c547a0a96bf4216#diff-0eed3b85962bbde3d2a1de890bdac3f8

Once that PR is merged we'll probably include something like poly in this package.

kleinschmidt commented 5 years ago

(Which is to say that a PR to that effect would definitely be welcome once #71 is merged, which should be soon. The documentation and tests from #71 show how to do it, which should be enough to get you started on a proper implementation :smile: )

mestinso commented 1 year ago

@kleinschmidt Any update on this one (I see #71 was merged)? I think it would be really great to see poly available out of the box here.

On a related note, I also think it would be great to see a way to easily specify full quadratic (or higher) models with multiple inputs, i.e.: y = a + b*x1 + c*x1^2 + d*x2 + e*x2^2 + f*x1*x2. I think R uses polym to do this.

My current approach is to define poly (following the example in the documentation) and then use @formula(y ~ 1 + poly(x1,2) + poly(x2,2) + x1&x2), which isn't so bad. However, if I want to move to a full cubic model, it gets a bit messier.

kleinschmidt commented 1 year ago

It's unlikely that we'll host it here, but maybe in RegressionFormulae.jl? The multi poly one is interesting; I bet there's a way we can implement it with combinatorics.

(Please pardon my thumb-typing.)
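As a rough illustration of the combinatorics idea, here is a minimal sketch in plain Julia (Base only) that enumerates every monomial up to a given total degree and builds the matching columns. The helper names `monomial_exponents`, `monomial_name`, and `polynomial_columns` are hypothetical; none of them are part of StatsModels.jl or RegressionFormulae.jl, and the sketch produces raw powers rather than the orthogonal polynomials that R's poly returns by default.

```julia
# Hypothetical helper: all exponent tuples for `nvars` variables whose
# total degree lies between 1 and `degree` (the intercept is excluded).
function monomial_exponents(nvars::Int, degree::Int)
    ranges = ntuple(_ -> 0:degree, nvars)
    exps = [e for e in Iterators.product(ranges...) if 1 <= sum(e) <= degree]
    # Order by total degree, then by descending power of the leading variables.
    return sort!(exps; by = e -> (sum(e), map(-, e)))
end

# Hypothetical helper: a readable label such as "x1^2 & x2" for one monomial.
function monomial_name(names, e)
    parts = String[]
    for (name, p) in zip(names, e)
        p == 0 && continue
        push!(parts, p == 1 ? String(name) : "$(name)^$(p)")
    end
    return join(parts, " & ")
end

# Hypothetical helper: one matrix column per monomial, built from raw powers.
function polynomial_columns(cols, degree::Int)
    exps = monomial_exponents(length(cols), degree)
    X = ones(Float64, length(first(cols)), length(exps))
    for (j, e) in enumerate(exps), (col, p) in zip(cols, e)
        X[:, j] .*= col .^ p
    end
    return X, exps
end

x1 = [1.0, 2.0]
x2 = [3.0, 4.0]
X, exps = polynomial_columns([x1, x2], 2)
names = [monomial_name(("x1", "x2"), e) for e in exps]
# names == ["x1", "x2", "x1^2", "x1 & x2", "x2^2"]
```

With two inputs and degree 2 this gives the five non-intercept terms of the full quadratic model from the earlier comment. Raising `degree` to 3 gives nine columns without any change to the calling code, which is the case that gets messy when the terms are written out by hand.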

mestinso commented 1 year ago

@kleinschmidt Thanks for replying. Can you explain the hesitation around putting this in StatsModels.jl? In my opinion and experience, basic polynomials are both fundamental and extremely common in curve fitting, so it seems like a real shame not to have a standard, easy, built-in way to do this in the Julia ecosystem. Other scientifically oriented languages (MATLAB, R, Python/SciPy) have near-standard packages or toolboxes that make this sort of curve fitting feel trivial.

Assuming there is a good reason to keep this out of StatsModels.jl, it would be nice if something like RegressionFormulae.jl were maintained by JuliaStats and mentioned in the StatsModels.jl documentation as a recommended "extras" package.

ararslan commented 1 year ago

FWIW I think it makes sense for poly to live in StatsModels. There are packages that provide case-specific extensions to the formula DSL, e.g. MixedModels, but poly to me seems like a more "core" piece since it's a simple expansion of the given terms rather than something specific to a particular model type.

PharmCat commented 1 year ago

@kleinschmidt Thanks for replying. Can you educate me on the reason for any hesitation around putting this in StatsModels.jl? ...

@mestinso Excuse my sarcasm :) but I think that even if all the maintainers agree, this issue will be closed in approximately 5 years... I think that is because this package is really supported only by @kleinschmidt and @nalimilan (great thanks!). So, as I understand it, there is no intent to implement every basic statistical concept in StatsModels (nested terms, multivariate responses, etc.), because if StatsModels.jl grows big enough, it can no longer be maintained by only two people. That is why nobody wants to add new features (I suppose). Also (IMHO), StatsModels.jl has one more problem: it was designed to solve a narrow list of tasks and is very hard to extend, so any feature outside the design's concept can't be implemented. So maybe, if this package becomes a priority for the Julia community, after some years we will see a StatsModels 2.0 that includes enough statistical abstractions for professional use as the primary tool in a basic pipeline.