EdwinKipruto / mfp2


Categorical variables in formula interface #48

Open matherealize opened 1 year ago

matherealize commented 1 year ago

Currently the fp() function does not support categorical variables.

This means that the matrix interface and the formula interface are not equivalent, because the latter does not allow setting the selection level for categorical variables. All other fp() parameters are indeed irrelevant for categories, but the selection parameter also applies to categorical variables, e.g. to keep a variable fixed in the model (for example, to adjust a model for sex).

Should we also allow fp() for categorical variables and then set all other parameters to reasonable values? mfp allows fp terms even for categorical variables, even though it makes no sense to set df to anything other than 1, etc.

Alternatively, should we introduce a separate term that only offers this for categorical variables?

EdwinKipruto commented 1 year ago

fp() can be used with categorical variables once the user has created dummy variables. The parameters can then be fixed using the fp() function.
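For illustration, the manual step described here could look roughly like the following base R sketch. The data and column names are made up, and the commented-out mfp2() call is an untested assumption about how the dummy columns would be passed to fp(), not a verified usage.

```r
# Toy data; the column names are illustrative only.
d <- data.frame(
  age   = c(34, 51, 47, 62, 29, 58),
  sex   = factor(c("f", "m", "m", "f", "f", "m")),
  stage = factor(c("I", "II", "III", "I", "II", "III"))
)

# Build treatment-coded dummy columns with base R and drop the intercept.
dummies <- model.matrix(~ sex + stage, data = d)[, -1, drop = FALSE]

# Combine the continuous variable with the numeric dummy columns.
d2 <- cbind(d["age"], as.data.frame(dummies))

# Dummy column names are built from the factor name and the level.
print(colnames(dummies))

# Not run; assumes fp() takes `df` and `select` as discussed in this thread,
# and that an outcome column `y` exists in d2:
# mfp2(y ~ fp(age) + fp(sexm, df = 1, select = 1), data = d2)
```

Every level except the reference level becomes its own numeric column, so each one has to be wrapped in fp() separately, which is the extra work the formula interface would otherwise be expected to do.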

matherealize commented 1 year ago

But this is extremely user-unfriendly and very surprising, given that the formula interface is supposed to do that work for you. I still suggest changing this behaviour. Do you see any problems if we change the fp function accordingly and make sure everything still works in mfp2.formula?

EdwinKipruto commented 1 year ago

You can remove the condition that stops fp() from accepting factor variables as input, but you would have to rework the whole code, in particular mfp2.formula(). It might not be that difficult, but it needs some time to distinguish between variables that undergo fp transformation and categorical variables.

On Wed, 24 May 2023 at 16:30, Michael Kammer wrote:

> But this is extremely user-unfriendly and very surprising, given that the formula interface is supposed to do that work for you. I still suggest changing this behaviour. Do you see any problems if we change the fp function accordingly and make sure everything still works in mfp2.formula?


matherealize commented 1 year ago

Then I think we can add another attribute, "is.factor", to store that information. That should simplify things.
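A minimal sketch of that idea, assuming the term marker simply tags its input with attributes. `fp_sketch()` is a hypothetical stand-in written for this example, not the fp() shipped with mfp2, and the default values are placeholders.

```r
# Hypothetical stand-in for a term marker; NOT the actual mfp2 fp().
fp_sketch <- function(x, df = 4, select = 0.05) {
  categorical <- is.factor(x) || is.character(x)
  if (categorical) {
    # Transformation settings are irrelevant for categories, so force df = 1.
    df <- 1
  }
  attr(x, "df") <- df
  attr(x, "select") <- select
  # Stored so that downstream code can skip the fp transformation.
  attr(x, "is.factor") <- categorical
  x
}

sex <- factor(c("f", "m", "m", "f"))
age <- c(34, 51, 47, 62)

term_sex <- fp_sketch(sex, df = 4, select = 1)  # requested df is overridden
term_age <- fp_sketch(age)

# Downstream code could branch on the stored flag.
print(attr(term_sex, "is.factor"))
print(attr(term_age, "is.factor"))
```

With a flag like this, mfp2.formula() would only need to check one attribute per term to decide whether to expand it into dummy columns or pass it on for fp transformation, while `select` stays available for both kinds of variables.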