mlr-org / paradox

ParamHelpers Next Generation
https://paradox.mlr-org.com
GNU Lesser General Public License v3.0

Conditional limits #234

Open pat-s opened 5 years ago

pat-s commented 5 years ago

It would be useful to make a parameter's limit dependent on something else, e.g. the number of features. mtry is a very good example of that. But there are more parameters that operate on n.feats in some way and for which this value should not be exceeded.

I think this is related to #215.

jakob-r commented 5 years ago

This won't happen. We had this topic in ParamHelpers, and in the end we came to the conclusion that it is more purposeful to add a parameter that is ranged from 0 to 1 and will be transformed data-dependently later.
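To illustrate the idea in plain R (function and argument names here are illustrative, not paradox API): you tune a ratio with fixed bounds [0, 1] and only map it to a concrete integer once the data-dependent upper bound is known.

```r
# Sketch: resolve a tuned ratio into a valid integer value (e.g. mtry)
# once the number of features is known. Names are made up for illustration.
ratio_to_int <- function(ratio, n_feats) {
  stopifnot(ratio >= 0, ratio <= 1)
  max(1L, as.integer(round(ratio * n_feats)))
}

ratio_to_int(0.5, 20)  # a tuned ratio of 0.5 becomes 10 on a 20-feature task
```

The tuner only ever sees the fixed [0, 1] box, so no conditional limits are needed.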

How and when this trafo is best done is still not 100% clear to me. See your linked issue.

Would you be okay if we close this issue?

pat-s commented 5 years ago

I see. Thanks.

> more purposeful to add a parameter that is ranged from 0 to 1 and will be transformed data dependently later.

Is this already supported? I took a look at mtry from ranger, but such a trafo was not used there.

Lots of filter hyperparameters depend on n.feats, and the lack of this conditional specification in paradox currently blocks the whole usage of ParamSets in mlr3featsel.

The discussion in the linked issue has not continued since February. This seems like a pressing issue (for both learners and filters); how are we going to proceed?

> Would you be okay if we close this issue?

Yes. Maybe pin the other issue or write a short summary about the current status? I think this is a very important topic that other people might try to find as well. Searching by issue title is not always successful.

mb706 commented 5 years ago

> I took a look at mtry from ranger but it was not used there.

I think the (informal) consensus on the usage of paradox for Learners is that the Params should reflect the underlying function parameters: if ranger::ranger has an mtry argument that takes an integer up to the number of features to sample, the ParamSet should reflect that, to spare users from having to learn about our special transformations. Absent my solution in #215, writing another implementation that calculates mtry in a task-dependent way would still be possible; this was done in old mlr with classif.ranger.pow.
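A hedged sketch of a task-dependent mtry computation in the spirit of the classif.ranger.pow approach (the exact parameterization used in old mlr may differ; the function below is an assumption for illustration): tune an exponent with fixed bounds and resolve mtry against the task's feature count just before fitting.

```r
# Sketch: derive mtry from a tuned exponent `pow` and the task's feature
# count, clamped to the valid range [1, n_feats]. Names are illustrative.
mtry_from_pow <- function(pow, n_feats) {
  as.integer(max(1, min(n_feats, round(n_feats^pow))))
}

mtry_from_pow(0.5, 100)  # pow = 0.5 gives roughly sqrt(p): mtry = 10
```

Because the exponent has fixed bounds, the tuner never produces an out-of-range mtry, regardless of the task.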

Conditional limits are a difficult story because the parameter sets are supposed to be used for tuning, and going beyond box-constraints when tuning gets difficult fast. Consider tuning a method that does feature filtering and then fits a ranger: the upper limit of mtry would depend on the nfeat parameter of the filtering. It gets even more complicated when features are filtered according to a filter value threshold.
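The dependency described above can be made concrete in a few lines of plain R (all names are made up for illustration): when mtry and the filter's nfeat are sampled independently inside fixed boxes, a drawn configuration can simply be invalid.

```r
# Illustration: in a filter-then-ranger pipeline, the valid range of mtry
# is only known after the filter's nfeat value has been drawn.
upper_mtry <- function(nfeat) nfeat  # mtry can use at most the kept features

nfeat <- 5   # a sampled value of the filter's nfeat parameter
mtry  <- 7   # an mtry value sampled independently from its own box

mtry <= upper_mtry(nfeat)  # FALSE: this configuration is invalid
```

A tuner working only with box constraints has no way to rule out such combinations up front, which is why the conditional-limit formulation gets difficult.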

I think the best solution is to use no limit for nfeat, because the filters would still have a frac parameter with defined limits.

pat-s commented 5 years ago

> the upper limit of mtry would depend on the nfeat parameter of the filtering.

This is exactly what I want since otherwise the tuning errors with mtry being out of bounds. That's why I also use classif.ranger.pow in my studies in which I combine tuning and filtering.

> It gets even more complicated when features are filtered according to a filter value threshold.

Well yes, in this case we need to account for it by replacing a possibly hardcoded nfeat default for mtry with the chosen filter setting (nperc, nthresh, nabs).

There needs to be some auto-adjustment of nfeat-dependent hyperparameters in some way; otherwise tuning + filtering won't work automatically in a nested setting.

> I think the (informal) consensus on the usage of paradox for Learners is that the Params should reflect the underlying function parameters, so if ranger::ranger has an mtry argument that gives the integer over the number of features to sample

OK. But then the default in the ParamSet should also reflect this, i.e. be nfeat and not unlimited?

berndbischl commented 5 years ago

This is an issue to discuss during the workshop.