Related to #1121.
This is trivial to do yourself though; all you need is an ifelse on the predictions.
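For example, a minimal sketch of that manual workaround (mod and task are placeholders for an already trained mlr model and its regression task):

```r
# Clamp out-of-range predictions after calling predict().
pred = predict(mod, task = task)
pred$data$response = ifelse(pred$data$response < 0, 0, pred$data$response)
```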
Also there are plenty of learners that won't extrapolate outside the empirical support of the target.
@larskotthoff I think that's a much more complicated version of what I'm requesting. During prediction, if a predicted value is -1.5 but we know the target variable's values cannot be less than 0, we would just check for and correct the values that fall outside our bounds.
@zmjones this is true, but there are others that can! When I was using caret irl I found that setting prediction bounds explicitly gave me a much more reasonable model. Plus, if anyone ever uses this in business it's something that the user would probably have to set themselves anyway.
I think adding predict.bounds in the regression task and makePrediction() could be useful.
Am I able to have multiple forks of mlr on github? If so I can add this and make a pull request.
Sure, I can see the utility. No, I don't think you can have multiple forks. I would just create another branch locally, pull from the master branch here, push to a new branch on your forked repo, and then issue a PR.
You can have a fork per account, so you could just register another account.
@larskotthoff @zmjones
I'm going to close this for now; once my current fork is implemented I will add this.
just adding a little note here as the discussion seems to be over:
I do see @Stevo15025's point, and it's not a simple ifelse, as you basically need to have this either as an option in the learner or in a wrapper.
PS: although I would guess that doing something like isotonic regression would still be a better approach than hard capping.
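A minimal sketch of that idea with base R's isoreg(), recalibrating predictions against the training targets (pred.train, y.train, and pred.test are placeholder names):

```r
# Isotonic recalibration instead of hard capping: fit a monotone map from
# predictions to targets, then apply it to new predictions. Because the fitted
# values are pooled averages of y.train, the recalibrated predictions stay
# within the observed range of the target (e.g. >= 0 for non-negative targets).
iso   = isoreg(x = pred.train, y = y.train)
recal = as.stepfun(iso)
pred.test.recal = recal(pred.test)
```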
@berndbischl We could also do it in a pre-processing function if we allowed pre-processing schemes to do things to the predictions :-P
Also, #random allowing pre-processing schemes on predictions would let me add the Lambert W transforms
Just as an example, it could look something like this:
# Hypothetical sketch: post.predict is not part of makePreprocWrapper()'s
# current interface, it only illustrates the proposed extension.
makePreprocWrapperPostPred = function(learner, par1 = foo, par2 = bar, par3 = doo) {
  trainfun = function(data, target, args = list(par1, par2)) {
    # preprocess the training data here
  }
  predictfun = function(data, target, args, control) {
    # preprocess the new data before prediction here
  }
  post.pred.fun = function(prediction, args, par3, control) {
    # do stuff post prediction here, e.g. clip the response to the bounds
  }
  makePreprocWrapper(
    learner,
    train = trainfun,
    predict = predictfun,
    post.predict = post.pred.fun,
    par.set = makeParamSet(
      # parameters controlling the pre- and post-processing
    ),
    par.vals = list(par1 = par1, par2 = par2, par3 = par3)
  )
}
Shouldn't we separate this and simply let the user add a "post processor" for the predictions?
caret has a parameter in trainControl() for specifying the prediction bounds. This can be very nice when you are working with financial data, as you know values cannot be negative. Would it be possible to implement this in mlr?

My first thought was to have it in makeRegrTask(), since that is where we specify that our prediction task has given bounds. The process would then be:

1. makeRegrTask() passes prediction.bounds on to the trained model, so predict() can be made aware of the bounds.
2. makePrediction.TaskDescRegr() gets the prediction bounds and does the 'clean up' there.

Any thoughts on this? Maybe it could be passed like a preprocessing function?
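To make the request concrete, here is a hedged sketch: the caret line uses its existing predictionBounds option, while prediction.bounds is not an existing makeRegrTask() argument and is only shown to illustrate the proposal (df is a placeholder data frame):

```r
# caret's existing option: force predictions to be at least 0,
# leave the upper side unbounded.
ctrl = caret::trainControl(predictionBounds = c(0, NA))

# Hypothetical mlr equivalent as proposed above (prediction.bounds does not exist).
task = makeRegrTask(id = "returns", data = df, target = "y",
                    prediction.bounds = c(0, Inf))
mod  = train(makeLearner("regr.lm"), task)
pred = predict(mod, task = task)  # responses would be clipped to [0, Inf)
```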