Closed by ablaom 4 years ago
Originally it was this way. The tricky part is that it is not easy to reliably predict what the final output is. After a probabilistic classifier, for example, there might be a function that just computes the mode (or some threshold-based point prediction), or a final function may just transform the predicted pdfs. So we really don’t know.
We could have the macro make a best guess of the prediction type (assume it is the same as that of the single supervised model in the pipeline) and leave the keyword for overriding the default behaviour.
Thoughts?
Incidentally, for uniformity with changes to traits in MLJBase, the keyword should really be "prediction_type = :probabilistic" (even more to write!). Perhaps just ":probabilistic" is enough.
cc: @tlienart
Hmm, yes, I understand. On the other hand, a pipeline is reasonably simple in that it's just a tube with operations in order, so we can inspect whatever is at the end, right? To follow your line of thinking:
In the "it's an operation" case, exported functions like predict_mean, predict_mode or predict_thresh (yet to be defined) should be recognised and marked as deterministic.
If we allow arbitrary operations at the end (?), I think it would be fair to just make a guess based on the last step we recognise, and warn the user that they should specify is_probabilistic otherwise?
So I guess the line of thinking is similar, except that you don't seem to include "recognising" operations when that's what's at the end of the pipe.
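To make the two proposals concrete, here is a minimal, language-agnostic sketch in Python of the guessing logic discussed above: walk the pipeline from the end, treat a recognised operation as deterministic, otherwise fall back to the last supervised model, and warn when an unrecognised function had to be skipped. Everything in it is an assumption made for illustration: the `Model` class, the `guess_prediction_type` function, and the stub operations are hypothetical and are not MLJ or MLJBase API. An explicit keyword still overrides the guess.

```python
import warnings
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Model:
    """Hypothetical stand-in for a pipeline model (not an MLJ type)."""
    name: str
    # "probabilistic" or "deterministic" for supervised models,
    # None for unsupervised transformers.
    prediction_type: Optional[str] = None


def predict_mean(dist):
    """Stub standing in for an exported point-prediction operation."""


def predict_mode(dist):
    """Stub standing in for an exported point-prediction operation."""


# Operations assumed to be recognised as producing point predictions.
RECOGNISED_DETERMINISTIC = {"predict_mean", "predict_mode", "predict_thresh"}


def guess_prediction_type(steps: Sequence[object],
                          prediction_type: Optional[str] = None) -> str:
    """Guess the prediction type of a linear pipeline from its tail."""
    if prediction_type is not None:
        return prediction_type  # an explicit keyword always wins

    skipped_unknown = False
    guess = None
    for step in reversed(steps):
        if callable(step):
            if getattr(step, "__name__", "") in RECOGNISED_DETERMINISTIC:
                guess = "deterministic"
                break
            skipped_unknown = True  # arbitrary function: cannot inspect it
        elif step.prediction_type is not None:
            guess = step.prediction_type  # last supervised model
            break
        # Unsupervised models carry no prediction type; keep walking back.

    if guess is None:
        raise ValueError("no supervised model or recognised operation found")
    if skipped_unknown:
        warnings.warn(
            f"Guessing prediction type '{guess}' from the last recognised "
            "step; specify the prediction type explicitly if this is wrong."
        )
    return guess
```

With these stand-ins, a pipeline ending in a probabilistic classifier is guessed to be probabilistic, one ending in `predict_mode` is guessed to be deterministic, and one ending in an arbitrary function produces a warning alongside the guess.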
Partly addressed. Closing in favour of https://github.com/alan-turing-institute/MLJBase.jl/issues/267
From a slack thread:
Would it be imaginable to remove the is_probabilistic=true and guess it instead given the last step? (i.e. either the last model is probabilistic or there’s an operation which is something specific.) It feels pretty clunky to have to specify it.