Thanks for reporting.
According to the error message:
```
KNNClassifier @449 <: Probabilistic but prediction_type(MisclassificationRate @021) = :deterministic.
```
Your model predicts probability distributions but `misclassification_rate` is for deterministic predictions:
```julia
julia> info(KNNClassifier).prediction_type
:probabilistic

julia> info(misclassification_rate).prediction_type
:deterministic
```
Your choices are: use a probabilistic measure, such as `BrierScore()` or `LogLoss()`; or add the option `operation=predict_mean` to the `TunedModel` constructor (the default is `predict`, which is giving the probabilistic predictions).
You could also put your KNN model in a `@pipeline` with `mean` at the end; something like `pipe = @pipeline KNNClassifier mean`.
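For concreteness, here is a minimal sketch of the first two suggestions, assuming a categorical target, for which `predict_mode` (the class-label analogue of `predict_mean`) is the natural deterministic operation; the dataset, the `K` range, and the `NearestNeighborModels` package specifier are illustrative assumptions, not details from this thread:

```julia
using MLJ

# Illustrative data: a small classification set with a categorical target.
X, y = @load_crabs

KNN = @load KNNClassifier pkg=NearestNeighborModels  # pkg specifier assumed
knn = KNN()
r = range(knn, :K, lower=2, upper=20)

# Option 1: keep the probabilistic predictions and score them with a
# probabilistic measure.
tuned_prob = TunedModel(model=knn, tuning=Grid(resolution=10),
                        resampling=CV(nfolds=5), range=r,
                        measure=LogLoss())

# Option 2: have the tuner convert predictions to class labels first, so a
# deterministic measure such as misclassification_rate applies.
tuned_det = TunedModel(model=knn, tuning=Grid(resolution=10),
                       resampling=CV(nfolds=5), range=r,
                       operation=predict_mode,
                       measure=misclassification_rate)

mach = machine(tuned_det, X, y)
fit!(mach)
fitted_params(mach).best_model   # inspect the selected K
```

With the second wrapper the tuning history in `report(mach)` is expressed directly in terms of `misclassification_rate`.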
Makes a lot of sense, thank you.
I think a classification example like this one would greatly enhance the documentation.
Yeah, well, there's https://alan-turing-institute.github.io/DataScienceTutorials.jl/end-to-end/crabs-xgb/ and https://github.com/ablaom/MachineLearningInJulia2020/blob/master/tutorials.md#part-4---tuning-hyper-parameters but nothing on tuning a classifier in the main documentation.
I guess one could add an example to the "Tuning Models" section: https://github.com/alan-turing-institute/MLJ.jl/blob/dev/docs/src/tuning_models.md. PR welcome.
Pull request #726 created. Hopefully, if you accept the update, there will be fewer issues like this one (or like #126).
**Describe the bug**
A tuned KNN where I want to select `K` using `misclassification_rate` as the measure is not fitted.

**To Reproduce**
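The reproduction script itself did not survive in this copy of the issue; the following is only a sketch of the presumed setup (dataset and hyperparameter range are placeholders):

```julia
using MLJ

X, y = @load_crabs   # placeholder dataset

KNN = @load KNNClassifier pkg=NearestNeighborModels
knn = KNN()
r = range(knn, :K, lower=2, upper=20)

# Deterministic measure combined with the default (probabilistic) `predict`
# operation: this is the combination the error message quoted above complains about.
tuned = TunedModel(model=knn, tuning=Grid(resolution=10),
                   resampling=CV(nfolds=5), range=r,
                   measure=misclassification_rate)

mach = machine(tuned, X, y)
fit!(mach)   # the machine is not fitted; see the prediction_type error quoted earlier
```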
**Expected behavior**
I would expect a fitted machine where I can inspect the `misclassification_rate` in the CV results.
If there are restrictions on the types of functions a `TunedModel` can use, I could not find them in the documentation. I used a measure from the documented performance measures, so I expected it to work.
Also, a question arises: what if I have my own function that, given `y` and `y_hat`, computes the loss? How can I pass it to a machine and make it work? In the documentation I clearly see that the code can take as input a user-defined loss with the contract that "lower is better". If that is not the case, the user should specify that the passed function is a "score" (higher is better for selecting parameters).

**Additional context**
The same code provided above with `measure` removed works as expected.
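That working snippet is likewise not preserved here; presumably it is the same tuning setup with the `measure` keyword dropped, roughly:

```julia
using MLJ

X, y = @load_crabs
KNN = @load KNNClassifier pkg=NearestNeighborModels
knn = KNN()
r = range(knn, :K, lower=2, upper=20)

# Omitting `measure`: MLJ falls back to a default measure appropriate for a
# probabilistic classifier, so there is no prediction_type mismatch.
tuned_default = TunedModel(model=knn, tuning=Grid(resolution=10),
                           resampling=CV(nfolds=5), range=r)
mach = machine(tuned_default, X, y)
fit!(mach)   # fits as expected
```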
Output of the provided script