JuliaAI / MLJ.jl

A Julia machine learning framework
https://juliaai.github.io/MLJ.jl/

Measures for Multi-Target models #800

Closed leonardtschora closed 3 years ago

leonardtschora commented 3 years ago

Hi everyone,

It seems that no measure exists for multi-target models. The multi-target models themselves can be listed with:

 models( x -> x.target_scitype <: MLJ.Table)  

9-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :docstring, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :prediction_type, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :input_scitype, :target_scitype, :output_scitype), T} where T<:Tuple}:
 (name = MultiTaskElasticNetCVRegressor, package_name = ScikitLearn, ... )
 (name = MultiTaskElasticNetRegressor, package_name = ScikitLearn, ... )
 (name = MultiTaskLassoCVRegressor, package_name = ScikitLearn, ... )
 (name = MultiTaskLassoRegressor, package_name = ScikitLearn, ... )
 (name = MultitargetKNNClassifier, package_name = NearestNeighborModels, ... )
 (name = MultitargetKNNRegressor, package_name = NearestNeighborModels, ... )
 (name = MultitargetLinearRegressor, package_name = MultivariateStats, ... )
 (name = MultitargetNeuralNetworkRegressor, package_name = MLJFlux, ... )
 (name = MultitargetRidgeRegressor, package_name = MultivariateStats, ... )

but when I try to query the appropriate metrics:

measures( x -> x.target_scitype <: MLJ.Table)

NamedTuple{(:name, :instances, :human_name, :target_scitype, :supports_weights, :supports_class_weights, :prediction_type, :orientation, :reports_each_observation, :aggregation, :is_feature_dependent, :docstring, :distribution_type), T} where T<:Tuple[]

I obtain no results.

Moreover, the following code won't work:

using DataFrames, MLJ

# Load Model
Model = @load MultitargetRidgeRegressor
model = Model()

# Generate data
n = 100
d = 5
o = 3
X = DataFrame(rand(Float64, n, d), :auto)
y = DataFrame(x1 = X.x1, x2 = X.x2, x3 = X.x3)

# Fit the machine for a check
mach = machine(model, X, y)
fit!(mach)

# Evaluate with the built-in single-target measure (this is the call that errors)
evaluate!(mach; resampling=CV(), measure=mae)

ERROR: ArgumentError: scitype of target = Table{AbstractVector{Continuous}} but target_scitype(MeanAbsoluteError @543) = Union{AbstractVector{Continuous}, AbstractVector{Count}}.

I currently use the workaround:

# Measures are called as measure(ŷ, y); both arguments are tables here
mymae(ŷ, y) = abs.(MLJ.matrix(ŷ) .- MLJ.matrix(y)) |> mean
evaluate!(mach; resampling=CV(), measure=mymae)

This works fine, but it duplicates the core logic of the built-in mae measure.
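One way to avoid that duplication might be to lift an existing single-target measure to tables by applying it column by column and averaging the scores. The following is only a sketch under stated assumptions: `multitarget` and `single_mae` are hypothetical names, not part of MLJ; tables are represented as plain NamedTuples of vectors so the example runs without any packages; and weighting every target column equally is just one possible aggregation choice.

```julia
using Statistics  # standard library; provides `mean`

# Stand-in for a single-target measure such as MLJ's `mae`
# (hypothetical local definition so the sketch runs without MLJ).
single_mae(yhat, y) = mean(abs.(yhat .- y))

# Hypothetical helper: lift a single-target measure to column tables,
# represented here as NamedTuples of equal-length vectors.
# Per-column scores are averaged with equal weight (an assumption).
function multitarget(measure)
    return (yhat, y) -> mean(
        measure(getproperty(yhat, name), getproperty(y, name))
        for name in keys(y)
    )
end

# Two target columns, three observations each
y    = (a = [1.0, 2.0, 3.0], b = [0.0, 0.0, 0.0])
yhat = (a = [1.0, 2.0, 5.0], b = [1.0, 1.0, 1.0])

multi_mae = multitarget(single_mae)
score = multi_mae(yhat, y)  # mean of the per-column MAEs (2/3 and 1)
```

Because all columns have the same length, the mean of the per-column scores coincides with the mean over all matrix entries used in the workaround. For general tables such as a DataFrame, the column access would presumably go through `Tables.columnnames` and `Tables.getcolumn` instead of `keys` and `getproperty`; whether the result can be passed directly as `measure=multitarget(mae)` to `evaluate!` has not been verified here.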

Thanks for your help!

ablaom commented 3 years ago

@Leonardbcm Thanks for that.

You are correct. There are no such measures yet, and there is an open issue for this: https://github.com/alan-turing-institute/MLJBase.jl/issues/502

There is a little design to be worked out here. Feel free to make a proposal at that issue, or add specific requirements you may have.