Closed ablaom closed 3 years ago
The "This PR" link in the first paragraph is not working. Could you tell me where the info about allowing a sparse input matrix is?
I will test that the core handles adjoints as well. I still have to add tests for sparse matrices. I know that the code worked a year ago with sparse inputs, but the sparse operations did not bring much of a speedup. Nevertheless, I recall that the core sparse operations have improved a lot since then.
I just tested the code with adjoints and it works as expected.
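The two claims above (adjoints work, and sparse inputs should too) boil down to the fact that both are `AbstractMatrix` subtypes. A minimal stdlib-only sketch, not the package's actual test suite, of what a sparse/adjoint test could exercise:

```julia
using SparseArrays, LinearAlgebra

# Sketch (not the package's tests): sparse matrices and lazy adjoints are
# both AbstractMatrix, so a core written against AbstractMatrix can accept
# either one without extra methods.
Xs = sprand(Float32, 784, 100, 0.1)  # p × n sparse sample matrix
Xa = rand(Float32, 100, 784)'        # lazy adjoint, also p × n

Xs isa AbstractMatrix  # true
Xa isa AbstractMatrix  # true

# A perceptron update touches one sample (column) at a time; on a
# SparseMatrixCSC, column views are cheap because columns are the
# natural storage unit of the CSC format:
w = zeros(Float32, 784)
w .+= view(Xs, :, 1)   # accumulate the first sample into the weights
```

Whether the column-wise updates actually run faster on sparse inputs depends on the density, which matches the earlier observation that sparse inputs did not bring much of a speedup.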
I have updated all the examples in the examples folder, where we can now easily see different types of classification problems with different input types; all of them seem to work (in branch MLJModelInterface).
In particular:
05_MPmachine_iris.jl shows that a machine can be fitted with a standard Julia array (even with an adjoint).
06_MPmachine_mnist.jl shows that a machine can be fitted with a standard DataFrame.
Here I leave all the outputs of the examples:
julia --project=. examples/01_MPCore_iris.jl
Iris Dataset, MulticlassPerceptronCore
Types and shapes before calling fit!(perceptron, train_x, train_y)
typeof(perceptron) = MulticlassPerceptronCore{Float32}
typeof(X) = LinearAlgebra.Adjoint{Float64,Array{Float64,2}}
typeof(y) = Array{Int64,1}
size(X) = (4, 150)
size(y) = (150,)
n_features = 4
n_classes = 3
Start Learning
Learning took 0.859 seconds
Results:
Train accuracy: 0.973
julia --project=. examples/02_MPCore_mnist.jl
MNIST Dataset, MulticlassPerceptronCore
Loading data
MNIST Dataset Loading...
MNIST Dataset Loaded, it took 0.459 seconds
Types and shapes before calling fit!(perceptron, train_x, train_y)
typeof(perceptron) = MulticlassPerceptronCore{Float32}
typeof(train_x) = Array{Float32,2}
typeof(train_y) = Array{Int64,1}
size(train_x) = (784, 60000)
size(train_y) = (60000,)
size(test_x) = (784, 10000)
size(test_y) = (10000,)
n_features = 784
n_classes = 10
Start Learning
Learning took 7.125 seconds
Results:
Train accuracy: 0.936
Test accuracy: 0.926
julia --project=. examples/03_MPClassifier_iris.jl
Iris Dataset, MulticlassPerceptronClassifier
Iris Dataset Example
Types and shapes before calling fit(perceptron, 1, train_x, train_y)
typeof(perceptron) = MulticlassPerceptronClassifier
typeof(X) = DataFrame
typeof(y) = CategoricalArray{String,1,UInt8,String,CategoricalString{UInt8},Union{}}
size(X) = (150, 4)
size(y) = (150,)
n_features = 4
n_classes = 3
Start Learning
Epoch: 50 Accuracy: 0.9
typeof(fitresult) = Tuple{MulticlassPerceptronCore{Float32},MLJBase.CategoricalDecoder{String,UInt8}}
Learning took 2.298 seconds
Results:
Train accuracy: 0.98
julia --project=. examples/04_MPClassifier_mnist.jl
MNIST Dataset, MulticlassPerceptronClassifier
Loading data
MNIST Dataset Loading...
MNIST Dataset Loaded, it took 0.487 seconds
Types and shapes before calling fit(perceptron, 1, train_x, train_y)
typeof(perceptron) = MulticlassPerceptronClassifier
typeof(train_x) = LinearAlgebra.Adjoint{Float32,Array{Float32,2}}
typeof(train_y) = CategoricalArray{Int64,1,UInt32,Int64,CategoricalValue{Int64,UInt32},Union{}}
size(train_x) = (60000, 784)
size(train_y) = (60000,)
size(test_x) = (10000, 784)
size(test_y) = (10000,)
n_features = 60000
n_classes = 10
Start Learning
Epoch: 50 Accuracy: 0.897
typeof(fitresult) = Tuple{MulticlassPerceptronCore{Float32},MLJBase.CategoricalDecoder{Int64,UInt32}}
Learning took 8.044 seconds
Results:
Train accuracy: 0.936
Test accuracy: 0.926
julia --project=. examples/05_MPmachine_iris.jl
Iris Dataset, Machine with a MulticlassPerceptronClassifier
Iris Dataset Example
Types and shapes before calling fit!(perceptron_machine)
typeof(perceptron_machine) = Machine{MulticlassPerceptronClassifier}
typeof(X) = DataFrame
typeof(y) = CategoricalArray{String,1,UInt8,String,CategoricalString{UInt8},Union{}}
size(X) = (150, 4)
size(y) = (150,)
n_features = 4
n_classes = 3
Start Learning
[ Info: Training Machine{MulticlassPerceptronClassifier} @ 1…53.
Epoch: 50 Accuracy: 0.94
Learning took 11.895 seconds
Results:
Train accuracy: 0.98
julia --project=. examples/06_MPmachine_mnist.jl
MNIST Dataset, Machine with a MulticlassPerceptronClassifier
MNIST Dataset Loading...
MNIST Dataset Loaded, it took 0.545 seconds
Types and shapes before calling fit!(perceptron_machine)
typeof(perceptron_machine) = Machine{MulticlassPerceptronClassifier}
typeof(train_x) = LinearAlgebra.Adjoint{Float32,Array{Float32,2}}
typeof(train_y) = CategoricalArray{Int64,1,UInt32,Int64,CategoricalValue{Int64,UInt32},Union{}}
size(train_x) = (60000, 784)
size(train_y) = (60000,)
size(test_x) = (10000, 784)
size(test_y) = (10000,)
n_features = 784
n_classes = 10
Start Learning
[ Info: Training Machine{MulticlassPerceptronClassifier} @ 1…65.
Epoch: 50 Accuracy: 0.898
Learning took 10.5 seconds
Results:
Train accuracy: 0.936
Test accuracy: 0.926
In particular:
05_MPmachine_iris.jl shows that a machine can be fitted with a standard Julia array (even with an adjoint).
?? I don't see any array being used in the machine here - only a DataFrame (see your own output).
But I'm unclear what you think of my proposal. Should the MLJ user (as opposed to the Core user) supply their machine with a p x n matrix (as now), or would you be happy to make the suggested changes, which would instead require an n x p matrix?
To be clearer, I expect the following works just fine at the moment:
using MLJ
X, y = @load_iris # X is a table with 4 columns, one per feature
A_wide = permutedims(MLJ.matrix(X)) # Matrix{Float64} with 4 rows
machine(perceptron, A_wide, y) |> fit!
However, I should prefer that this work instead:
A_tall = MLJ.matrix(X) # Matrix{Float64} with 4 columns
machine(perceptron, A_tall, y) |> fit!
and also this (with better performance, if we ignore the "one-time" cost of permutedims):
X, y = @load_iris # X is a table with 4 columns, one per feature
A_wide = permutedims(MLJ.matrix(X)) # Matrix{Float64} with 4 rows
machine(perceptron, A_wide', y) |> fit! # <----- note the adjoint!
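The reason the third variant is cheap is stdlib behavior, independent of this package: `permutedims` eagerly copies, while `'` only wraps the parent array, and a double adjoint unwraps back to the parent. A small sketch (array values are arbitrary):

```julia
using LinearAlgebra

X = rand(Float64, 150, 4)  # n × p, features-as-columns, as MLJ.matrix returns
A_wide = permutedims(X)    # eager copy: a new 4 × 150 Matrix, features-as-rows
A_adj = X'                 # lazy wrapper: also 4 × 150, but no data is copied

A_wide isa Matrix   # true
A_adj isa Adjoint   # true

# Taking the adjoint of an adjoint simply unwraps to the parent array,
# which is why a core method that applies ' internally would see the
# original dense matrix at zero cost:
A_adj' === X        # true
```

So if the interface internally takes an adjoint, a user who passes `A_wide'` pays nothing beyond the one-time `permutedims`.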
?? I don't see any array being used in the machine here - only a DataFrame (see your own output).
Sorry, I was referring to the 06 experiment.
But I'm unclear what you think of my proposal. Should the MLJ user (as opposed to the Core user) supply their machine with a p x n matrix (as now), or would you be happy to make the suggested changes, which would instead require an n x p matrix?
I assume an MLJ user would follow the convention of an n x p matrix. The shapes of the arrays/DataFrames in examples 03 to 06 follow precisely this convention. Note that examples 01 and 02 are for the Core version (which is standalone and independent of MLJ). Isn't this what you would prefer?
I assume an MLJ user would follow the convention of an n x p matrix.
Yes, but my point is that this is not what is currently implemented. The fix is #8.
With #8 merged, I think we can close this.
In MLJ a table is always assumed to be features-as-columns. If we are allowing the MLJ user to input a matrix (possibly sparse) as in this PR instead, then for consistency this ought to be features-as-columns as well, but the core methods expect features-as-rows.
To make the interface consistent, one could change this line to
_reformat(X, ::Type{<:AbstractMatrix}) = X'
(i.e., add the adjoint). If the MLJ user supplies their input X (from MLJ) as the adjoint of a features-as-rows matrix, then the two adjoint operations will compile to a no-op, and there will be no loss of performance.
I'm assuming here that the MulticlassPerceptron core method can handle any AbstractMatrix, including adjoints, which it probably should be capable of doing. Moreover, it can presumably detect when the user has passed data in a non-optimal format and issue an @info recommending an alternative representation (if verbosity > 0).
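A stdlib-only sketch of the proposal, with `_reformat` written here as a hypothetical one-argument stand-in for the package's method (the name is taken from the linked line, but the signature and the @info check are assumptions, not the actual code):

```julia
using LinearAlgebra

# Hypothetical sketch of the proposed reformat step: MLJ-facing input is
# n × p (features-as-columns); the core wants p × n (features-as-rows),
# so take the adjoint.
_reformat(X::AbstractMatrix) = X'

X_tall = rand(150, 4)        # what an MLJ user would naturally supply
Xcore = _reformat(X_tall)    # 4 × 150 lazy view, no copy
size(Xcore)                  # (4, 150)

# If the user already holds features-as-rows data and passes its adjoint,
# the two adjoints cancel and the core sees the original dense matrix:
X_wide = rand(4, 150)
_reformat(X_wide') === X_wide  # true — no copy, no performance loss

# The non-optimal-layout warning could be as simple as:
Xcore isa Adjoint &&
    @info "Input is a lazy adjoint; a features-as-rows matrix may be faster."
```

This is only meant to show that the double adjoint really is free; the actual dispatch on `::Type{<:AbstractMatrix}` in the package would keep its existing two-argument form.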