tlienart opened 4 years ago
Thank you for your notes.

I have some questions about testing the implementation of the learning algorithm.

In the case of regression models you can test against a closed-form solution such as `θ = (X1'X1 + λ*I) \ (X1'y)`. As far as I know, no such closed form exists for a multiclass perceptron.
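For reference, a check against the closed form could look like the following sketch (`θ_learner` is a placeholder for the coefficients an iterative implementation would return):

```julia
# Closed-form ridge solution used as ground truth in a test.
using LinearAlgebra, Random

Random.seed!(42)
n, p, λ = 100, 3, 0.1
X  = randn(n, p)
y  = randn(n)
X1 = hcat(X, ones(n))          # append an intercept column
θ  = (X1'X1 + λ*I) \ (X1'y)    # closed-form ridge estimate
# The coefficients θ_learner of an iterative learner could then be checked with
# @test isapprox(θ_learner, θ; rtol=1e-6)
```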
A possible solution could be to generate a bunch of learning-problem datasets, ensure that they are linearly separable, and test that the model reaches perfect accuracy on them. What do you think of this solution? I actually opened an issue on MLJ asking for a piece of code to generate such datasets. I haven't implemented it yet, but I would like to have something along those lines.
With respect to testing against the MLP in sklearn, I guess you mean sklearn's Perceptron implementation (not the MultilayerPerceptron), since the implementation in this repo has no hidden units. It is not an apples-to-apples comparison, but it could give some confidence.

I have seen that in the logistic case you use `@test mcr ≤ 0.2`. Maybe I could train a sklearn Perceptron on the same dataset and derive a threshold from that?
> A possible solution could be to generate a bunch of datasets of learning problems, ensure that they are linearly separable, and test that the model can get a perfect accuracy on those problems. What do you think about this possible solution? I actually opened an issue on MLJ for having a piece of code to generate datasets but I did not code it.
Yes, a first sanity check is to generate points in a low-dimensional space that are very well separated and make sure they're predicted to be in the right class with high accuracy.
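Something like the following sketch, say (the commented-out perceptron calls assume the `fit!`/`predict` API that appears later in this thread):

```julia
# Two well-separated Gaussian blobs in 2D: a linear classifier should
# reach (near-)perfect accuracy on this data.
using Random, Statistics

Random.seed!(1234)
n = 100
X = vcat(randn(n, 2) .+ [ 5.0  5.0],    # class 1, centered at (5, 5)
         randn(n, 2) .+ [-5.0 -5.0])    # class 2, centered at (-5, -5)
y = vcat(fill(1, n), fill(2, n))
# After fitting, e.g. fit!(perceptron, X', y; n_epochs=100), one would expect
# @test mean(predict(perceptron, X') .== y) == 1.0
```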
More generally, you want two kinds of tests: correctness tests and performance benchmarks (more on both below).
I don't completely follow your point about hidden units.
Basically yes: compare to sklearn's Perceptron, which can also do multiclass using one-vs-all scoring. Definitely start with the binary case and make sure you get comparable performance in several scenarios, then move on to the multiclass one. Check that your timings are within a factor of 2-3 of the Python implementation; ideally they would be faster, of course, but let's say if it's 5x slower then we wouldn't really want to add it to the registry. So yeah, testing + benchmarks, if that makes sense.
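For the sklearn side, the comparison could be driven from Julia via ScikitLearn.jl, roughly as in this sketch (assumes a working PyCall setup; `X` holds observations as rows, and the 0.95 threshold is illustrative):

```julia
# Baseline from sklearn's Perceptron, via ScikitLearn.jl.
using ScikitLearn, Statistics
@sk_import linear_model: Perceptron

skl = Perceptron(max_iter=100)
ScikitLearn.fit!(skl, X, y)
ŷ_skl = ScikitLearn.predict(skl, X)
mean(ŷ_skl .== y) > 0.95   # a threshold the Julia implementation should match
```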
Thanks!
Since I haven't registered a package in Julia before, I am not sure I have done this properly. I can see I have generated a pull request here:
https://github.com/JuliaRegistries/General/pull/6370
Now I need to open the issue on MLJModels.
It looks good to me. Because this is a new package, it may take a few days: I believe there is a stand-down period of several days for new packages, to allow for comments. Future versions are usually auto-merged within half an hour, assuming they meet the auto-merge criteria.
@davidbp are you still interested in including this MLP in MLJ?
@azev77 it's been quite a while; I would like to finish the interface now.
Let me know if I can assist.
Now that MLJFlux is working, don’t we have access to MLPs via Flux? @ablaom
Yep.
I have updated the tests with an example of a linearly separable case with several classes to verify that the model can separate it perfectly (as expected).
The notebook `notebooks/learning_tests.ipynb` also shows graphically what we would expect from this model.
There is just one thing that I do not like; @ablaom, maybe you can suggest how to solve it.

I have a simple struct `MulticlassPerceptronCore` with `fit!` and `predict` methods. Since I import methods with the same name in the "Core version" and the "MLJ version", they conflict, and I have to write `MLJBase.predict` and `MulticlassPerceptron.predict`.
This can be seen in the test cases:
@testset "MulticlassPerceptronCore (tests model implementation)" begin
@testset "n=100, p=2, n_classes=2, sparse=false, f_average_weights=true" begin
Random.seed!(6161)
n, p, n_classes, sparse, f_average_weights = 100, 2, 2, false, true
X, y = MulticlassPerceptron.make_blobs(n; centers=n_classes, random_seed=4, return_centers=false)
X = copy(X')
perceptron_core = MulticlassPerceptronCore(Float64, n_classes, p, sparse)
fit!(perceptron_core, X, y; n_epochs=100, f_average_weights=true)
ŷ = MulticlassPerceptron.predict(perceptron_core, X)
@test length(ŷ)==length(y)
@test mean(ŷ .== y)>0.95
end
end
If I write the same code as before using simply `predict`, it does not work.
Yes, if you are importing both `predict`s into the same namespace, then you need to qualify them. I don't see that this is a big deal. In practice, if your model is registered in MLJ then, in a normal MLJ workflow, the user will never do `using MulticlassPerceptron` and hence never create the conflict of `MulticlassPerceptron.predict` with `MLJ.predict = MLJ.MLJModelInterface.predict`. In the standard workflow, only the type `MulticlassPerceptronClassifier` gets pulled into the user's namespace, for that is all they will need.

FYI, MLJ finds this type from the `load_path` trait (which looks good to me), and you can even drop `export MulticlassPerceptronClassifier` if you want.
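For illustration, the standard workflow looks roughly like this (the exact behaviour of `@load` varies across MLJ versions; in some it returns a model instance, in others the type):

```julia
# In a normal MLJ workflow the user never does `using MulticlassPerceptron`,
# so MLJ.predict never conflicts with MulticlassPerceptron.predict.
using MLJ

model = @load MulticlassPerceptronClassifier pkg=MulticlassPerceptron
mach  = machine(model, X, y)   # X a table, y a CategoricalVector
fit!(mach)
ŷ = predict(mach, X)           # unqualified: this is MLJ's predict
```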
If the conflict really bothers you, you can overload `MLJModelInterface.predict` in your core code instead of overloading `StatsBase.predict`, which is what you are doing currently. So change https://github.com/davidbp/MulticlassPerceptron.jl/blob/ab8f85c1d05ba7d656ea58edff15703bbd0cbe90/src/MulticlassPerceptronCore.jl#L7 to
```julia
import StatsBase: fit!
import MLJModelInterface.predict
```
The reason MLJModelInterface does not overload StatsBase.predict is that it aims to be an ultra-lightweight package, so we don't want StatsBase as a dependency.
By the way, you ought to be able to remove MLJBase as a dependency of your code (which I expect will speed up loading considerably) by replacing MLJBase references with MLJModelInterface references. You will still need MLJBase as a test dependency, however: https://alan-turing-institute.github.io/MLJ.jl/dev/quick_start_guide_to_adding_models/#Important-1
Does this help?
It helps a lot, thank you.
I just removed the explicit `MLJBase.predict` and `MulticlassPerceptron.predict` from the tests, and with your suggested change everything works as expected (all tests pass).
Nevertheless, I think I still need MLJBase in `interface.jl` for the following function:
```julia
function MLJModelInterface.predict(fitresult::Tuple{MulticlassPerceptronCore, MLJBase.CategoricalDecoder}, Xnew)
    # Function fit!(MulticlassPerceptronCore, X, y) expects size(X) = n_features x n_observations
    if Xnew isa AbstractArray
        Xnew = MLJBase.matrix(Xnew)
    elseif Tables.istable(Xnew)
        Xnew = MLJBase.matrix(Xnew, transpose=true)
    end
    result, decode = fitresult
    prediction = predict(result, Xnew)
    return decode(prediction)
end
```
This is because `CategoricalDecoder` does not seem to exist in MLJModelInterface.
That's odd. `MLJModelInterface.decoder` is defined. It is used in this XGBoost interface, which does not have MLJBase as a dependency: https://github.com/alan-turing-institute/MLJXGBoostInterface.jl/blob/2e31da281e74c9532d5befac3e75dbd97cf9f3b2/src/MLJXGBoostInterface.jl#L642 (`MMI` is an alias for `MLJModelInterface`).

Perhaps you can provide more detail about the error you are getting.
If I substitute all references to `MLJBase` in `interface.jl` with `MLJModelInterface`, I get

```
ERROR: LoadError: LoadError: UndefVarError: CategoricalDecoder not defined
```

since `MLJModelInterface.CategoricalDecoder` is not found. This breaks

```julia
function MLJModelInterface.predict(fitresult::Tuple{MulticlassPerceptronCore, MLJModelInterface.CategoricalDecoder}, Xnew)
```

since it cannot be compiled, as it depends on `MLJModelInterface.CategoricalDecoder`.
Note that

```julia
julia> typeof(MLJModelInterface.decoder(y[1]))
MLJBase.CategoricalDecoder{Int64,UInt32}
```

So it is true that `MLJModelInterface.decoder` exists, but `MLJModelInterface.CategoricalDecoder` does not.
Just remove the type annotation in your predict method (and anywhere else it is not needed). Generally, function argument type annotations should be as generic as possible, which sometimes means no annotation at all. Actually, I see this particular predict is just a private method, yes? I would rename it `_predict`. So, something like this:
```julia
function MLJModelInterface.predict(model::MulticlassPerceptronClassifier, fitresult, Xnew)
    if Tables.istable(Xnew)
        Xnew = MLJBase.matrix(Xnew, transpose=true)
    end
    result, decode = fitresult
    prediction = _predict(result, Xnew)   # <------------ changed name to "private" name
    return decode(prediction)
end

function _predict(result, Xnew)   # <----------- removed type annotation
    # Function fit!(MulticlassPerceptronCore, X, y) expects size(X) = n_features x n_observations
    if Xnew isa AbstractArray
        Xnew = MLJBase.matrix(Xnew)
    elseif Tables.istable(Xnew)
        Xnew = MLJBase.matrix(Xnew, transpose=true)
    end
    return predict(result, Xnew)   # the core model's predict; decoding happens in the caller
end
```
This corresponds to lines 115-138 at https://github.com/davidbp/MulticlassPerceptron.jl/blob/43803fbc4ab98068e2e47792228a107652123adf/src/mlj/interface.jl#L115
I tried what you suggested (changing lines 115 to 138 to the code you provided above), but the current tests for the interface fail:
```
Test Summary: | Pass Total
MulticlassPerceptronCore (tests model implementation) | 3 3
n=100, p=2, n_classes=2: Error During Test at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:41
Got exception outside of a @test
MethodError: no method matching predict(::Tuple{MulticlassPerceptronCore{Float32},MLJBase.CategoricalDecoder{Int64,UInt32}}, ::Tables.MatrixTable{Array{Float64,2}})
Closest candidates are:
predict(!Matched::Union{DeterministicNetwork, ProbabilisticNetwork}, ::Any, !Matched::Any) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/composition/composites.jl:45
predict(!Matched::AbstractMachine, ::Any...) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/operations.jl:27
predict(!Matched::MulticlassPerceptronCore, ::Any) at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/src/MulticlassPerceptronCore.jl:120
...
Stacktrace:
[1] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:51
[2] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[3] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:42
[4] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[5] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:41
[6] include(::String) at ./client.jl:457
[7] top-level scope at none:6
[8] eval(::Module, ::Any) at ./boot.jl:331
[9] exec_options(::Base.JLOptions) at ./client.jl:272
[10] _start() at ./client.jl:506
n=100, p=10, n_classes=3: Error During Test at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:55
Got exception outside of a @test
MethodError: no method matching predict(::Tuple{MulticlassPerceptronCore{Float32},MLJBase.CategoricalDecoder{Int64,UInt32}}, ::Tables.MatrixTable{Array{Float64,2}})
Closest candidates are:
predict(!Matched::Union{DeterministicNetwork, ProbabilisticNetwork}, ::Any, !Matched::Any) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/composition/composites.jl:45
predict(!Matched::AbstractMachine, ::Any...) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/operations.jl:27
predict(!Matched::MulticlassPerceptronCore, ::Any) at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/src/MulticlassPerceptronCore.jl:120
...
Stacktrace:
[1] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:65
[2] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[3] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:56
[4] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[5] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:41
[6] include(::String) at ./client.jl:457
[7] top-level scope at none:6
[8] eval(::Module, ::Any) at ./boot.jl:331
[9] exec_options(::Base.JLOptions) at ./client.jl:272
[10] _start() at ./client.jl:506
Test Summary: | Error Total
MulticlassPerceptronClassifier (tests MLJ interface) | 2 2
n=100, p=2, n_classes=2 | 1 1
n=100, p=10, n_classes=3 | 1 1
ERROR: LoadError: Some tests did not pass: 0 passed, 0 failed, 2 errored, 0 broken.
in expression starting at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:39
ERROR: Package MulticlassPerceptron errored during testing
```
Looks to me like you have a `predict` that should be `_predict`.
Okay, disregard that change I suggested and send me a link to the branch with MLJBase -> MMI replacements. I'll have a look on Monday.
The branch `MLModelinterface` has the mentioned change: https://github.com/davidbp/MulticlassPerceptron.jl/tree/MLModelinterface
I just tested the MLJ interface on the master branch and it seems to work. The notebook `notebook/perceptron_examples` might show you how to use it.
Okay, please see #4. The main issue was an old Tables [compat] entry.
Note that prior to registering the package in the General registry, you will need [compat] entries for every package not in the standard library.
Also, I guess you really do need Flux, yes? This is a very large dependency, so I thought I would just check.
> Also, I guess you really do need Flux, yes? This is a very large dependency, so I thought I would just check.
No, the package does not need Flux for anything; I just have it to facilitate testing with MNIST.
Packages that are used only for testing should be moved from [deps] in the Project.toml to [extras], with a corresponding entry under [targets]:
```toml
[deps]
ARealDependency = "oaihjgadjffglajdflj9438759hfih"
AnotherRealDependency = "oaihjgadjffglajdflj9438759hfih"

[extras]
ATestOnlyDep = "allaskdjflkasjfljo485u"
AnotherTestOnlyDep = "lafjlakkdjfjkadf"

[targets]
test = ["ATestOnlyDep", "AnotherTestOnlyDep"]
```
It's probably a good idea to minimise test-only dependencies, because they can lead to maintenance issues (package conflicts appearing when other packages are updated). If you are comparing a result to Flux, I suggest running Flux manually once and copying the results into your test code, or writing them to a file that you read from in the test.
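A sketch of the file-based approach (the file name, tolerance, and `model_accuracy` variable are hypothetical):

```julia
# Compare against a reference score computed offline (e.g. with Flux) and
# committed to the repository, instead of taking Flux as a test dependency.
using DelimitedFiles, Test

ref = readdlm(joinpath(@__DIR__, "flux_reference_accuracy.txt"))[1]
@test model_accuracy ≥ ref - 0.01   # model_accuracy computed earlier in the test
```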
However, as I don't see any Flux testing in runtests.jl, you could just dump it for now.
> However, as I don't see any Flux testing in runtests.jl, you could just dump it for now.
Yes, I do not use Flux for anything (besides using MNIST).
I attach here the results comparing sklearn with this implementation, since, as @tlienart commented, they are relevant for reference (if this implementation were slower, there would not be much benefit in integrating it into MLJ). The results are good: it is faster and achieves better accuracy, because it supports the "average perceptron" trick, which sklearn does not. This was tested with 100 epochs and infinite patience, to ensure both models actually run the same number of epochs (so that neither implementation stops early when the results stop improving).
```
# sklearn Perceptron, MNIST
train time:     75.68784284591675 seconds
train accuracy: 0.866
test accuracy:  0.8565

# MulticlassPerceptronCore, MNIST
train time:     17.094 seconds
train accuracy: 0.937
test accuracy:  0.925
```
Great! A more interesting comparison would be with Flux. You could use the MLJ wrapper for testing on the MNIST data set. There is already code you could modify here:

Just substitute `builder=MLJFlux.Linear(σ=...)` to get a perceptron.
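Roughly as in this sketch (constructor keywords may differ between MLJFlux versions):

```julia
# An MLJFlux model with a Linear builder (no hidden layer), i.e. effectively
# a perceptron trained by gradient descent.
using MLJ, MLJFlux, Flux

clf  = NeuralNetworkClassifier(builder=MLJFlux.Linear(σ=identity), epochs=100)
mach = machine(clf, X, y)      # X a table, y a CategoricalVector
fit!(mach)
ŷ = predict_mode(mach, X)
```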
In any case, it looks like you're ready to create a PR for the MLModelinterface branch onto master and register the package?
Code suggestions

- `num_features_and_observations` can just be `reverse(size(X))` and the other one `size(X)`; also, given how you use it, it's probably best to just use `size` itself, but YOLO
- in the `MulticlassPerceptronClassifier` object, use the element type as a parametric type and do not carry around the element type
- replace with `fit` before `predict`
- `fit` takes `y::CategoricalArray` (this is done upstream); "training"
- Why are there two implementations of `predict`?
Tests

- You currently don't have tests. I would suggest removing all the notebooks and converting them into tests; but even if you keep the notebooks, please add rigorous tests, as we will only add tested packages to the registry.
- In the tests, please test the interface (again, have a look at MLJLinearModels for examples); a sketch is given after this list.
- In the tests, please also consider benchmarking against sklearn's MLP implementation.
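For instance, an interface-level test could look roughly like this sketch (the hyperparameter name and accuracy threshold are assumptions):

```julia
# Interface-level test through MLJBase's machine API.
using MLJBase, Statistics, Test

X, y = MLJBase.make_blobs(100, 2; centers=2)         # synthetic separable data
model = MulticlassPerceptronClassifier(n_epochs=100)
mach  = machine(model, X, y)
fit!(mach)
ŷ = predict(mach, X)
@test mean(ŷ .== y) > 0.95
```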
Next step
Once all that is done, the steps are: