
MLJ interface #1


tlienart commented 4 years ago

Code suggestions

    if Xnew isa AbstractArray
        Xnew  = MLJBase.matrix(Xnew)
    elseif Tables.istable(Xnew) 
        Xnew  = MLJBase.matrix(Xnew, transpose=true)
    end

replace with

if Tables.istable(Xnew)
  Xnew = ...
end
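
For completeness, the filled-in version would presumably keep just the table branch, so that arrays pass through untouched (a sketch, reusing the existing table-conversion call):

if Tables.istable(Xnew)
    # convert any table to a matrix; transpose so the layout matches what
    # the core fit!/predict expect (features × observations)
    Xnew = MLJBase.matrix(Xnew, transpose=true)
end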

Why are there two implementations of predict?

Tests

You currently don't have tests. I would suggest removing all the notebooks and converting them into tests; but even if you keep the notebooks, please add rigorous tests: we will only add tested packages to the registry.

In the tests, please test the interface (again, have a look at MLJLinearModels for examples).

In the tests, please also consider benchmarking against sklearn's MLP implementation.

Next step

Once all that is done, the steps are:

davidbp commented 4 years ago

Thank you for your notes.

I have some questions about testing the implementation of the learning algorithm.

In the case of regression models you can use a closed-form solution such as θ = (X1'X1 + λ*I)\(X1'y). As far as I know, no such solution exists for a multiclass perceptron.

A possible solution could be to generate a set of learning problems that are linearly separable by construction, and test that the model achieves perfect accuracy on them (a sketch is given below). What do you think of this approach? I actually opened an issue on MLJ asking for a piece of code to generate such datasets. I haven't implemented it yet, but I would like to have something along those lines.
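
For illustration, a sketch of the kind of test I have in mind, reusing the constructor and fit!/predict signatures from this package's core (the data generation itself is ad hoc):

using Random, Statistics, Test

# build a 2-class problem that is linearly separable by construction:
# two Gaussian clouds shifted far apart in every feature
Random.seed!(42)
n, p = 100, 2
X = hcat(randn(p, n÷2) .- 10, randn(p, n÷2) .+ 10)   # features × observations
y = vcat(fill(1, n÷2), fill(2, n÷2))

perceptron = MulticlassPerceptronCore(Float64, 2, p, false)
fit!(perceptron, X, y; n_epochs=100, f_average_weights=true)

# linearly separable data ⇒ the perceptron should reach perfect accuracy
@test mean(MulticlassPerceptron.predict(perceptron, X) .== y) == 1.0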

With respect to testing against sklearn's MLP: I guess you mean sklearn's Perceptron implementation (not the MultilayerPerceptron), since the implementation in this repo has no hidden units. It is not an apples-to-apples comparison, but it could give some confidence.

I have seen that in the logistic case you use @test mcr ≤ 0.2. Maybe I could train sklearn's perceptron on the same dataset and derive a threshold from it?

tlienart commented 4 years ago

A possible solution could be to generate a set of learning problems that are linearly separable by construction, and test that the model achieves perfect accuracy on them. What do you think of this approach? I actually opened an issue on MLJ asking for a piece of code to generate such datasets, but I did not code it.

Yes, a first sanity check is to generate well-separated points in a low-dimensional space and make sure they are predicted to be in the right class with high accuracy.

More generally, you want two kinds of tests:

  1. that the algorithm does what it's supposed to do; there is stochasticity, but you should get a good idea, and comparing with sklearn is an excellent way to obtain comparable benchmarks;
  2. that the stuff you expose to the user "just works" (constructors, keywords, ...); this is unrelated to ML and is purely deterministic. A sketch of this kind of test follows the list.
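
For the second kind, something as simple as this already goes a long way (assuming the classifier exposes its hyperparameters as fields; the n_epochs keyword is an assumption here, so adjust to the actual field names):

using Test

@testset "interface basics" begin
    # default constructor works
    model = MulticlassPerceptronClassifier()
    # keyword arguments are stored as fields (n_epochs is assumed)
    model = MulticlassPerceptronClassifier(n_epochs=50)
    @test model.n_epochs == 50
end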

I don't completely follow your point about hidden units.

Basically, yes: compare to sklearn's Perceptron, which can also do multiclass using one-vs-all type scoring. Definitely start with the binary case first and make sure you get comparable performance in several scenarios, then move on to the multiclass one. Check that your performance is within a factor of 2-3 of the Python implementation; ideally it would be faster, of course, but if it were, say, 5x slower, we wouldn't really want to add it to the registry. So yes: testing + benchmarks, if that makes sense; a crude timing sketch is below.
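
Something like this, for instance (the sklearn time would be measured once offline; the constant below is a placeholder, and perceptron, X, y stand for a core model and training data, as in the earlier sketch):

using Test

# wall-clock time of sklearn's Perceptron fit on the same data,
# measured once offline and hard-coded here (placeholder value)
const SKLEARN_FIT_SECONDS = 10.0

t = @elapsed fit!(perceptron, X, y; n_epochs=100)
@test t ≤ 3 * SKLEARN_FIT_SECONDS   # within the suggested factor of 2-3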

Thanks!

davidbp commented 4 years ago

Since I have never registered a Julia package before, I am not sure I have done this properly. I can see I have generated a pull request here:

https://github.com/JuliaRegistries/General/pull/6370

Now I need to open the issue on MLJModels.

ablaom commented 4 years ago

It looks good to me. Because this is a new package, it may take a few days: I think there is a stand-down period for new packages to allow for comment. Future versions are usually auto-merged within half an hour, assuming they meet the auto-merge criterion.

azev77 commented 4 years ago

@davidbp are you still interested in including this MLP in MLJ?

davidbp commented 3 years ago

@azev77 it's been quite a while; I would like to finish the interface now.

ablaom commented 3 years ago

Let me know if I can assist.

ablaom commented 3 years ago

https://alan-turing-institute.github.io/MLJ.jl/dev/quick_start_guide_to_adding_models/

azev77 commented 3 years ago

Now that MLJFlux is working, don’t we have access to MLPs via Flux? @ablaom

ablaom commented 3 years ago

Yep.

davidbp commented 3 years ago

I have updated the tests with a linearly separable example with several classes, to verify that the model can separate it perfectly (as expected). The notebook notebooks/learning_tests.ipynb also shows graphically what we would expect from this model.

There is one thing I do not like; @ablaom, maybe you can suggest how to solve it. I have a simple struct MulticlassPerceptronCore with fit and predict methods. Since I import methods with the same name for the "Core version" and the "MLJ version", they conflict, and I have to write MLJBase.predict and MulticlassPerceptron.predict explicitly.

This can be seen in the test cases:

@testset "MulticlassPerceptronCore (tests model implementation)" begin
    @testset "n=100, p=2, n_classes=2, sparse=false, f_average_weights=true" begin
        Random.seed!(6161)
        n, p, n_classes, sparse, f_average_weights = 100, 2, 2, false, true
        X, y = MulticlassPerceptron.make_blobs(n; centers=n_classes, random_seed=4, return_centers=false)
        X = copy(X')

        perceptron_core = MulticlassPerceptronCore(Float64, n_classes, p, sparse)
        fit!(perceptron_core, X, y; n_epochs=100, f_average_weights=true)
        ŷ = MulticlassPerceptron.predict(perceptron_core, X)
        @test length(ŷ)==length(y)
        @test mean(ŷ .== y)>0.95
    end
end

If I write the same code with an unqualified predict, it does not work.

ablaom commented 3 years ago

Yes, if you are importing both predicts into the same namespace, then you need to qualify them. I don't see that this is a big deal. In practice, if your model is registered with MLJ then, in a normal MLJ workflow, the user will never do using MulticlassPerceptron, and hence never creates the conflict of MulticlassPerceptron.predict with MLJ.predict = MLJ.MLJModelInterface.predict. In the standard workflow, only the type MulticlassPerceptronClassifier gets pulled into the user's namespace, for that is all they need. A sketch of that workflow is given below.
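
Something like this (the @load behaviour varies slightly between MLJ versions; here it is assumed to return a model instance):

using MLJ   # no `using MulticlassPerceptron`, so no predict conflict

# only the registered model type is pulled in
model = @load MulticlassPerceptronClassifier pkg=MulticlassPerceptron

mach = machine(model, X, y)   # X, y: some training table and label vector
fit!(mach)
ŷ = predict(mach, X)          # MLJ's predict, no qualification needed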

FYI, MLJ finds this type from the load_path trait (which looks good to me), and you can even drop export MulticlassPerceptronClassifier if you want.

If the conflict really bothers you, you can overload MLJModelInterface.predict in your core code instead of overloading StatsBase.predict, which is what you are doing currently. So change https://github.com/davidbp/MulticlassPerceptron.jl/blob/ab8f85c1d05ba7d656ea58edff15703bbd0cbe90/src/MulticlassPerceptronCore.jl#L7 to

import StatsBase: fit!
import MLJModelInterface: predict

The reason MLJModelInterface does not overload StatsBase.predict is that it aims to be an ultra-lightweight package, so we don't want StatsBase as a dependency.

By the way, you ought to be able to remove MLJBase as a dependency of your code (which I expect will speed up loading considerably) by replacing MLJBase references with MLJModelInterface references. You will still need MLJBase as a test dependency, however: https://alan-turing-institute.github.io/MLJ.jl/dev/quick_start_guide_to_adding_models/#Important-1
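
The replacements are mostly mechanical. For example (MMI is just a conventional alias):

import MLJModelInterface
const MMI = MLJModelInterface

# before: Xnew = MLJBase.matrix(Xnew, transpose=true)
Xnew = MMI.matrix(Xnew, transpose=true)   # same call, lighter dependency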

Does this help?

davidbp commented 3 years ago

It helps a lot, thank you.

I just removed the explicit MLJBase.predict and MulticlassPerceptron.predict qualifications from the tests, and with your suggested change everything works as expected (all tests pass).

Nevertheless, I think I still need MLJBase in interface.jl for the following function:

function MLJModelInterface.predict(fitresult::Tuple{MulticlassPerceptronCore, MLJBase.CategoricalDecoder}, Xnew)

    # Function fit!(MulticlassPerceptronCore, X, y) expects size(X) = n_features x n_observations
    if Xnew isa AbstractArray
        Xnew  = MLJBase.matrix(Xnew)
    elseif Tables.istable(Xnew)
        Xnew  = MLJBase.matrix(Xnew, transpose=true)
    end

    result, decode = fitresult
    prediction     = predict(result, Xnew)
    return decode(prediction)
end

This is because CategoricalDecoder does not seem to exist in MLJModelInterface.

ablaom commented 3 years ago

That's odd. MLJModelInterface.decoder is defined, and it is used in this XGBoost interface, which does not have MLJBase as a dependency: https://github.com/alan-turing-institute/MLJXGBoostInterface.jl/blob/2e31da281e74c9532d5befac3e75dbd97cf9f3b2/src/MLJXGBoostInterface.jl#L642 (MMI is an alias for MLJModelInterface).

Perhaps you can provide more detail about the error you are getting.

davidbp commented 3 years ago

If I substitute all MLJBase references in interface.jl with MLJModelInterface, I get

ERROR: LoadError: LoadError: UndefVarError: CategoricalDecoder not defined

since MLJModelInterface.CategoricalDecoder is not found. This breaks

function MLJModelInterface.predict(fitresult::Tuple{MulticlassPerceptronCore, MLJModelInterface.CategoricalDecoder}, Xnew)

since it cannot be compiled because it depends on MLJModelInterface.CategoricalDecoder.

Note that

julia> typeof( MLJModelInterface.decoder(y[1]))
MLJBase.CategoricalDecoder{Int64,UInt32}

So it is true that MLJModelInterface.decoder exists, but MLJModelInterface.CategoricalDecoder does not exist.

ablaom commented 3 years ago

Just remove the type annotation in your predict method (and anywhere else it is not needed). Generally, function argument type annotations should be as generic as possible, which sometimes means no annotation at all. Actually, I see this particular predict is just a private method, yes? I would rename it _predict. So, something like this:

function MLJModelInterface.predict(model::MulticlassPerceptronClassifier, fitresult, Xnew)

    if Tables.istable(Xnew)
        Xnew = MLJBase.matrix(Xnew, transpose=true)
    end

    result, decode = fitresult
    prediction = _predict(result, Xnew)        # <------------ changed name to "private" name
    return decode(prediction)
end

function _predict(result, Xnew)                # <----------- removed type annotation

    # fit!(MulticlassPerceptronCore, X, y) expects size(X) = n_features x n_observations,
    # so convert table input accordingly; decoding is handled by the caller above
    if Tables.istable(Xnew)
        Xnew = MLJBase.matrix(Xnew, transpose=true)
    end

    return predict(result, Xnew)
end

This corresponds to lines 115-138 at https://github.com/davidbp/MulticlassPerceptron.jl/blob/43803fbc4ab98068e2e47792228a107652123adf/src/mlj/interface.jl#L115

davidbp commented 3 years ago

I tried what you suggested (replacing lines 115 to 138 with the code you provided above), but the current tests for the interface fail:

Test Summary:                                         | Pass  Total
MulticlassPerceptronCore (tests model implementation) |    3      3
n=100, p=2, n_classes=2: Error During Test at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:41
  Got exception outside of a @test
  MethodError: no method matching predict(::Tuple{MulticlassPerceptronCore{Float32},MLJBase.CategoricalDecoder{Int64,UInt32}}, ::Tables.MatrixTable{Array{Float64,2}})
  Closest candidates are:
    predict(!Matched::Union{DeterministicNetwork, ProbabilisticNetwork}, ::Any, !Matched::Any) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/composition/composites.jl:45
    predict(!Matched::AbstractMachine, ::Any...) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/operations.jl:27
    predict(!Matched::MulticlassPerceptronCore, ::Any) at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/src/MulticlassPerceptronCore.jl:120
    ...
  Stacktrace:
   [1] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:51
   [2] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [3] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:42
   [4] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [5] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:41
   [6] include(::String) at ./client.jl:457
   [7] top-level scope at none:6
   [8] eval(::Module, ::Any) at ./boot.jl:331
   [9] exec_options(::Base.JLOptions) at ./client.jl:272
   [10] _start() at ./client.jl:506

n=100, p=10, n_classes=3: Error During Test at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:55
  Got exception outside of a @test
  MethodError: no method matching predict(::Tuple{MulticlassPerceptronCore{Float32},MLJBase.CategoricalDecoder{Int64,UInt32}}, ::Tables.MatrixTable{Array{Float64,2}})
  Closest candidates are:
    predict(!Matched::Union{DeterministicNetwork, ProbabilisticNetwork}, ::Any, !Matched::Any) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/composition/composites.jl:45
    predict(!Matched::AbstractMachine, ::Any...) at /Users/davidbuchaca1/.julia/packages/MLJBase/FFnHt/src/operations.jl:27
    predict(!Matched::MulticlassPerceptronCore, ::Any) at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/src/MulticlassPerceptronCore.jl:120
    ...
  Stacktrace:
   [1] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:65
   [2] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [3] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:56
   [4] top-level scope at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [5] top-level scope at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:41
   [6] include(::String) at ./client.jl:457
   [7] top-level scope at none:6
   [8] eval(::Module, ::Any) at ./boot.jl:331
   [9] exec_options(::Base.JLOptions) at ./client.jl:272
   [10] _start() at ./client.jl:506

Test Summary:                                        | Error  Total
MulticlassPerceptronClassifier (tests MLJ interface) |     2      2
  n=100, p=2, n_classes=2                            |     1      1
  n=100, p=10, n_classes=3                           |     1      1
ERROR: LoadError: Some tests did not pass: 0 passed, 0 failed, 2 errored, 0 broken.
in expression starting at /Users/davidbuchaca1/Documents/git_stuff/MulticlassPerceptron.jl/test/runtests.jl:39
ERROR: Package MulticlassPerceptron errored during testing

ablaom commented 3 years ago

Looks to me like you have a predict that should be _predict.

Okay, disregard the change I suggested and send me a link to the branch with the MLJBase -> MMI replacements. I'll have a look on Monday.

davidbp commented 3 years ago

The branch MLModelinterface has the mentioned change:

https://github.com/davidbp/MulticlassPerceptron.jl/tree/MLModelinterface

I just tested the MLJ interface on the master branch and it seems to work. The notebook notebook/perceptron_examples might show you how to use it.

ablaom commented 3 years ago

Okay, please see #4. The main issue was an old Tables [compat] entry.

Note that prior to registration in the General registry, you will need [compat] entries for every package not in the standard library. A sketch is below.
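
Something of this shape (the bounds are placeholders, not recommendations):

[compat]
# NOTE: placeholder bounds; use the versions you actually support
MLJModelInterface = "0.3"
Tables = "1"
julia = "1"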

ablaom commented 3 years ago

Also, I guess you really do need Flux, yes? This is a very large dependency, so I thought I would just check.

davidbp commented 3 years ago

Also, I guess you really do need Flux, yes? This is a very large dependency, so I thought I would just check.

No, the package does not need Flux for anything; I just have it to facilitate testing with MNIST.

ablaom commented 3 years ago

Packages that are for testing only should be moved from [deps] in the Project.toml to [extras], with a corresponding entry under [targets]:

[deps]
ARealDependency = "oaihjgadjffglajdflj9438759hfih"
AnotherRealDependency = "oaihjgadjffglajdflj9438759hfih"

[extras]
ATestOnlyDep = "allaskdjflkasjfljo485u"
AnotherTestOnlyDep = "lafjlakkdjfjkadf"

[targets]
test = ["ATestOnlyDep", "AnotherTestOnlyDep"]

It's probably a good idea to minimise test-only dependencies, because they can lead to maintenance issues (package conflicts appearing when other packages are updated). If you are comparing a result to Flux, I suggest running Flux manually once and copying the results into your test code (as sketched below), or writing them to a file you read from in the test.
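
A sketch of the hard-coding approach (the reference number is a placeholder for a value you would record from a one-off Flux run; perceptron_core, X_test, y_test stand for a fitted model and held-out data):

using Test, Statistics

# accuracy from a one-off manual Flux run on the same data, recorded here
# so that Flux itself is not a test dependency (placeholder value)
const FLUX_REFERENCE_ACCURACY = 0.92

ŷ = predict(perceptron_core, X_test)
@test mean(ŷ .== y_test) ≥ FLUX_REFERENCE_ACCURACY - 0.05   # small tolerance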

ablaom commented 3 years ago

However, as I don't see any Flux testing in runtests.jl, you could just dump it for now.

davidbp commented 3 years ago

However, as I don't see any Flux testing in runtests.jl, you could just dump it for now.

Yes, I do not use Flux for anything (besides using MNIST).

davidbp commented 3 years ago

I attach here the results comparing sklearn with this implementation, since @tlienart commented it would be relevant for reference (if this implementation were slower, there would not be much benefit in integrating it into MLJ). The results are good: this implementation is faster and achieves better accuracy, because it supports the "averaged perceptron" trick, which sklearn does not. This was tested with 100 epochs and infinite patience, to ensure both models perform the same number of epochs (so that neither implementation stops early when results are not improving).

# sklearn Perceptron, MNIST
train time      75.688 seconds
train accuracy  0.866
test accuracy   0.8565

# MulticlassPerceptronCore, MNIST
train time      17.094 seconds
train accuracy  0.937
test accuracy   0.925

ablaom commented 3 years ago

Great! A more interesting comparison would be with Flux. You could use the MLJ wrapper for testing on the MNIST data set. There is already code you could modify here:

https://github.com/alan-turing-institute/MLJFlux.jl/blob/6d6f5ee33df565fe6320ba3be9f8905f81df0aac/test/image.jl#L97

Just substitute builder=MLJFlux.Linear(σ=...) to get a perceptron.
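
For example, a sketch of that substitution (assuming MLJFlux's NeuralNetworkClassifier; names and keywords may differ between MLJFlux versions):

using MLJ, MLJFlux, Flux

# a perceptron-like model: a single linear layer, no hidden units
clf = NeuralNetworkClassifier(builder=MLJFlux.Linear(σ=identity), epochs=100)
mach = machine(clf, X, y)   # X, y: training table and categorical label vector
fit!(mach)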

In any case, it looks like you're ready to create a PR to merge the MLModelinterface branch onto master and register the package?