Structure of JuliaML

tbreloff commented 7 years ago

I was thinking about the structure of JuliaML today. I don't foresee that I'll have enough bandwidth to properly complete the "tom branch on Transformations", and I'd really like to fill in the missing pieces of JuliaML. Losses seems pretty much done to me... nice work guys, and thanks @Evizero for getting LearnBase registered.

Transformations

I think Transformations could revert to being a relatively small package which defines the forward (value) and backward (deriv) methods/types for some common ML functions: logit, softsign, relu, etc, as well as some more complex transformations: affine, convolutions, pooling, ANN layers/nets, etc.
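As a rough illustration of the forward/backward split described above, here is a minimal sketch in Julia 1.x syntax. The type names and the `value`/`deriv` signatures are assumptions made for this example, not the actual Transformations.jl API.

```julia
# Hypothetical types and method names, for illustration only.
abstract type Activation end
struct ReLU <: Activation end
struct SoftSign <: Activation end

# Forward pass: the value of the transformation at x.
value(::ReLU, x::Real) = max(zero(x), x)
value(::SoftSign, x::Real) = x / (one(x) + abs(x))

# Backward pass: the derivative with respect to the input x.
# (ReLU is not differentiable at 0; this sketch picks 0 there.)
deriv(::ReLU, x::Real) = x > 0 ? one(x) : zero(x)
deriv(::SoftSign, x::Real) = one(x) / (one(x) + abs(x))^2

xs = [-2.0, 0.5, 3.0]
# Ref(...) keeps the activation object fixed while broadcasting over xs.
relu_values = value.(Ref(ReLU()), xs)
softsign_slopes = deriv.(Ref(SoftSign()), xs)
```

More complex transformations (affine maps, layers) could presumably follow the same two-method pattern with array arguments.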

Penalties

I'm still unconvinced that this should be separate from Losses... but we can always combine later. This houses "losses on parameters". Does it include anything else?
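To make "losses on parameters" concrete, a penalty can be sketched as a small type with a value and a (sub)gradient in the parameter vector. The struct names, field `λ`, and the functions `value`/`grad` below are assumptions for illustration and are not taken from Penalties.jl.

```julia
# Hypothetical penalty types; λ is the regularization strength.
struct L1Penalty
    λ::Float64
end
struct L2Penalty
    λ::Float64
end

# Value of the penalty at the parameter vector θ.
value(p::L1Penalty, θ::AbstractVector) = p.λ * sum(abs, θ)
value(p::L2Penalty, θ::AbstractVector) = p.λ * sum(abs2, θ) / 2

# Gradient for L2; for L1, sign.(θ) is one valid subgradient (0 where θ[i] == 0).
grad(p::L1Penalty, θ::AbstractVector) = p.λ .* sign.(θ)
grad(p::L2Penalty, θ::AbstractVector) = p.λ .* θ

θ = [1.0, -2.0, 0.0]
l1_value = value(L1Penalty(0.1), θ)
l1_subgrad = grad(L1Penalty(0.1), θ)
```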

ObjectiveFunctions

I think this could reexport Losses, Penalties, and Transformations, and provide some conveniences for dealing with common functions of all of them. An empirical risk minimization convenience would live here, for example.
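One way to picture the empirical risk minimization convenience mentioned above is a function that averages a loss over the data and adds a penalty on the parameters. This is only a sketch: `l2dist`, `ridge`, `predict`, and `risk` are hypothetical names, and the real package may compose these pieces quite differently.

```julia
using Statistics: mean

# Hypothetical building blocks, standing in for Losses, Penalties, Transformations.
l2dist(ŷ, y) = (ŷ - y)^2 / 2            # per-observation loss
ridge(θ, λ) = λ * sum(abs2, θ) / 2      # penalty on the parameters
predict(θ, X) = X * θ                   # linear predictor

# Regularized empirical risk: mean loss over observations plus penalty.
function risk(θ, X, y; λ = 0.1)
    ŷ = predict(θ, X)
    return mean(l2dist.(ŷ, y)) + ridge(θ, λ)
end

X = [1.0 0.0; 0.0 1.0; 1.0 1.0]
y = [1.0, 2.0, 3.0]
r = risk([1.0, 2.0], X, y)   # the fit is exact here, so only the penalty term remains
```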

I also think it would be valuable to add a dependence on one of the automatic differentiation libraries and make it easy to build a differentiable graph of the components from these packages.

StochasticOptimization

I feel like this can be our playground for "what we wish for Optim". Maybe eventually the efforts can be unified properly. I want to have conveniences for iterating, helpers for gradient calcs (Adam, etc.), helpers for hyperparameter tuning/adaptation, etc. I'd like to use the IterationManagers interface, and expand on some of the discussions there around composable managers/states.
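For reference, the kind of helper meant by "Adam, etc." could look roughly like the following: a small state object plus an in-place update. The names `AdamState` and `adam_step!` are invented for this sketch; the update itself is the standard Adam rule with bias correction.

```julia
# Hypothetical state container: running first and second moment estimates.
mutable struct AdamState
    m::Vector{Float64}
    v::Vector{Float64}
    t::Int
end
AdamState(n::Integer) = AdamState(zeros(n), zeros(n), 0)

# One Adam update of θ in place, given the current gradient g.
function adam_step!(θ, g, s::AdamState; α = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    s.t += 1
    s.m .= β1 .* s.m .+ (1 - β1) .* g
    s.v .= β2 .* s.v .+ (1 - β2) .* g .^ 2
    mhat = s.m ./ (1 - β1^s.t)          # bias-corrected first moment
    vhat = s.v ./ (1 - β2^s.t)          # bias-corrected second moment
    θ .-= α .* mhat ./ (sqrt.(vhat) .+ ϵ)
    return θ
end

# Toy problem: minimize sum((θ .- 3).^2) / 2, whose gradient is θ .- 3.
θ = zeros(2)
state = AdamState(2)
for _ in 1:5000
    adam_step!(θ, θ .- 3, state; α = 0.01)
end
```

Keeping the state separate from the parameters is one possible design; it would let the same iteration loop swap in a different update rule.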

Learn

The meta-package. This should install and set up the packages listed above, and probably a few more that are common: Plots/PlotRecipes, ValueHistories, MLMetrics, etc. One should be able to:

Pkg.add("Learn")
using Learn

and the entire JuliaML ecosystem is ready.

ahwillia commented 7 years ago

I think the highest priority is to figure out how to merge Losses and Penalties. The only way I see to do this is to actually try building some (simple) models people would like to fit.

Fitting some basic models

I think GLMs are a great place to start. We could focus on, for example, sparse logistic regression. In the short term, we could start with a basic API:

X, y = load_my_data()
using Learn
predictor = Affine(y, X)
loss = LogitMarginLoss()
penalty = L1Penalty(0.1)
model = learn(predictor, loss, penalty)

Down the line (maybe?)...

X, y = load_my_data()
using Learn
model = @learn logit(y) ~ B*X + 0.1*L1Penalty(X)

Here is the basic functionality I think we should tackle/play with:

[ ] GLMs (with penalties on the regression coefficients)
[ ] PCA and Non-negative matrix factorization
[ ] SVMs (to @Evizero's liking)
[ ] Softmax regression

Edit: here is some rough code for fitting LASSO with code that is similar to what we have now in ObjectiveFunctions: https://github.com/ahwillia/ProxAlgs.jl/blob/master/examples/lasso.jl

Optimization

I think we should wait and see what @pkofod does with the Optim interface. I'm pretty excited about combining the new update! interface with Tim Holy's work on reshaped array views. See a self-contained example here: https://github.com/ahwillia/CatViews.jl/blob/master/examples/pca.jl (I actually have a slightly better way to do this now, but the code there is still good in spirit.)
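The idea behind reshaped array views can be sketched with Base Julia alone: the optimizer sees one flat parameter vector, while the model sees matrices and vectors that share its memory. This sketch does not use CatViews.jl, and the variable names are illustrative.

```julia
# One flat parameter vector, as an optimizer would want it.
θ = zeros(10)

# Model-side views into the same memory: a 2×3 weight matrix and a bias vector.
W = reshape(view(θ, 1:6), 2, 3)
b = view(θ, 7:8)

# Writes through the views land in θ ...
W .= 1.0
b[1] = 5.0
snapshot = copy(θ)

# ... and an in-place update of θ is immediately visible through the views.
θ .-= 0.5
```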

But we will still need to add some tools in StochasticOptimization or elsewhere:

[ ] Adam, RMSProp, etc. (stochastic gradient descent might just work with a light wrapper to Optim?)
[ ] Proximal and projected gradient descent
[ ] ADMM
[ ] Alternating gradient descent for NMF, PCA, etc.

Many of these I think will just involve small changes to the planned functionality in Optim. For example, proximal gradient descent:

m = GradientDescent()
s = initialize_state(m, options)
while !converged
    update!(s, ... )
    prox!(s)  # this is all I added!
    ...
end
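Since the planned Optim interface above is only outlined, here is a self-contained sketch of the same pattern (a gradient step followed by a prox step) applied to LASSO. It does not use Optim; `soft_threshold` and `lasso_proxgrad` are hypothetical names, and the fixed step size 1/L is a simplifying assumption.

```julia
using LinearAlgebra

# Soft-thresholding: the proximal operator of t * ‖·‖₁, applied elementwise.
soft_threshold(x, t) = sign(x) * max(abs(x) - t, zero(x))

# Minimize (1/2n)‖Xβ − y‖² + λ‖β‖₁ by proximal gradient descent.
function lasso_proxgrad(X, y, λ; iters = 500)
    n, p = size(X)
    β = zeros(p)
    step = n / opnorm(X)^2                # 1/L, L = Lipschitz constant of the smooth gradient
    for _ in 1:iters
        g = X' * (X * β - y) / n          # gradient of the smooth least-squares part
        β = soft_threshold.(β - step * g, step * λ)   # gradient step, then prox step
    end
    return β
end

X = [1.0 0.0; 0.0 1.0; 1.0 0.0; 0.0 1.0]
y = [2.0, 0.0, 2.0, 0.0]
β = lasso_proxgrad(X, y, 0.1)   # the second coefficient is shrunk exactly to zero
```

The prox step is the only difference from plain gradient descent, which matches the "this is all I added" remark in the pseudocode.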

Evizero commented 7 years ago

I'm knuckles deep in a different project right now, so let me give a short generic reply that doesn't do your posts any justice. Still, I want to bring up a few points in the meantime.

I don't think we should merge the Losses.jl and Penalties.jl packages without good reason. They can stand alone. There is no upside I can think of. If there are strong opinions on this matter, I will take whatever side @joshday chooses, as he authored Penalties.jl and knows best whether it would benefit from a merge.
I am no statistician, but I don't think GLM is the right term to use here, is it? The GLM framework seems quite different to me.
I fully agree that we should at least create a simple linear and affine prediction model and some glue code (aka StructuralRisk) to treat loss, penalty, and predictor as a unit. This seems like the logical next step after Penalties.jl and Losses.jl are stable.
I have little opinion on Optimization yet, as I can't quite picture the end result of Optim and the end result of our "Risks" solution.
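To illustrate the glue-code idea of treating loss, penalty, and predictor as a unit, one possible shape is a struct holding the three parts that can be called as an objective. Only the name StructuralRisk comes from the comment above; the fields, the callable method, and the helper functions are assumptions for this sketch.

```julia
using Statistics: mean

# Bundles predictor, loss, and penalty into one callable objective.
struct StructuralRisk{P,L,R}
    predictor::P
    loss::L
    penalty::R
end

# Hypothetical components, written as plain functions.
# The last entry of θ is the intercept.
affine(θ, X) = X * θ[1:end-1] .+ θ[end]
logitmargin(ŷ, y) = log1p(exp(-y * ŷ))            # labels y ∈ {-1, +1}
l1(θ; λ = 0.1) = λ * sum(abs, θ[1:end-1])         # intercept left unpenalized

# Evaluate the regularized empirical risk at parameters θ.
(r::StructuralRisk)(θ, X, y) = mean(r.loss.(r.predictor(θ, X), y)) + r.penalty(θ)

risk = StructuralRisk(affine, logitmargin, l1)
X = [1.0 2.0; -1.0 0.5]
y = [1.0, -1.0]
risk_at_zero = risk(zeros(3), X, y)   # every margin is 0 at θ = 0
```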

ahwillia commented 7 years ago

> I don't think we should merge the Losses.jl and Penalties.jl packages without good reason. They can stand alone.

It's hard for me to see how Penalties could stand alone. Losses I could perhaps see. Maybe we could consider something like the following down the line?

module ObjectiveFunctions
    module Losses
    ...
    end
    module Penalties
    ....
    end
end
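A runnable version of this nested layout might look like the following in Julia 1.x. The function names inside the submodules are placeholders, and the manual re-export is just one way to surface them from the parent module.

```julia
module ObjectiveFunctions

module Losses
export l2dist
l2dist(ŷ, y) = (ŷ - y)^2 / 2             # placeholder loss
end

module Penalties
export l1penalty
l1penalty(θ, λ) = λ * sum(abs, θ)        # placeholder penalty
end

# Bring the submodule exports into the parent and re-export them by hand.
using .Losses, .Penalties
export l2dist, l1penalty

end # module ObjectiveFunctions

using .ObjectiveFunctions
total = l2dist(3.0, 1.0) + l1penalty([1.0, -1.0], 0.5)
```

Users who only want one half could still write `using .ObjectiveFunctions.Losses`, which is roughly what separate packages would offer.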

I don't think this is high priority though - we can create some "glue code" as you say to start playing with things.

> I am no statistician, but I don't think GLM is the right term to use here

Maybe @joshday can correct me, but I believe you can do e.g. logistic regression with the loss functions we have already. Regardless - I'm on board with your plan of having a "simple linear and affine prediction model" with different losses and penalties.

We need a sandbox for all of this. Should we use ObjectiveFunctions or create something new? Maybe Predictors.jl?

tbreloff commented 7 years ago

I think ObjectiveFunctions and StochasticOptimization should be the sandboxes.


joshday commented 7 years ago

@tbreloff Thanks for writing this up. This structure sounds great to me.

> This houses "losses on parameters". Does it include anything else?

@ahwillia has constraints in his branch of ObjectiveFunctions that I'd like to add to Penalties.

Fitting GLMs is possible with what we have. Some of the losses are negative log-likelihoods.
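As a worked example of a loss that is a negative log-likelihood, take logistic regression with labels coded as $y \in \{-1, +1\}$ and a linear predictor $\hat{y} = w^\top x + b$. The label coding and the symbols here are assumptions chosen for the illustration.

```latex
\begin{align*}
P(y \mid x) &= \sigma(y\,\hat{y}), \qquad \sigma(t) = \frac{1}{1 + e^{-t}}, \\
-\log P(y \mid x) &= -\log \frac{1}{1 + e^{-y\hat{y}}} = \log\!\left(1 + e^{-y\hat{y}}\right).
\end{align*}
```

The right-hand side is the logistic margin loss evaluated at the margin $y\hat{y}$, so minimizing its average over the data is maximum-likelihood logistic regression.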

> It's hard for me to see how Penalties could stand alone

I think it would be rare to see someone using Penalties without Losses, but conceivably someone could do something like import GLM and Penalties to make a glmnet-type package. I'd like to keep Penalties separate for now, but I'm open to combining them later.

JuliaML / META

Structure of JuliaML #13