JuliaML / META

Discussions related to the future of Machine Learning in Julia

MLModels and MLTransformations #7

Closed ahwillia closed 8 years ago

ahwillia commented 8 years ago

Can we start prototyping the MLModels package? Did we decide that this package would contain both definitions of Transformations and implementations of deriv, prox, grad, etc.?

I put a prototype repo here: https://github.com/ahwillia/MLTransformations.jl

(I would be happy to prototype within JuliaML, but wasn't sure whether you would prefer me to start elsewhere and migrate packages in once they are a bit more mature. I imagine I will end up deleting this repo or at least renaming it.)

High-level notes:

  • I used apply/apply! rather than transform/transform!, but I'm not attached to this. My reasoning was (a) apply is shorter, and (b) it frees you to say transform = MyTransformation(); apply!(transform, data), which seems natural to me.
  • I propose adding fit_apply and fit_apply!, similar to the scikit-learn API.
  • I propose adding something along the lines of invert!(transform, data).

This is a flavor:

type IdentityTransform <: InvertibleTransformation end
type LogTransform <: InvertibleTransformation end
type ExpTransform <: InvertibleTransformation end
type LogisticTransform <: InvertibleTransformation end
type LogitTransform <: InvertibleTransformation end

apply!(::IdentityTransform, x) = x
apply!(::LogTransform, x) = map!(log,x)
apply!(::ExpTransform, x) = map!(exp,x)

invert!(::IdentityTransform, x) = x
invert!(::LogTransform, x) = map!(exp,x)
invert!(::ExpTransform, x) = map!(log,x)

get_inverse(::IdentityTransform) = IdentityTransform()
get_inverse(::LogTransform) = ExpTransform()
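
The logistic/logit pair is declared above but not given methods; a sketch of how the missing definitions could look (the closures here are my own, not from the prototype):

apply!(::LogisticTransform, x) = map!(z -> 1/(1 + exp(-z)), x)
apply!(::LogitTransform, x) = map!(p -> log(p/(1 - p)), x)

invert!(::LogisticTransform, x) = map!(p -> log(p/(1 - p)), x)
invert!(::LogitTransform, x) = map!(z -> 1/(1 + exp(-z)), x)

get_inverse(::ExpTransform) = LogTransform()
get_inverse(::LogisticTransform) = LogitTransform()
get_inverse(::LogitTransform) = LogisticTransform()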

And for a "Learnable Transformation":

type Standardize{T<:Real} <: LearnableTransformation
    shift::ShiftTransform{T}
    scale::ScaleTransform{T}
end
function fit!{T}(transform::Standardize{T}, x)
    transform.shift = ShiftTransform(mean(x))
    transform.scale = ScaleTransform(one(T)/std(x))
end
function apply!(transform::Standardize, x)
    apply!(transform.shift, x)
    apply!(transform.scale, x)
end
function invert!(transform::Standardize, x)
    invert!(transform.scale, x)
    invert!(transform.shift, x)
end
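
A usage sketch under this API (the ShiftTransform/ScaleTransform constructor signatures are assumptions on my part):

x = randn(1000)
t = Standardize(ShiftTransform(0.0), ScaleTransform(1.0))  # placeholder parameters
fit!(t, x)     # learn the shift and scale from the data
apply!(t, x)   # standardize x in place
invert!(t, x)  # undo it, recovering x up to floating-point error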

Thoughts on whether this is too verbose? I'm just brainstorming here.

tbreloff commented 8 years ago

Thanks for the example code. It really helps to understand. I don't have time now for a full analysis, but here are a few thoughts:

New proposal:

Reading about transform, I found that 'Function' was the noun that best fit the most general abstraction, and so I thought: what if our transformations are simply functors, and you just call them with input data?

f = LogFunction()
yhat = f(x)
learn!(f, x, y)  # no-op for log

net = ANN()
yhat = net(x)
learn!(net, x, y)
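
A minimal sketch of the functor idea, assuming an abstract Transformation type and the callable-object syntax coming in Julia 0.5 (LogFunction and learn! are names from the proposal, not an agreed API):

abstract Transformation

immutable LogFunction <: Transformation end

(f::LogFunction)(x) = map(log, x)  # calling the object applies the transformation
learn!(f::LogFunction, x, y) = f   # stateless: learning is a no-op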

Lots to think about... Tomorrow...

ahwillia commented 8 years ago

I don't necessarily see a problem with using transform as a noun in a mathematical setting (https://en.wikipedia.org/wiki/List_of_transforms). But using it as a verb would be consistent with scikit-learn, which is a plus in my book, and you're right that apply is too general.

I like the functor idea for simple transformations like logs, but maybe less so for learned transformations.

pca = PCA()
train!(pca,x1)

# option 1
y = pca(x2)

# option 2
y = transform(pca,x2)

The semantics of option 1 seem to suggest that we are doing PCA on x2 (this is what MATLAB would return), whereas in fact we are training a PCA transformation on data x1 and then applying that transformation to a second dataset.
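
For concreteness, a sketch of what option 2 could look like (the PCA fields, train!, and the features-by-samples layout are all assumptions, not an agreed API):

type PCA <: LearnableTransformation
    mu::Vector{Float64}          # per-feature training mean
    components::Matrix{Float64}  # principal directions, one per column
    PCA() = new(Float64[], Array(Float64, 0, 0))
end

function train!(pca::PCA, x)  # x is features-by-samples
    pca.mu = vec(mean(x, 2))
    pca.components = svd(x .- pca.mu)[1]
    pca
end

# option 2: the trained transformation is applied to new data explicitly
transform(pca::PCA, x) = pca.components' * (x .- pca.mu)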

A potential path forward?

Re: train vs. fit vs. learn

Evizero commented 8 years ago

https://github.com/Evizero/MLModels.jl exists, and I will move it to JuliaML within this week. Just some housekeeping left to do there that I would like to finish beforehand.

Concerning verbs: I see it is still up for debate. Let's create an issue at the new LearnBase and put it to a vote? If no one beats me to it, I shall do that on the weekend.

(Sorry for the short reply, but this week I will be really busy with work, except Wednesday, when I hope to work on MLModels.)

tbreloff commented 8 years ago

We can have a generic definition:

call(t::Transformation, args...; kw...) = transform(t, args...; kw...)

Then use whichever makes more sense contextually.
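
For example, assuming a transform method exists for the LogTransform sketched earlier, both spellings would then work:

t = LogTransform()
y1 = transform(t, x)  # explicit verb
y2 = t(x)             # functor sugar via the generic call definition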

We will flesh out verbs in another thread, but I like 'learn' precisely because it's not commonly used in either stats or ML. It also makes lots of sense for 'learn' to be the core activity in 'LearnBase'.

Evizero commented 8 years ago

I actually had the call sugar implemented for losses, but that syntax is deprecated, so I removed it. That said, AFAIK there will be an alternative way to allow for myobject(...) in the future, which I agree we should make use of.
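
For reference, that alternative is the callable-object method syntax introduced in Julia 0.5, so the generic sugar above could be written as (a sketch):

(t::Transformation)(args...; kw...) = transform(t, args...; kw...)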

ahwillia commented 8 years ago

Closing as this seems superseded by discussion here: https://github.com/JuliaML/Roadmap.jl/issues/8