JuliaAI / MLJBase.jl

Core functionality for the MLJ machine learning framework
MIT License

Document `range` scales better #946

Open ParadaCarleton opened 11 months ago

ParadaCarleton commented 11 months ago

Most notably, several keywords (unit, origin) are completely undocumented. I also think that scale could be improved by using an ADT, like you'll find in MLStyle.jl; this has the advantages of being extensible, more cleanly documented (individual scales can be documented separately) and allowing for compile-time checks in the future.
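
To make the suggestion concrete, here is a rough sketch of what type-based scales could look like. It uses plain Julia abstract types rather than MLStyle.jl's macros, and every name in it (`AbstractScale`, `LinearScale`, `Log10Scale`, `transform`) is hypothetical; none of this exists in MLJBase.

```julia
# Hypothetical design sketch; not part of MLJBase's API.
abstract type AbstractScale end

"Values are spaced linearly between the bounds."
struct LinearScale <: AbstractScale end

"Values are spaced evenly in log base 10 between the (positive) bounds."
struct Log10Scale <: AbstractScale end

# Map a point u in [0, 1] to a value in [lower, upper] according to the scale.
transform(::LinearScale, lower, upper, u) = lower + u * (upper - lower)
transform(::Log10Scale, lower, upper, u) =
    10.0^(log10(lower) + u * (log10(upper) - log10(lower)))

# Midpoint of each scale on a sample interval.
mid_linear = transform(LinearScale(), 0.0, 10.0, 0.5)
mid_log10 = transform(Log10Scale(), 1.0, 100.0, 0.5)
```

Each scale gets its own docstring, and an unsupported scale fails at dispatch instead of deep inside a sampling routine.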

ablaom commented 11 months ago

Agreed, this could do with some improvement.

The `range` extensions are provided by MLJBase, so I'm transferring this issue there.

unit and origin are used when fitting pdfs to a range; precisely how they are used is presently documented in (our extension to) Distributions.fit:


Distributions.fit(D, r::MLJBase.NumericRange)

Fit and return a distribution d of type D to the one-dimensional range r.

Only types D in the table below are supported.

The distribution d is constructed in two stages. First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.

| Distribution type `D` | Characterization of `d0` |
|:---|:---|
| `Arcsine`, `Uniform`, `Biweight`, `Cosine`, `Epanechnikov`, `SymTriangularDist`, `Triweight` | `minimum(d) = r.lower`, `maximum(d) = r.upper` |
| `Normal`, `Gamma`, `InverseGaussian`, `Logistic`, `LogNormal` | `mean(d) = r.origin`, `std(d) = r.unit` |
| `Cauchy`, `Gumbel`, `Laplace`, (`Normal`) | `Dist.location(d) = r.origin`, `Dist.scale(d) = r.unit` |
| `Poisson` | `Dist.mean(d) = r.unit` |

Here Dist = Distributions.
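
As an illustration of the two-stage construction described above, here is a minimal sketch for the `Normal` row of the table. It assumes MLJBase and Distributions are installed and that `range` accepts a type in place of a model; the hyperparameter name `:lambda` and the numeric values are made up for the example. The conditions in the table apply to the untruncated `d0`, not to the final `d`.

```julia
using MLJBase
import Distributions
const Dist = Distributions

# `:lambda` is a hypothetical hyperparameter name, used only for illustration.
r = range(Float64, :lambda; lower=0.0, upper=10.0, origin=4.0, unit=2.0)

# Stage 1 (per the table): d0 is a Normal with mean r.origin and std r.unit.
# Stage 2: d0 is truncated to [r.lower, r.upper] to give d.
d = Dist.fit(Dist.Normal, r)

# The same two stages written out by hand, for comparison.
d_manual = Dist.truncated(Dist.Normal(r.origin, r.unit), r.lower, r.upper)

# The support of the fitted distribution should match the range bounds.
@show minimum(d) maximum(d)
```

Because of the truncation, the mean and standard deviation of `d` itself will generally differ from `r.origin` and `r.unit`.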

ablaom commented 11 months ago

> I also think that scale could be improved by using an ADT, like you'll find in MLStyle.jl;

Not sure I understand the reference to MLStyle.jl. Can you be more specific?

I'd say forcing `scale` values to subtype some abstract type is an optimisation not likely to justify a breaking change at this point in time. And I think allowing `scale` to be any callable (in addition to a `Symbol`) is very convenient.
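
For concreteness, a minimal sketch of the two forms of `scale` mentioned here, assuming MLJBase is installed and that `range` accepts a type in place of a model. The hyperparameter name `:x` is hypothetical. My understanding is that a callable is applied to points spaced linearly between `lower` and `upper`, but check the `iterator` docstring before relying on that.

```julia
using MLJBase

# `:x` is a hypothetical hyperparameter name, used only for illustration.

# Scale given as a Symbol: grid points spaced evenly on a log10 scale.
r_symbol = range(Float64, :x; lower=1.0, upper=100.0, scale=:log10)

# Scale given as a callable: here the bounds are exponents, and the
# callable maps each grid point back to the hyperparameter value.
r_callable = range(Float64, :x; lower=0.0, upper=2.0, scale=x -> 10^x)

# Generate three grid points from each range.
vals_symbol = iterator(r_symbol, 3)
vals_callable = iterator(r_callable, 3)
```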

On the other hand, if someone had some very substantial extensions in mind, and was prepared to make the relevant contributions ...