Closed juliohm closed 4 years ago
@joshday I know this PR has a lot of changes to review, but I'd appreciate it if you could take a quick look before we proceed with additional improvements. Below I summarize the changes at a high level so that you don't waste too much time reading the code. I can guarantee that the tests pass locally as before, and that this is mainly a reorganization of the source code.
Previously the source file `supervised.jl` was really large. It contained a set of traits for the abstract types `SupervisedLoss`, `DistanceLoss`, and `MarginLoss`; multiple docstrings for the various methods of `value`, `value!`, `deriv`, `deriv!`, `deriv2`, and `deriv2!`; and implementations for the various aggregation modes `None`, `Sum`, `Mean`, `WeightedSum`, and `WeightedMean`.
My changes consisted of splitting this file into various small files of digestible size, one per aggregation mode. I've also eliminated docstrings that were repeated across different methods of the same function. The plan is to move these docstrings to the LearnBase.jl interface package.
This PR is self-contained. After it is merged, I have a few more changes planned in LearnBase.jl according to https://github.com/JuliaML/LearnBase.jl/issues/35 before I can keep working on LossFunctions.jl.
I've started reorganizing LearnBase.jl to facilitate future contributions. The new codebase will be split into small files for the different concepts. In particular, I've already created the file `costs.jl` and migrated all the docstrings from LossFunctions.jl there (local repo). I've also noticed that some traits have never been implemented here, which is a sign that they could be removed from the interface temporarily. My next immediate steps after this PR is merged are:
The documentation failure can be fixed by removing `docs/Manifest.toml`; it locks a different uuid of `LossFunctions` (ref: https://github.com/JuliaML/LossFunctions.jl/commit/ad125ad0749e1f99e4a82164d0eb2e631e91aa53)
Thanks, will take a look at it today. Basically I need to update `Manifest.toml` to look at the parent folder in the dev version (`] dev ..`). I think I had used the UUID from LearnBase.jl in some previous commit, but that is fixed now; thanks to JuliaRegistrator for pointing out the issue.
This PR is finally ready after a great amount of refactoring in LearnBase.jl and here. There are a number of code simplifications that I would like to share with you before we proceed and merge.
The interface for all types of losses is now fully defined in LearnBase.jl. This means that the docstrings for end users (e.g. `value`, `deriv`, `deriv2`) all live there. LossFunctions.jl only implements the concepts now, and adds methods which may depend on additional packages (e.g. GPU implementations). This also means that the concept of `AggregationMode` and the submodule `AggMode` now live in LearnBase.jl for reuse in other implementation packages like PenaltyFunctions.jl.
The source code is now organized with the different concepts in mind so that new potential contributors can navigate it more easily. In particular, there is a clear separation between loss types and meta-loss types. This refactoring also helped identify source files that need further work, such as `sparse.jl`, which currently contains only a single method for margin-based losses with sparse arrays.
Scaled losses are now a single family, as opposed to the multiple types `ScaledDistanceLoss`, `ScaledMarginLoss`, etc. The function `scaled(l, a)` has been deprecated in favor of the simple constructor `ScaledLoss(l, a)`, or alternatively the `a * l` syntax.
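As a quick illustration, here is a minimal sketch of the new scaling API (assuming the `ScaledLoss(l, a)` constructor and `*` overload from this PR, and the usual `value(loss, target, output)` signature):

```julia
using LossFunctions

l = L2DistLoss()

# Two equivalent ways to build the same scaled loss:
s1 = ScaledLoss(l, 2)  # explicit constructor
s2 = 2 * l             # multiplication syntax

# Scaling multiplies the loss value pointwise:
value(s1, 1.0, 0.5)  # 2 * (0.5 - 1.0)^2 = 0.5
value(s1, 1.0, 0.5) == value(s2, 1.0, 0.5)
```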
Weighted margin losses are now constructed with `WeightedMarginLoss(l, w)`. Before, we had a `WeightedBinaryLoss` type with some typos in the constructors that allowed weighting non-margin-based losses. We also had a free-floating function `weightedloss(l, w)`, which has been deprecated in favor of the constructor for consistency.
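For reference, a hedged sketch of the new constructor in use (the weight `0.8` is an arbitrary value in `[0, 1]`; the exact re-weighting semantics for the two classes are my reading of the type, not stated in this PR):

```julia
using LossFunctions

# Re-weight a margin-based loss so that errors on one class count more:
wl = WeightedMarginLoss(HingeLoss(), 0.8)

# Evaluate on a positive and a negative target:
value(wl, 1.0, 0.3)   # loss for target +1
value(wl, -1.0, 0.3)  # loss for target -1
```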
The `value_deriv` function has been deprecated in favor of the simple `value` and `deriv` pair. This small optimization was not used by any downstream package, and it increased the maintenance burden. I've also deprecated the methods of `value`, `deriv`, and `deriv2` that accepted a single array of precomputed "agreement" values, like `value(l, rand(100))`. These can be useful for internal optimizations, and we can always reintroduce them internally if we find the need. End users will rarely precompute margins like `yt = y .* t` and then call the methods above; they will more likely use the non-allocating versions like `value(l, t, y, AggMode.Mean())`.
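A small usage sketch of the recommended call style (assuming the `value(loss, targets, outputs, aggmode)` methods this package exports; the sample vectors are arbitrary):

```julia
using LossFunctions

l = L2DistLoss()
t = [1.0, 0.0, -1.0]  # targets
y = [0.8, 0.2, -0.4]  # predicted outputs

value(l, t, y)                  # elementwise losses (allocates an array)
value(l, t, y, AggMode.Sum())   # aggregated without the intermediate array
value(l, t, y, AggMode.Mean())
```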
Other functions that could perhaps be deprecated are the mutating versions `value!`, `deriv!`, and `deriv2!`. I didn't deprecate them in this PR because that requires more thought. I understand that we can always achieve the same non-allocating result with broadcasting, e.g. `@. buffer = value(l, t, y)` for normal arrays, but I am not sure this line would also work with sparse arrays, for example. Because I don't have enough experience with sparse arrays in Julia, I decided not to touch these functions and to maintain them.
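To make the equivalence concrete, here is a sketch of the two styles for dense arrays (the `value!(buffer, l, t, y)` argument order is my assumption about the current export):

```julia
using LossFunctions

l = L2DistLoss()
t, y = rand(100), rand(100)
buffer = similar(t)

value!(buffer, l, t, y)      # mutating API kept in this PR
@. buffer = value(l, t, y)   # equivalent broadcast for dense arrays
```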
I've also updated the documentation to reflect the latest version of the code. In order to build it successfully, you need to get the master version of LearnBase.jl from inside the docs folder:
```
] add LearnBase#master
] instantiate
include("make.jl")
```
I'd appreciate it if you could take a second look at this PR. The plan is to merge it by the end of the weekend. Meanwhile, I will be working on adding support for CategoricalArrays.jl, which is the last item necessary for the paper I am working on.
This PR moves source code around to improve the organization of concepts in the package. This exercise has already helped me understand various points of improvement such as the implementation of meta losses "scaled", "ordinal" and "weightedbinary".
There is a single breaking change here: the removal of the `OrdinalHingeLoss` type alias. I don't think we should export this specific combination. Users can still construct `OrdinalMarginLoss{HingeLoss}` like with other losses.
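For completeness, a hedged sketch of building the combination explicitly (the number-of-levels argument `5` and the exact constructor form are my assumptions based on the type name):

```julia
using LossFunctions

# Ordinal margin loss built from the hinge loss for 5 ordered levels:
l = OrdinalMarginLoss(HingeLoss(), 5)
```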