Closed sfriedowitz closed 2 years ago
I'm not super familiar with the python release process. Do we need a version bump on that now, or only when we cut a Lolo release?
Bump the lolopy version as part of this PR. Then when you cut the release on GitHub, it sees that the version has been bumped and pushes to PyPI.
@bfolie I rebased this into a feature branch I just created.
Other than the one comment about changing `TrainingData[I, L]` to `TrainingData[L]`, this LGTM.
The main changes are as follows:
- `Learner[T]`, which trains on `trainingData: Seq[(Vector[Any], T)]`. It is now specific to the label type of the data. This is accompanied by having `MultiTaskLearner <: Learner[Vector[Any]]`.
- `TrainingResult[+T]`, where again the type parameter is the type of the label data.
- `Model[+T]` also takes the label type as its type parameter. Previously it was `Model[+T <: PredictionResult[Any]]`, parameterized by the type of prediction results instead of the labels, which added confusion to the interfaces IMO.
- The `lolopy` classes are now separated based on their learning tasks. We now have a specific `RandomForestRegressor`, `RandomForestClassifier`, etc. This was done to avoid having a highly parameterized interface presented to Python, which I suspect would have caused issues.

One concern I have about these changes is how to handle the generics for classification. Internal to trees, we use `Splitter[Char]` and `Learner[Char]` because we encode all classification data as type `Char`. However, the higher-level classes still extend `Learner[Any]`, such as `ClassificationTree <: Learner[Any]` and `RandomForestClassifier <: Learner[Any]`. This works insofar as everything compiles, but it feels a bit odd that "classification" is being performed on different label types throughout the package.

TODOS:
- A `TrainingRow[T]` interface to avoid passing around big tuples of `Seq[(Vector[Any], T, Double)]` in various places. This should be a separate PR.
- Splitting `Bagger[T]` and `FeatureRotator[T]` into task-specific classes, and trying to subsume shared functionality in helper functions rather than jamming it into the same generic class.
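
To make the `TrainingRow[T]` TODO a bit more concrete, here is a rough sketch of what a label-typed row could look like next to a label-typed learner. Everything in it is an assumption for illustration only: the field names (`inputs`, `label`, `weight`), the `fromTuples` helper, the simplified `Learner[T]` and `Model[T]` traits, and the toy `WeightedMeanLearner` are all hypothetical and do not reflect the actual Lolo signatures.

```scala
// Hypothetical sketch: none of these names or signatures are taken from Lolo itself.
object TrainingRowSketch {

  /** One labeled example, standing in for the (Vector[Any], T, Double) tuple. */
  final case class TrainingRow[T](inputs: Vector[Any], label: T, weight: Double = 1.0) {
    /** Convert back to the tuple form for code that still expects it. */
    def asTuple: (Vector[Any], T, Double) = (inputs, label, weight)
  }

  object TrainingRow {
    /** Lift legacy (inputs, label, weight) tuples into rows. */
    def fromTuples[T](rows: Seq[(Vector[Any], T, Double)]): Seq[TrainingRow[T]] =
      rows.map { case (inputs, label, weight) => TrainingRow(inputs, label, weight) }
  }

  /** Simplified stand-in for a model parameterized by label type. */
  trait Model[+T] {
    def predict(inputs: Seq[Vector[Any]]): Seq[T]
  }

  /** Simplified stand-in for a learner parameterized by label type. */
  trait Learner[T] {
    def train(rows: Seq[TrainingRow[T]]): Model[T]
  }

  /** Toy regression learner: always predicts the weighted mean of the labels. */
  object WeightedMeanLearner extends Learner[Double] {
    override def train(rows: Seq[TrainingRow[Double]]): Model[Double] = {
      require(rows.nonEmpty, "need at least one training row")
      val totalWeight = rows.map(_.weight).sum
      val mean = rows.map(r => r.label * r.weight).sum / totalWeight
      new Model[Double] {
        // Ignores the inputs entirely; returns one prediction per input row.
        override def predict(inputs: Seq[Vector[Any]]): Seq[Double] = inputs.map(_ => mean)
      }
    }
  }
}
```

With something like this, call sites would take `Seq[TrainingRow[T]]` instead of destructuring three-element tuples, and `fromTuples` / `asTuple` could bridge existing code during a migration.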