neurospin / pylearn-parsimony_history

Sparse and Structured Machine Learning in Python
BSD 3-Clause "New" or "Revised" License

Refactoring project file structure #15

Closed duchesnay closed 10 years ago

duchesnay commented 10 years ago

Refactoring functions.py

functions.py is too big: it is hard to understand, and its low modularity will cause many merge conflicts. I propose refactoring this file to increase the project's modularity. My proposal:

=> functions.py
Function(object):
AtomicFunction(Function):
CompositeFunction(Function):
MultiblockFunction(CompositeFunction):
Regularisation(object):
Constraint(object):
ProximalOperator(object):
MultiblockProximalOperator(object):
NesterovFunction(object):
Continuation(object):
Gradient(object):
MultiblockGradient(object):
Hessian(object):
LipschitzContinuousGradient(object):
GradientStep(object):
GradientMap(object):
DualFunction(object):
Eigenvalues(object):
AnonymousFunction(AtomicFunction):

=> loss/functions.py
RidgeRegression(CompositeFunction, Gradient, LipschitzContinuousGradient,
TO BE ADDED
(Ridge?)LogisticRegression(CompositeFunction, Gradient, LipschitzContinuousGradient,
QuadraticConstraint(AtomicFunction, Gradient, Constraint):

=> multiblock/functions.py
RGCCAConstraint(QuadraticConstraint):

=> sparse/functions.py
L1(AtomicFunction, Constraint, ProximalOperator):
SmoothedL1(AtomicFunction, Constraint, NesterovFunction, Gradient,

=> tv/functions.py
TotalVariation(AtomicFunction, NesterovFunction, Gradient,
RR_L1_TV(CompositeFunction, Gradient, LipschitzContinuousGradient,
SmoothedL1TV(AtomicFunction, Regularisation, NesterovFunction,
RR_SmoothedL1TV(CompositeFunction, LipschitzContinuousGradient,

=> multiblock/functions.py
LatentVariableCovariance(MultiblockFunction, MultiblockGradient):
GeneralisedMultiblock(MultiblockFunction, MultiblockGradient,

JinpengLI commented 10 years ago

Based on @duchesnay's idea, I propose the file structure below. The only difference is that I make functions a directory, so that we can reuse the functions directory. What do you think?

=> functions/basic.py
Function(object):
AtomicFunction(Function):
CompositeFunction(Function):
MultiblockFunction(CompositeFunction):
Regularisation(object):
Constraint(object):
ProximalOperator(object):
MultiblockProximalOperator(object):
NesterovFunction(object):
Continuation(object):
Gradient(object):
MultiblockGradient(object):
Hessian(object):
LipschitzContinuousGradient(object):
GradientStep(object):
GradientMap(object):
DualFunction(object):
Eigenvalues(object):
AnonymousFunction(AtomicFunction):

=> functions/loss.py
RidgeRegression(CompositeFunction, Gradient, LipschitzContinuousGradient,
TO BE ADDED
(Ridge?)LogisticRegression(CompositeFunction, Gradient, LipschitzContinuousGradient,
QuadraticConstraint(AtomicFunction, Gradient, Constraint):

=> functions/multiblock.py
RGCCAConstraint(QuadraticConstraint):

=> functions/sparse.py
L1(AtomicFunction, Constraint, ProximalOperator):
SmoothedL1(AtomicFunction, Constraint, NesterovFunction, Gradient,

=> functions/tv.py
TotalVariation(AtomicFunction, NesterovFunction, Gradient,
RR_L1_TV(CompositeFunction, Gradient, LipschitzContinuousGradient,
SmoothedL1TV(AtomicFunction, Regularisation, NesterovFunction,
RR_SmoothedL1TV(CompositeFunction, LipschitzContinuousGradient,

=> functions/multiblock.py
LatentVariableCovariance(MultiblockFunction, MultiblockGradient):
GeneralisedMultiblock(MultiblockFunction, MultiblockGradient,

tomlof commented 10 years ago

I have made a suggestion for the directory structure that is based on the suggestions of @duchesnay and @JinpengLI.

The main difference is that abstract classes, which only define API specifications, are put in a module called interfaces. Implemented code (such as ridge regression, logistic regression and so on) is in the module objectives, and penalties (L1 etc.) are put in the module penalties.

Also, I have split the functions into three parts: the main part (not named), nesterov and multiblock. This makes it very easy to add functions, and does not clutter the basics if we want to add some oddball functions.
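
As a rough illustration of the interfaces/penalties split described above (a minimal sketch, not the code in the multiblock branch), the example below keeps two abstract classes that only declare an API separate from a concrete L1 penalty that implements them. The class names follow the lists earlier in this thread; the method names `f` and `prox`, the `factor` parameter and the constructor argument `l` are assumptions made for this example, and plain Python lists stand in for arrays.

```python
from abc import ABC, abstractmethod


# Hypothetical stand-ins for classes that would live in an "interfaces"
# module: they only declare an API and contain no numerical code.
class Function(ABC):
    @abstractmethod
    def f(self, x):
        """Return the function value at x."""


class ProximalOperator(ABC):
    @abstractmethod
    def prox(self, x, factor=1.0):
        """Return the proximal operator of factor * f evaluated at x."""


# Hypothetical stand-in for a class that would live in a "penalties"
# module: a concrete L1 penalty implementing both interfaces.
class L1(Function, ProximalOperator):
    def __init__(self, l):
        self.l = float(l)  # regularisation constant (assumed name)

    def f(self, x):
        # l * ||x||_1
        return self.l * sum(abs(xi) for xi in x)

    def prox(self, x, factor=1.0):
        # Soft thresholding: shrink each coefficient towards zero by t.
        t = factor * self.l
        return [max(abs(xi) - t, 0.0) * (1.0 if xi > 0 else -1.0)
                for xi in x]


penalty = L1(0.5)
value = penalty.f([1.0, -2.0, 0.25])
shrunk = penalty.prox([1.0, -2.0, 0.25])
```

With this layout, adding a new penalty means adding one concrete class next to `L1`; the abstract classes stay untouched, which is what should keep merge conflicts in the interface definitions rare.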

Please see my suggestion in the multiblock branch.

tomlof commented 10 years ago

Per @duchesnay's comments, I renamed objectives to losses and functions to objectives.

Otherwise it's the same.

duchesnay commented 10 years ago

+1 for the refactoring proposed in the multiblock branch. Let's merge it into the master branch.

tomlof commented 10 years ago

Hehe, yeah, I happened to be working in that branch, so that's where it ended up ;-) But the multiblock stuff does not interfere with anything else, so don't worry ;-)

By the way, we should do a similar refactoring for algorithms.

I was about to rebase when I realised that we must update the unit tests before we put it in the main branch. Could you look at that, @JinpengLI?

JinpengLI commented 10 years ago

OK. I will fix the unit tests by 19/02/2014. I am busy this week. If that is too late, please let me know.

tomlof commented 10 years ago

@JinpengLI No problem. I don't think it will take long. Please push the changes to the multiblock branch.

tomlof commented 10 years ago

This has been merged with the master branch.