aleixalcacer / archetypes

Scikit-learn compatible package for archetypal analysis.
https://archetypes.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Direct Projection onto Unit Simplex #27

Open aleixalcacer opened 3 months ago

aleixalcacer commented 3 months ago

Implement a direct projection onto the unit simplex instead of an alternating projection.

See https://math.stackexchange.com/questions/2005154/how-to-project-onto-the-unit-simplex-as-intersection-of-two-sets-optimizing-a-c
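For reference, the direct projection can be vectorized row-wise in NumPy with the classic sort-based algorithm (the one discussed in the linked thread and described as Algorithm 1 in Condat, 2015). This is only an illustrative sketch; the function name is mine, not the package's:

```python
import numpy as np

def project_rows_onto_simplex(x):
    """Project each row of x onto the unit simplex (sort-based algorithm)."""
    m, n = x.shape
    u = np.sort(x, axis=1)[:, ::-1]            # each row sorted in descending order
    css = np.cumsum(u, axis=1) - 1.0           # cumulative sums, shifted by the simplex target
    ind = np.arange(1, n + 1)
    rho = (u - css / ind > 0).sum(axis=1)      # number of active components per row
    theta = css[np.arange(m), rho - 1] / rho   # per-row threshold
    return np.maximum(x - theta[:, None], 0.0)
```

Each output row is nonnegative and sums to 1, so the result lies on the unit simplex by construction.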

wangzcl commented 1 month ago

I have finished the projections (both the direct one and the L1 normalization from Mørup and Hansen) and the PGD routines for NumPy in wangzcl/archetypes@7b3d07d93279857108be9e2141c01579c015ffef. I will explain my design in more detail below.

wangzcl commented 1 month ago

_projection.py: I wrote an abstract base class Projector and two derived subclasses, UnitSimplexProjector (the naive Algorithm 1 from Condat, 2015) and L1NormalizeProjector (clip to non-negative, then normalize, as in Mørup and Hansen). The Projector.project(x) method projects (or maps) each row of a NumPy matrix x onto the unit simplex. To initialize a projector, the shape of the array to be projected has to be provided; alternatively, call the class method Projector.from_array(x) to initialize an instance directly from an array.
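A minimal sketch of what such an interface could look like. The class and method names (Projector, L1NormalizeProjector, project, from_array) follow the comment above, but the buffer layout and method bodies here are my assumptions, not the actual implementation in the commit:

```python
import numpy as np
from abc import ABC, abstractmethod

class Projector(ABC):
    """Row-wise projection onto the unit simplex; workspace sized at init."""

    def __init__(self, shape):
        self.shape = shape

    @classmethod
    def from_array(cls, x):
        # Convenience constructor: size the buffers from an example array.
        return cls(x.shape)

    @abstractmethod
    def project(self, x):
        ...

class L1NormalizeProjector(Projector):
    """Clip to non-negative, then normalize each row to sum to 1
    (the scheme from Mørup and Hansen)."""

    def __init__(self, shape):
        super().__init__(shape)
        self._buf = np.empty(shape)        # preallocated workspace
        self._sums = np.empty(shape[0])    # per-row sums

    def project(self, x):
        np.maximum(x, 0.0, out=self._buf)
        self._buf.sum(axis=1, out=self._sums)
        np.maximum(self._sums, np.finfo(float).tiny, out=self._sums)  # avoid /0
        self._buf /= self._sums[:, None]
        return self._buf
```

All intermediate arrays live in the instance, so repeated calls to project inside an optimization loop allocate nothing new.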

The reason for this design is that projection requires additional memory for various intermediate arrays. If that memory were allocated every time we project, the performance would be intolerable. Instead, Projector allocates all (or most) of the temporary buffers up front, when the object is initialized. This takes advantage of the fact that within the optimization loop of a single AA instance, we always deal with arrays of fixed shapes.
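A toy illustration of the idea, with hypothetical helper names: a preallocated buffer combined with NumPy's out= arguments lets repeated calls reuse the same memory instead of allocating a fresh array on every call:

```python
import numpy as np

def clip_alloc(x):
    return np.maximum(x, 0.0)             # allocates a fresh result array each call

class ClipReuse:
    """Same operation, but writes into a buffer allocated once at init."""

    def __init__(self, shape):
        self._out = np.empty(shape)        # allocated once, reused forever

    def __call__(self, x):
        return np.maximum(x, 0.0, out=self._out)
```

Both versions compute identical results, but ClipReuse returns the same preallocated buffer on every call, which is exactly what matters inside a tight optimization loop over fixed-shape arrays.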

The current unit-simplex projection algorithm may not be the fastest one, but it is easy to implement and vectorize. However, some memory allocation inside UnitSimplexProjector remains unavoidable unless we (1) apply a Python loop over the ndarray, (2) rewrite it in Cython, or (3) switch to a more complicated method.

wangzcl commented 1 month ago

_optimize.py

The same memory allocation issue also applies to the PGD routines. To improve performance and reduce repetition in the code, I created an abstract base class PGD that defines the generic behavior of a PGD optimizer. To define a PGD optimizer for a specific problem, one just needs to inherit from PGD and define the objective function and gradient (and perhaps preallocate memory for intermediate arrays). An abstract PGD_AA class captures the features common to optimizing A and B in archetypal analysis, and two subclasses, PGD_Optimize_A and PGD_Optimize_B, perform the actual PGD steps. I also defined two class decorators: @line_search, which adds a line search to the PGD steps, and @l1_norm_proj, which modifies the form of the gradient to fit L1NormalizeProjector. This architecture takes advantage of Python's type system, is easy to extend, and avoids most of the repetition and runtime memory allocation.
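To make the inheritance pattern concrete, here is a heavily simplified sketch. Only the PGD name comes from the comment above; the projector, the LeastSquaresPGD subclass, and all method bodies are illustrative assumptions (the real classes add line search, preallocation, and the AA-specific objectives):

```python
import numpy as np
from abc import ABC, abstractmethod

class PGD(ABC):
    """Bare-bones projected gradient descent: take a gradient step, then project."""

    def __init__(self, projector, lr=0.1, n_steps=200):
        self.projector = projector
        self.lr = lr
        self.n_steps = n_steps

    @abstractmethod
    def gradient(self, x):
        ...

    def run(self, x0):
        x = np.asarray(x0, dtype=float)
        for _ in range(self.n_steps):
            x = self.projector(x - self.lr * self.gradient(x))
        return x

def normalize_rows(x):
    # Stand-in projector: clip to non-negative, renormalize rows to sum to 1.
    y = np.maximum(x, 0.0)
    return y / y.sum(axis=1, keepdims=True)

class LeastSquaresPGD(PGD):
    """Minimize ||x - target||^2 per row, constrained to the simplex."""

    def __init__(self, target, **kwargs):
        super().__init__(projector=normalize_rows, **kwargs)
        self.target = target

    def gradient(self, x):
        return 2.0 * (x - self.target)
```

A subclass only supplies the problem-specific gradient (and objective); the generic run loop and the projection stay in the base class, which is where the repetition savings come from.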

wangzcl commented 1 month ago

The new projection and PGD code is not finished yet. I haven't written any tests for the new utilities, but if you think this improvement works for you, I can go on to finish, test, and benchmark all of it.