GUDHI / gudhi-devel

The GUDHI library is a generic open source C++ library, with a Python interface, for Topological Data Analysis (TDA) and Higher Dimensional Geometry Understanding.
https://gudhi.inria.fr/
MIT License
246 stars 65 forks source link

[Representations Module] ComplexPolynomial class does not update threshold automatically #985

Open astamm opened 9 months ago

astamm commented 9 months ago

The documentation says:

  • threshold (int) – number of coefficients (default 10). This is the dimension of the complex vector of coefficients, i.e. the number of coefficients corresponding to the largest degree terms of the polynomial. If -1, this threshold is computed from the list of persistence diagrams by considering the one with the largest number of points and using the dimension of its corresponding complex vector of coefficients as threshold.

However, when running

from gudhi.representations import ComplexPolynomial as cp
import numpy as np
dgm = np.array([[1, 4], [2, 5]])
cls = cp(threshold=-1)
cls.fit([dgm])
cls.threshold

we still get -1, while:

cls=cp()
cls(dgm)

which uses the default threshold=10, outputs:

array([ -3. -9.j, -18.+13.j,   0. +0.j,   0. +0.j,   0. +0.j,   0. +0.j,
         0. +0.j,   0. +0.j,   0. +0.j,   0. +0.j])

which shows that in the first case, threshold should have been updated automatically to 2.

mglisse commented 9 months ago

IIRC, scikit-learn policy says cls.threshold has to be the value that was passed to the constructor, we are not allowed to modify it. The normal thing to do when threshold==-1 would be to compute the max in fit and store that value under some other name. However, I notice that we do this computation inside transform, that looks strange. @MathieuCarriere, do you agree that the computation of thres should happen in fit instead of transform? (I can do it, I am just looking for confirmation) Or is it meant to do what it currently does?

@astamm: if I move the computation of thres to fit, I could document under what name it is available (maybe threshold_). If it stays in transform, we could clarify in the doc that a different value is used, not that threshold is modified.

astamm commented 9 months ago

Btw, the same holds for the TopologicalVector class.