[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[x] Breaking change (fix or feature that would cause existing functionality to change)
Description
The initial motivation for this PR was the need to lift unnecessary input restrictions in gtda.externals.ripser when metric='precomputed'. Namely, pairwise_distances(X, metric='precomputed') performs input validation and rejects any X containing negative entries, even though both the theory of Vietoris-Rips filtrations and the C++ implementation of ripser handle such input perfectly well.
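To illustrate why negative entries pose no theoretical problem: a simplex enters the Vietoris-Rips filtration at the largest pairwise entry among its vertices, which is well defined for any real-valued symmetric matrix. The snippet below is a minimal sketch of that rule only. The helper name `rips_filtration_value` is hypothetical and is not part of gtda or ripser.

```python
from itertools import combinations

import numpy as np


def rips_filtration_value(dm, simplex):
    """Largest pairwise entry among the vertices of `simplex`.

    Hypothetical helper for illustration; it only covers simplices
    with at least two vertices.
    """
    if len(simplex) < 2:
        raise ValueError("Need at least two vertices.")
    return max(dm[i, j] for i, j in combinations(simplex, 2))


# Symmetric matrix with a negative off-diagonal entry.
dm = np.array([[0.0, -1.0, 2.0],
               [-1.0, 0.0, 0.5],
               [2.0, 0.5, 0.0]])

edge_value = rips_filtration_value(dm, (0, 1))         # uses only dm[0, 1]
triangle_value = rips_filtration_value(dm, (0, 1, 2))  # max over all three edges
```

Here the edge (0, 1) simply appears at a negative threshold; nothing in the rule requires the entries to be non-negative.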
However, simply removing the call to pairwise_distances in the case metric='precomputed' would also have removed useful checks that the input is square. Since gtda.externals.ripser is not exposed to the user via documentation and is only meant to be called by VietorisRipsPersistence instances, these checks were instead moved to the input validation steps in that class, i.e. to the calls to check_point_clouds.
Hence, check_point_clouds was refactored as follows:
The parameter distance_matrix was renamed distance_matrices (plural, agreeing with check_point_clouds).
When distance_matrices is set to True, the shapes of entries are checked to be square.
Warnings have been added to guide the user towards setting the distance_matrices parameter correctly.
The default behaviour of sklearn's check_array has been modified so that force_all_finite=False no longer means accepting NaN input (only infinite input is accepted). Note that ripser handles infinities correctly when distance/adjacency matrices are passed.
The docstrings have been improved.
Finally, comprehensive tests for check_point_clouds were created.
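The validation rules listed above (square shapes, NaN rejected, infinite and negative entries accepted) can be sketched roughly as follows. This is an assumption-laden illustration, not the actual body of check_point_clouds: the function name `check_distance_matrices_sketch` and its error messages are hypothetical, and the warnings mentioned above are not reproduced.

```python
import numpy as np


def check_distance_matrices_sketch(X):
    """Validate a collection of distance/adjacency matrices.

    Hypothetical stand-in for illustration only:
    - every entry must be a square 2D array,
    - NaN is rejected,
    - infinite and negative values are allowed.
    """
    checked = []
    for i, x in enumerate(X):
        x = np.asarray(x, dtype=float)
        if x.ndim != 2 or x.shape[0] != x.shape[1]:
            raise ValueError(
                f"Entry {i} has shape {x.shape}; expected a square 2D array."
            )
        if np.isnan(x).any():
            raise ValueError(f"Entry {i} contains NaN.")
        checked.append(x)
    return checked


# Negative and infinite entries pass; both matrices are square.
valid = check_distance_matrices_sketch([
    np.array([[0.0, -1.0], [-1.0, 0.0]]),
    np.array([[0.0, np.inf], [np.inf, 0.0]]),
])
```

Passing a non-square array or one containing NaN to this sketch raises a ValueError, mirroring the behaviour described in the list.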
Types of changes
Checklist
flake8 to check my Python changes.
pytest to check this on Python tests.