GAA-UAM / scikit-fda

Functional Data Analysis Python package
https://fda.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Compare values taking into account the L1 norm of the vectors #479

Closed Ddelval closed 2 years ago

Ddelval commented 2 years ago

Taking the L1 norm of the vectors being compared into account makes it possible to use smaller tolerances.

In PCA, small values in the scores or loadings have little significance. Therefore, elements that are small compared to the rest of the vector can be compared with a larger tolerance. This is accomplished by setting the absolute tolerance of each element to a fraction of the L1 norm of the vector divided by the number of elements.
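A minimal sketch of how such a normalized comparison could work. The helper in this PR is named `assert_all_close_normalized`, but the signature and defaults below are assumptions for illustration, not the actual implementation:

```python
import numpy as np


def assert_all_close_normalized(actual, expected, rtol=1e-7, atol_fraction=1e-7):
    """Compare two arrays elementwise, with the absolute tolerance set to a
    fraction of the L1 norm of the expected vector divided by its size.

    Elements that are small relative to the vector as a whole are thus
    compared more leniently, as described above. Note: the parameter names
    and defaults here are illustrative assumptions.
    """
    actual = np.asarray(actual, dtype=float)
    expected = np.asarray(expected, dtype=float)
    # Absolute tolerance proportional to the average magnitude of the vector.
    atol = atol_fraction * np.linalg.norm(expected.ravel(), ord=1) / expected.size
    np.testing.assert_allclose(actual, expected, rtol=rtol, atol=atol)
```

With this scheme, a near-zero element in a vector whose other entries are large no longer needs its own hand-tuned `atol`: the tolerance scales with the overall magnitude of the vector.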

This change was required due to differences between scikit-fda and fda.usc in their use of numerical quadrature. In particular, fda.usc always uses Simpson's rule weights for inner products (see inprod.fda.R:87 in the fda.usc source), while its calculation of the principal components may use uniform weights (fdata2pc.R:429 in the fda.usc source). In scikit-fda, the same weights are always used for both operations.
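To illustrate why the choice of quadrature matters, the sketch below (a standalone example, not the fda.usc or scikit-fda code) approximates the same inner product with Simpson's rule weights and with uniform weights on an evenly spaced grid; the two approximations differ slightly, which is exactly the kind of discrepancy the normalized tolerance is meant to absorb:

```python
import numpy as np

# Two functions sampled on an evenly spaced grid (odd number of points,
# so Simpson's rule applies directly).
t = np.linspace(0.0, 1.0, 101)
f = np.sin(np.pi * t)
g = np.cos(np.pi * t / 2)

h = t[1] - t[0]

# Simpson's rule weights: 1, 4, 2, 4, ..., 2, 4, 1 (scaled by h/3).
w = np.ones_like(t)
w[1:-1:2] = 4.0
w[2:-1:2] = 2.0
simpson_ip = (h / 3.0) * np.sum(w * f * g)

# Uniform weights: every grid point contributes equally.
uniform_ip = h * np.sum(f * g)

# The two quadratures give close but not identical inner products.
print(simpson_ip, uniform_ip)
```

For this integrand the exact value is 4/(3π), and Simpson's rule is accurate to roughly h^4 while the uniform rule is only accurate to roughly h^2, so the two estimates disagree at a level well above machine precision.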

codecov[bot] commented 2 years ago

Codecov Report

Base: 84.96% // Head: 84.97% // Increases project coverage by +0.00%

Coverage data is based on head (435464e) compared to base (46fc684). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files:

```diff
@@           Coverage Diff            @@
##           develop     #479  +/-   ##
========================================
  Coverage    84.96%   84.97%
========================================
  Files          135      135
  Lines        10731    10734    +3
========================================
+ Hits          9118     9121    +3
  Misses        1613     1613
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| [skfda/tests/test\_fpca.py](https://codecov.io/gh/GAA-UAM/scikit-fda/pull/479/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GAA-UAM#diff-c2tmZGEvdGVzdHMvdGVzdF9mcGNhLnB5) | `99.31% <100.00%> (+0.01%)` | :arrow_up: |


Ddelval commented 2 years ago

@vnmabus I am not sure whether assert_all_close_normalized should be defined in the same .py file, since this functionality might be useful in other tests as well. Also, I have changed all the calls to np.testing.assert_allclose that used a custom rtol or atol value, since with this normalization it is easier to understand how large the tolerance actually is. However, I have not modified the calls that use the default tolerances, since those are so small that they effectively assert the arrays are practically equal. Let me know if you think this is the right approach.