GAA-UAM / scikit-fda

Functional Data Analysis Python package
https://fda.readthedocs.io
BSD 3-Clause "New" or "Revised" License
308 stars 58 forks source link

`FDataGrid.restrict` option `with_bounds` #561

Closed ego-thales closed 1 year ago

ego-thales commented 1 year ago

Implemented proposed enhancement from #560.

A few notes to keep in mind when (or before) merging:

Waiting for your feedback!

Cheers :) Élie

ego-thales commented 1 year ago

I now notice that this option potentially modifies the result of every FDataGrid.integrate call with a domain argument. This probably explains the failed tests.

One easy solution would be to set default value to False. The other option is to go through every failed test and check whether or not the difference can indeed be due to the more precise calculus.

In turn, it might be necessary to change some doc example, I'm not sure about it.

In any case, I will wait for your feedback and insights regarding this, bcause you have a much deeper view of the whole package. :)

vnmabus commented 1 year ago

The "proper" way to do it would be:

eliegoudout commented 1 year ago

Hi, Just a quick answer from my personal phone regarding other possible ways to do this.

I will argue against keeping outside points for 2 reasons:

I agree that using Tru as default value might not be polite. I thus think that if we end up deciding that it would be the best option, then the way to go surely is warning users during the next release (or more), before applying the change.

Is there a way to know more about FDataIrregular?

I think a good way to choose the defaul value will be to test the accuracy gains on plausible cases, as well as execution time lost.

vnmabus commented 1 year ago
* It will make interpolation vs. extrapolation blurry. For example, how to you extrapolate at `x = -.09`, with domain `[0, 1]` and grid points `-.1, .1, .3`? If you use `-.1` value, then you're much closer to interpolation than extrapolation, and if you don't, you lose precision.

That is true, although it is more a semantic issue. By default, the extrapolation strategy of FDataGrid delegates to interpolation. If you change it to something else, it would be applied to the points outside the domain, as usual.

* The general idea of "restricting" **IS**, imo, forgetting about outside information. Imagine wanting to test out an extrapolation method, then it is important that restriction doesn't know about outside values.

Extrapolation would be applied for the points outside the domain, in spite of having values measured for them. The default extrapolation, of course, uses these values.

Is there a way to know more about FDataIrregular?

Yes, it is a class to represent functions measured at different points (not a fixed grid). It was first proposed in #498 and a tentative implementation is in #536.

I think a good way to choose the defaul value will be to test the accuracy gains on plausible cases, as well as execution time lost.

I want to discuss also the issue with the other people involved in the project, as they may provide more insights. They are currently on vacation but we will meet again in September.

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.02% :tada:

Comparison is base (6a066f7) 86.02% compared to head (df19f39) 86.05%.

:exclamation: Current head df19f39 differs from pull request most recent head e1dfe7d. Consider uploading reports for the commit e1dfe7d to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #561 +/- ## =========================================== + Coverage 86.02% 86.05% +0.02% =========================================== Files 147 147 Lines 11660 11678 +18 =========================================== + Hits 10031 10049 +18 Misses 1629 1629 ``` | [Files Changed](https://app.codecov.io/gh/GAA-UAM/scikit-fda/pull/561?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GAA-UAM) | Coverage Δ | | |---|---|---| | [skfda/representation/grid.py](https://app.codecov.io/gh/GAA-UAM/scikit-fda/pull/561?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GAA-UAM#diff-c2tmZGEvcmVwcmVzZW50YXRpb24vZ3JpZC5weQ==) | `86.50% <100.00%> (+0.25%)` | :arrow_up: | | [skfda/tests/test\_grid.py](https://app.codecov.io/gh/GAA-UAM/scikit-fda/pull/561?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GAA-UAM#diff-c2tmZGEvdGVzdHMvdGVzdF9ncmlkLnB5) | `99.36% <100.00%> (+0.04%)` | :arrow_up: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

vnmabus commented 1 year ago

@all-contributors please add @ego-thales for his code contributions and ideas.

allcontributors[bot] commented 1 year ago

@vnmabus

I've put up a pull request to add @ego-thales! :tada: