openscm / scmdata

Handling of Simple Climate Model data
https://scmdata.readthedocs.io
BSD 3-Clause "New" or "Revised" License
8 stars 5 forks source link

feat: Add uniform year interpolation #221

Closed lewisjared closed 1 year ago

lewisjared commented 1 year ago

Assumes that each year is of equal length which can be useful in some circumstances.

This also fixes the bug where interpolate converts integer values into unix time. This functionality isn't consistent with the behaviour of the TimePoints class where integers are converted into years.

Pull request

Please confirm that this pull request has done the following:

Adding to CHANGELOG.rst

Please add a single line in the changelog notes similar to one of the following:

- (`#XX <https://github.com/openscm/scmdata/pull/XX>`_) Added feature which does something
- (`#XX <https://github.com/openscm/scmdata/pull/XX>`_) Fixed bug identified in (`#YY <https://github.com/openscm/scmdata/issues/YY>`_)
codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01 :tada:

Comparison is base (5e94e98) 95.82% compared to head (8f62f5f) 95.84%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #221 +/- ## ========================================== + Coverage 95.82% 95.84% +0.01% ========================================== Files 23 23 Lines 2084 2092 +8 Branches 370 372 +2 ========================================== + Hits 1997 2005 +8 Misses 69 69 Partials 18 18 ``` | [Impacted Files](https://codecov.io/gh/openscm/scmdata/pull/221?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openscm) | Coverage Δ | | |---|---|---| | [src/scmdata/run.py](https://codecov.io/gh/openscm/scmdata/pull/221?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openscm#diff-c3JjL3NjbWRhdGEvcnVuLnB5) | `95.98% <100.00%> (+0.04%)` | :arrow_up: | | [src/scmdata/time.py](https://codecov.io/gh/openscm/scmdata/pull/221?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openscm#diff-c3JjL3NjbWRhdGEvdGltZS5weQ==) | `100.00% <100.00%> (ø)` | | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openscm). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openscm)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

review-notebook-app[bot] commented 1 year ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

lewisjared commented 1 year ago

And to add a bit more context for prosperity reasons. The previous behaviour of interpolate was to convert integers to UNIX time (seconds since 1970ish). The more consistent behaviour would be to use integers as years and floats as decimal years.

Whether this is called a bug or a breaking change is up for debate, but the behaviour as it was previously implemented (ints mapping to UNIX time) was likely not used in favour of a call that looked like run.interpolate([datetime.date(y, 1, 1) for y in years]). This can now be replaced with the much clearer run.interpolate(years).

znicholls commented 1 year ago

The more consistent behaviour would be to use integers as years and floats as decimal years

I don't fully follow. More consistent with what? Use integers as years for what?

lewisjared commented 1 year ago

We commonly use integers to represent years when creating new ScmRun objects and for filtering.

E.g. the ScmRun constructor uses integer values to represent years (specifically Jan 1 of the year) and relies upon the same datetime conversion functionality now used by interpolate (scmdata.time._format_datetime).

r = ScmRun(
        [1.0, 2.0, 3.0, np.nan],
        columns={
            "scenario": ["a_scenario"],
            "model": ["a_model"],
            "region": ["World"],
            "variable": ["Emissions|BC"],
            "unit": ["Mg /yr"],
        },
        index=[2000, 2100, 2200, 2300],
    )
assert r["year"].to_list() == [2000, 2100, 2200, 2300]

Nowhere else in the library are unix times used, which is why the behaviour is unexpected.

znicholls commented 1 year ago

Nowhere else in the library are unix times used, which is why the behaviour is unexpected.

Yep got it thanks

lewisjared commented 1 year ago

@znicholls Any other feedback or are you ok with this being merged?