SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Extract time cell points as python datetimes for plotting #4737

Open cgsandford opened 2 years ago

cgsandford commented 2 years ago

✨ Feature Request

Recently iris time.cell.point calls return cftime instances rather than python datetime instances. This is due to the underpinning change in behaviour of the cftime.num2date function on which the cell functionality relies (the relevant code is in iris._coords.py). It would be easier for plotting purposes if this were changed to return a python datetime by default (eg using cftime.num2pydate), or given the option of returning one of these two datatypes.

Motivation

I'm always frustrated when I try to make a timeseries plot, since:

```python
time_points = [cell.point for cell in cube.coord("time").cells()]
...
plt.scatter(time_points, cube.data[4])
```

raises a matplotlib TypeError, even though matplotlib is capable of handling python datetimes.

Additional context

None really. It's in the iris coords file and looks straightforward to fix, just a question of whether it has undesirable impacts.

rcomer commented 2 years ago

Hi @cgsandford, can you get what you want using

```python
tcoord = cube.coord("time")
time_points = tcoord.units.num2pydate(tcoord.points)
```

?

Or this might work

```python
import iris.plot as iplt

iplt.scatter(cube.coord("time"), cube[4])
```

cgsandford commented 2 years ago

Hi @rcomer , yes the first one works. It's just that mostly I'm doing analysis rather than plotting, so I forget about this, and every time cell.point doesn't work I spend an hour or so digging for this forgotten error on Google / Yammer / Teams.

It's a feature request because lots of things work smoothly in iris, and this doesn't, and I couldn't see a good reason why it shouldn't. If it's problematic to change, an alternative might be to document this behaviour more clearly in the iris docs (which I also looked at and couldn't see any timeseries examples or clear reason why this wouldn't work).

trexfeathers commented 2 years ago

@cgsandford sorry to hear you're experiencing problems. You're right that Iris' use of cftime datetimes isn't documented, but then again this isn't something that's expected to cause problems in user scripts.

As noted, using Iris' own plotting functions sorts out Matplotlib (because Iris uses nc-time-axis). What other datetime incompatibilities are you experiencing, beyond Matplotlib?

trexfeathers commented 2 years ago

Or perhaps I've misunderstood:

I'm doing analysis rather than plotting

And you mean that your problems are specifically with plotting, but this is particularly difficult due to unfamiliarity with plotting?

rcomer commented 2 years ago

As noted, using Iris' own plotting functions sorts out Matplotlib (because Iris uses nc-time-axis).

Note that Iris will convert to datetime.datetimes for plotting if it finds the "gregorian" calendar, otherwise it uses nc-time-axis as you say: https://github.com/SciTools/iris/blob/4c43fc6e8a692aa637e1becf6415cd4f4f7d6475/lib/iris/plot.py#L590-L594

cgsandford commented 2 years ago

Or perhaps I've misunderstood:

I'm doing analysis rather than plotting

And you mean that your problems are specifically with plotting, but this is particularly difficult due to unfamiliarity with plotting?

Yes, my analysis is fine, it's just plotting I encounter issues with.

trexfeathers commented 2 years ago

Thanks for clarifying, @cgsandford.

As noted, Iris' default use of cftime datetimes isn't expected to cause issues at the user level, hence this not being documented. If you use Iris' plotting modules (recommended here) these specialised datetimes are handled automatically. On this gallery page you can see that no specific lines are required for the time coordinates to be automatically formatted correctly on the x-axis.

With this information in mind, is there still a need for you to use the cells() method and/or to work exclusively with Python datetimes in your workflow?

rcomer commented 2 years ago

Looking at the docstrings for the coordinate cell and cells methods, they don’t actually mention anything about the conversion to datetimes. It’s not obvious that a cell point should be of different type to its respective coordinate point - in fact this was a change at Iris v2. So it seems sensible to me to at least update those docstrings to state we get cftime.datetimes for the points and bounds.

However I wonder if the root of @cgsandford’s problem is that the Sphinx search is not very clever and only finds exact word matches. So you get very different results if you search timeseries vs time series. One workaround is to use an external search engine. For plotting examples I tend to go straight to the gallery and browse for something that looks vaguely similar to what I want to do. That section of the User Guide that @trexfeathers linked might benefit from a link to the gallery.

trexfeathers commented 2 years ago

@bsherratt has provided a use case when a cftime datetime is definitely NOT appropriate: when working with time-zones (which cftime can't represent). In such cases the preferred method is to manually construct the datetime objects starting with the raw numbers from Coord.points.
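
A minimal sketch of that manual construction, using only the standard library. It assumes the coordinate's units are "hours since 1970-01-01 00:00:00", interpreted as UTC on the standard calendar; `raw_points` stands in for the values of `Coord.points`, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference time, taken from a units string such as
# "hours since 1970-01-01 00:00:00" and interpreted as UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def points_to_aware_datetimes(raw_points, tz=timezone.utc):
    """Convert hours-since-epoch numbers to timezone-aware datetimes."""
    return [
        (EPOCH + timedelta(hours=float(p))).astimezone(tz) for p in raw_points
    ]


# Stand-in for the raw numbers in Coord.points (illustrative values).
raw_points = [0.0, 6.0, 12.5]
local_tz = timezone(timedelta(hours=1))  # a fixed UTC+01:00 offset

for dt in points_to_aware_datetimes(raw_points, tz=local_tz):
    print(dt.isoformat())
```

A fixed offset is used here to keep the sketch dependency-free; a named zone with daylight-saving rules would need `zoneinfo` or similar.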

cgsandford commented 1 year ago

Noting down yet another instance where this has caused me issues - this time in designing unit tests. Thankfully this issue now comes up as an early hit in a Google search to obtain the workaround.

cgsandford commented 1 year ago

And again ... you'd think I'd have memorised this by now but nope!

ESadek-MO commented 1 year ago

Hey @cgsandford.

Would you fancy bringing this to the UK Met Office AVD Surgery so we can talk in more detail?

edmundhenley-mo commented 11 months ago

@cgsandford sorry to hear you're experiencing problems. You're right that Iris' use of cftime datetimes isn't documented, but then again this isn't something that's expected to cause problems in user scripts.

As noted, using Iris' own plotting functions sorts out Matplotlib (because Iris uses nc-time-axis). What other datetime incompatibilities are you experiencing, beyond Matplotlib?

Hey @trexfeathers / all - just ran into this myself in a user script (mine!) - not 1st time, but 1st time I've dug enough to find this. Mentioning to add a space weather use-case. Happy to discuss further if useful!

My specific use-case/context: using iris to load in a netCDF file with auroral data, then plotting with cartopy.

Issues arose as I updated code:

More details/actual code in commit message and linked code in a178520 (MO-only repo, commit should be visible just above this comment to MO people with access to MO GH org, sorry anyone else). Pseudo-code below.

Expand for detail on pseudocode & changes I had to make

```python
prob_cube = iris.load_cube(pathfile, "probability of aurora being observable")
times = prob_cube.coord("time")
dts = times.units.num2date(times.points)
# snip all the plotting setup
for dt in dts:
    # fill_dark_side: see old cartopy gallery link
    fill_dark_side(ax, time=dt, color='black', alpha=0.75)
```

Change I was trying to make, which failed:

```diff
--- fill_dark_side(ax, time=dt, color='black', alpha=0.75)
+++ ax.add_feature(Nightshade(date=dt, color="black", alpha=0.75))
```

Change I had to make, which worked:

```diff
--- dts = times.units.num2date(times.points)
+++ dts = times.units.num2pydate(times.points)
--- fill_dark_side(ax, time=dt, color='black', alpha=0.75)
+++ ax.add_feature(Nightshade(date=dt, color="black", alpha=0.75))
```

As per comment in my commit, I guess this could be classed as a cartopy issue - e.g. "Extend Nightshade to handle cftimes".

But IMO @rcomer is bang on the money re root cause here being docs and discoverability of all this.

Expand for my pitch for why discoverability in iris docs is currently an issue, and some tentative ideas on options for improving

Rationale for why I think this is an issue, ideally at least partially addressable in iris docs:

* I'd no idea of [cftime's num2pydate](https://unidata.github.io/cftime/api.html#cftime.num2pydate) until I came to this issue (ta @rcomer - I used this to fix!) - while I've done many deep dives into iris' API, I've not done same for cftime (maybe my mistake!)
* IMO would be lovely to have some top-level iris docs on time shenanigans
  * Could be quite light-touch, along lines of "By default, iris uses cftimes. If you need datetimes, you can convert a cftime to datetime using..."
  * Say this as my 1st hit on google (before found this issue) involved [str-based cftime <> datetime conversions](https://stackoverflow.com/a/69728631), which seems a bit wrong!

Currently in iris docs (noting @rcomer's point ^^^ re sphinx searchability issues!) there's *apparently*:

* [no hits for num2pydate](https://scitools-iris.readthedocs.io/en/v3.7.0/search.html?q=num2pydate)
* [only 1 hit for num2date](https://scitools-iris.readthedocs.io/en/v3.7.0/search.html?q=num2date#)
  * *??? - I **def** know about using num2date - it's in loads of my code*
  * *num2date is in enough of my code that in principle I *guess* it's possible that I originally found out about the num2date incantations I use via other routes @cgsandford mentions or [RTSL](https://blog.codinghorror.com/learn-to-read-the-source-luke/), and that ever since I've been doing old-code-look-up each time I want to know "how do I do that again?" when writing new code needing datetimes.*
  * *Honestly though this seems unlikely - normally my discovery route for all things iris-y are your very fine docs!*
  * *So much so that this is definitely my 1st & preferred goto when writing new code, far ahead of "look up old code"!*
  * *Edit: Hmm. I just did a sanity check, and can't see num2date mentioned in most obvious place in docs I'd have learned about it / be looking for it: [User Guide > Subsetting a cube > Constraining on time](https://scitools-iris.readthedocs.io/en/latest/userguide/subsetting_a_cube.html#constraining-on-time)*
  * *So I **guess** it's possible that to date (!), num2date has been exception to prove rule on my "I use iris docs rather than old code" point above*

edmundhenley-mo commented 11 months ago

BTW, just in case I've excluded this prematurely as an option here, just to say that when I was 1st debugging this, I was hunting for a .to_datetime() method on the cftime object (dt below), so that I could simply change my Nightshade call in pseudocode ^^^, leaving iris manipulations out of it, something like

```diff
---    ax.add_feature(Nightshade(date=dt, color="black", alpha=0.75))
+++    ax.add_feature(Nightshade(date=dt.to_datetime(), color="black", alpha=0.75))
```

This doesn't seem to be an option: there's no public to_datetime method on the cftime object. However it does have a really tempting private _to_real_datetime method, which quacks right - see below. The only reason I've not used this is that leading "I'm private - don't use me!" underscore.

So far I've assumed not worth raising an issue in cftime seeking to get a public method counterpart/interface to _to_real_datetime (or checked previously-raised cftime issues) - I've assumed that there will be good cftime design reasons for this being private.

I mention here however only in case you think my assumptions wrong, as this was my 1st thought on how to address this - and doesn't put so much onus on iris (though IMO extension of iris docs here still useful)!
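
One way to stay on the public interface is to rebuild the datetime from the public attributes a cftime datetime exposes (`year`, `month`, `day`, `hour`, `minute`, `second`, `microsecond`). This is a minimal sketch, not a cftime feature: the helper name is hypothetical, and a `SimpleNamespace` stands in for the cftime object so the example runs without cftime. The result is only meaningful for calendars whose dates also exist in Python's proleptic Gregorian calendar; otherwise `datetime` raises `ValueError`.

```python
from datetime import datetime
from types import SimpleNamespace


def to_pydatetime(cf_dt):
    """Build a datetime.datetime from an object's public date/time attributes."""
    return datetime(
        cf_dt.year, cf_dt.month, cf_dt.day,
        cf_dt.hour, cf_dt.minute, cf_dt.second, cf_dt.microsecond,
    )


# Stand-in carrying the same public attributes as a cftime datetime.
fake_cf_dt = SimpleNamespace(
    year=2018, month=5, day=9, hour=10, minute=10, second=0, microsecond=0
)
print(to_pydatetime(fake_cf_dt))

# A date that only exists in a 360-day calendar cannot be converted.
try:
    to_pydatetime(SimpleNamespace(
        year=2018, month=2, day=30, hour=0, minute=0, second=0, microsecond=0
    ))
except ValueError as err:
    print("not representable:", err)
```

Converting at the source with `num2pydate`, as in the fix above, avoids the per-object rebuild when the calendar allows it.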

Expand to see there's no obvious public method in cftime object - but a tempting `_to_real_datetime` private method

From my `%debug` session:

```python
ipdb> dt
cftime.DatetimeGregorian(2018, 5, 9, 10, 10, 0, 0, has_year_zero=False)
ipdb> dir(dt)  # no obvious `.to_datetime` method
['__add__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__pyx_vtable__', '__radd__', '__reduce__', '__reduce_ex__', '__repr__', '__rsub__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '_dayofwk', '_dayofyr', '_to_real_datetime', 'calendar', 'change_calendar', 'datetime_compatible', 'day', 'dayofwk', 'dayofyr', 'daysinmonth', 'format', 'fromordinal', 'has_year_zero', 'hour', 'isoformat', 'microsecond', 'minute', 'month', 'replace', 'second', 'strftime', 'strptime', 'timetuple', 'toordinal', 'tzinfo', 'year']
ipdb> aa = dt._to_real_datetime()  # being naughty & trying private method
ipdb> type(aa)  # no cigar: not desired datetime.datetime
ipdb> dir(aa)  # Mind you it quacks right - cf actual datetime below. Worth asking cftime to unprivate _to_real_datetime?
['__add__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__radd__', '__reduce__', '__reduce_ex__', '__repr__', '__rsub__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__weakref__', 'astimezone', 'combine', 'ctime', 'date', 'day', 'dayofwk', 'dayofyr', 'daysinmonth', 'dst', 'fold', 'fromisocalendar', 'fromisoformat', 'fromordinal', 'fromtimestamp', 'hour', 'isocalendar', 'isoformat', 'isoweekday', 'max', 'microsecond', 'min', 'minute', 'month', 'nanosecond', 'now', 'replace', 'resolution', 'second', 'strftime', 'strptime', 'time', 'timestamp', 'timetuple', 'timetz', 'today', 'toordinal', 'tzinfo', 'tzname', 'utcfromtimestamp', 'utcnow', 'utcoffset', 'utctimetuple', 'weekday', 'year']
ipdb> import datetime
ipdb> dir(datetime.datetime.utcnow())
['__add__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__radd__', '__reduce__', '__reduce_ex__', '__repr__', '__rsub__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', 'astimezone', 'combine', 'ctime', 'date', 'day', 'dst', 'fold', 'fromisocalendar', 'fromisoformat', 'fromordinal', 'fromtimestamp', 'hour', 'isocalendar', 'isoformat', 'isoweekday', 'max', 'microsecond', 'min', 'minute', 'month', 'now', 'replace', 'resolution', 'second', 'strftime', 'strptime', 'time', 'timestamp', 'timetuple', 'timetz', 'today', 'toordinal', 'tzinfo', 'tzname', 'utcfromtimestamp', 'utcnow', 'utcoffset', 'utctimetuple', 'weekday', 'year']
```

As it's a private method, I'm not being naughty and doing following, even though looks like it would work:

```diff
--- ax.add_feature(Nightshade(date=dt, color="black", alpha=0.75))
+++ ax.add_feature(Nightshade(date=dt._to_real_datetime(), color="black", alpha=0.75))
```

ESadek-MO commented 10 months ago

@edmundhenley-mo @SciTools/peloton thinks this would be best discussed in person, perhaps at the next aforementioned surgery, or alternatively a dedicated meeting?

edmundhenley-mo commented 6 months ago

Eep, sorry for radio silence @ESadek-MO - just noticed I completely missed your reply at time. Yes, happy to discuss in person - will follow up separately!