pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0
1.04k stars 287 forks source link

Replacement of deprecated `datetime.datetime.utcnow()` #2752

Open pnuu opened 3 months ago

pnuu commented 3 months ago

Feature Request

As of Python 3.12 the usage of datetime.datetime.utcnow() and .utcfromtimestamp() are deprecated and will be removed in a later version. We need to address this in Satpy (and in all the other Pytroll packages).

Describe the solution you'd like

There are some options I can think of:

  1. use naive datetimes and leave it to the user to handle problems with .now() returing time their computer is set to
  2. use naive datetimes and force UTC times for the process (os.environ["TZ"]; time.tzset())
  3. start using timezone-aware datetimes and use dt.datetime.now(dt.datetime.timezone.utc)

The first option is problematic in many ways if the computer is not in UTC, one of which is the unpredictability of the tests. Forcing local time to UTC might cause weird problems if the user wants to use real local time for some part of the processing.

So we have the third option and make all the datetimes in Satpy (and Pytroll in general) timezone-aware. This was the option we kind of agreed in the last Pytroll monthly meeting, but the attendance was pretty limited. It'll require some (lots of?) work to make sure all the datetime stuff still work as intended.

Describe any changes to existing user workflow Depends on the option. With the third option should only affect developers.

Additional context Some links that discuss the deprecation:

Ping: @djhoese @mraspaud @gerritholl @simonrp84 @ameraner @sfinkens

simonrp84 commented 3 months ago

The third option seems like the most sensible, if work-intensive, option. I can imagine the first two options causing potential unexpected issues for users with stange setups etc.

gerritholl commented 3 months ago

Agree that option three is the best alternative here. I remember now why I used option three in the ninjogeotiff writer: I was blissfully unaware of utcnow :smile:

https://github.com/pytroll/satpy/blob/2c27b35081c59e27fbc03d2f9e3df14fd8ffa215/satpy/writers/ninjogeotiff.py#L401

Depends on the option. With the third option should only affect developers.

It could affect (monitoring) tools that interpret logfiles. It could also change the format in which the CF writer writes timestamps.

pnuu commented 3 months ago

It could affect (monitoring) tools that interpret logfiles.

The log formatting can be tuned by the user if necessary. I this mostly affects Trollflow2 and the collectors.

It could also change the format in which the CF writer writes timestamps.

This shouldn't change for the users, we (the devs) should take care of maintaining that.

sfinkens commented 3 months ago

I also prefer option 3.

gerritholl commented 3 months ago

One complication with option 3: np.datetime64 does not support timezone information (it used to, but it was dropped in numpy 1.11), and xarray uses arrays of np.datetime64 for time coordinates. It is not possible to use pandas.DatetimeIndex (which does support timezones) as a time coordinate in xarray (and pandas objects can't be multidimensional). Therefore, wherever we convert between time coordinates and datetime objects, we would need to explicitly indicate that this is UTC.