Pandas >= 2.0.0 package compliance audit

aaraney commented 1 year ago

A colleague brought to my attention that pandas 2.0.0 was released yesterday. I've not yet thoroughly checked whether this affects us. So far I've just run the nwis_client, nwm_client, and metrics unit tests using pandas==2.0.0 on Python 3.9.16.

jarq6c commented 1 year ago

For reference:

https://pandas.pydata.org/docs/whatsnew/v2.0.0.html

It looks like they changed a lot of implicit behavior to be more explicit. We were already moving in that direction, so hopefully any ill effects will be minimized. I think the hydrotools code base relies on pandas equivalents of standard library structures (e.g. pandas.Timestamp instead of datetime.datetime) where possible. I think explicit index slicing has also been minimized in favor of data-centric DataFrame querying and parsing, so the necessary changes should be small. The biggest change I'm concerned about is the default copying behavior. That's probably worth a closer look since optimizing memory-usage has been a priority.
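As a minimal sketch of the copying concern: the snippet below takes an explicit copy of a filtered DataFrame before modifying it, which keeps the original untouched regardless of pandas version, at the cost of the extra memory the copy uses. The column name and values are made up for illustration and are not taken from hydrotools.

```python
import pandas as pd

# Hypothetical data; not taken from the hydrotools code base.
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]})

# Take an explicit copy of the filtered rows so that later writes
# cannot propagate back to `df`, whatever the default copy semantics are.
subset = df[df["value"] > 1.0].copy()
subset.loc[:, "value"] = 0.0

# `df` still holds the original values; only `subset` was modified.
print(df["value"].tolist())
print(subset["value"].tolist())
```

Being explicit like this trades memory for predictability, so it is probably only worth doing where a modified subset is actually needed.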

As we discover issues, I would lobby for trying to find solutions that are as "pandas-native" as possible and align with the expectations of the pandas library as much as possible. This approach might help minimize issues with future major pandas releases.

aaraney commented 1 year ago

> The biggest change I'm concerned about is the default copying behavior. That's probably worth a closer look since optimizing memory-usage has been a priority.

Likewise. I will try to run some memory benchmarks using our test suite this afternoon or tomorrow morning as a naive first comparison, and report my findings here.
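A naive first look of this kind could be sketched with the standard library's tracemalloc module. The helper name `peak_memory_bytes` and the `build_list` workload below are hypothetical stand-ins, not part of hydrotools or its test suite, and tracemalloc only sees allocations made through Python's memory allocators, so the numbers are a rough guide at best.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return (result, peak traced memory in bytes).

    Hypothetical helper for illustration only.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_list(n):
    # Stand-in workload; a real comparison would call the code under test.
    return list(range(n))


result, peak = peak_memory_bytes(build_list, 100_000)
print(f"peak traced memory: {peak / 1e6:.2f} MB")
```

Running the same workload under two pandas versions and comparing the reported peaks would give a coarse before/after picture.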

jarq6c commented 1 year ago

So far, it looks like all tests are passing on pandas==2.0.2. FWIW, it appears that python==3.10 may be finishing faster than 3.8 and 3.9.

aaraney commented 1 year ago

Thanks for digging through the logs, @jarq6c! Given that we aren't experiencing any issues thus far, I am going to close this. We can always reopen it in the future if needed.

NOAA-OWP / hydrotools

Pandas >= 2.0.0 package compliance audit #213