Closed aaraney closed 1 year ago
For reference:
https://pandas.pydata.org/docs/whatsnew/v2.0.0.html
It looks like they changed a lot of implicit behavior to be more explicit. We were already moving in that direction, so hopefully any ill-effects will be minimized. I think the hydrotools code base relied on pandas library equivalents to standard library structures (e.g. pandas.Timestamp vs datetime.datetime) where possible. I think explicit index slicing has also been minimized in favor of data-centric DataFrame querying and parsing. So, hopefully necessary changes will be minimal. The biggest change I'm concerned about is the default copying behavior. That's probably worth a closer look since optimizing memory-usage has been a priority.
As we discover issues, I would lobby for trying to find solutions that are as "pandas-native" as possible and align with the expectations of the pandas library as much as possible. This approach might help minimize issues with future major pandas releases.
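As a rough illustration of the copying concern, here is a minimal sketch of the opt-in copy-on-write mode. It assumes a pandas version that has the `mode.copy_on_write` option (1.5 or later, as far as I know); the column names and the helper function are made up for the example and are not from hydrotools.

```python
import pandas as pd


def series_write_leaves_parent_unchanged() -> bool:
    """Write into a column pulled from a DataFrame and report whether
    the parent DataFrame kept its original value.

    Hypothetical helper for illustration only.
    """
    # Assumption: the "mode.copy_on_write" option exists in the installed pandas.
    with pd.option_context("mode.copy_on_write", True):
        df = pd.DataFrame({"flow": [1.0, 2.0, 3.0], "site": ["a", "b", "c"]})

        # With copy-on-write enabled, this Series shares data with df
        # only until one of them is modified.
        flow = df["flow"]
        flow.iloc[0] = 100.0  # triggers a copy; df should be untouched

        return bool(df.loc[0, "flow"] == 1.0)


if __name__ == "__main__":
    print(series_write_leaves_parent_unchanged())
```

If something like this behaves differently with and without the option in our code, that is probably a spot where we were relying on implicit view semantics.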
> The biggest change I'm concerned about is the default copying behavior. That's probably worth a closer look since optimizing memory-usage has been a priority.
Likewise. I will try to get around to running some memory benchmarks using our test suite as a naive first comparison this afternoon / tomorrow morning and report my findings here.
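For a naive first look like the one described above, something along these lines might be enough. It is only a sketch using the standard library's `tracemalloc`; `measure_peak_memory` and `build_rows` are hypothetical names, and `tracemalloc` may not capture every allocation made by native extension code, so treat the numbers as relative rather than absolute.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak_memory(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, int]:
    """Call func and return (result, peak traced memory in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) since start()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_rows(n: int) -> list:
    """Stand-in workload; swap in a real call from the test suite."""
    return [float(i) for i in range(n)]


if __name__ == "__main__":
    rows, peak = measure_peak_memory(build_rows, 100_000)
    print(f"{len(rows)} rows, peak {peak / 1e6:.2f} MB")
```

Running the same workload under each pandas version and comparing the peaks would give a first-order comparison without adding any dependencies.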
So far, it looks like all tests are passing on pandas==2.0.2 and FWIW it appears that python==3.10 may be finishing faster than 3.8 and 3.9.
Thanks for digging through the logs, @jarq6c! Given that we aren't experiencing any issues thus far, I am going to close this. We can always reopen if needed in the future.
It was brought to my attention by a colleague that pandas 2.0.0 was released yesterday. I've not thoroughly checked yet to see if this affects us or not. So far I've just run nwis_client, nwm_client, and metric's unit tests using pandas==2.0.0 on python 3.9.16.
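When comparing runs like these, it can help to record which pandas version an environment actually resolved to. A small sketch using only the standard library (`importlib.metadata`, Python 3.8+); both helper names are hypothetical, and the version parsing is deliberately simplistic (it only reads the leading number).

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string for package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Extract the leading major number from a version like '2.0.0'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    found = installed_version("pandas")
    if found is None:
        print("pandas is not installed in this environment")
    else:
        print(f"pandas {found} (major version {major_version(found)})")
```

Printing this at the top of a test run makes it easier to tell afterwards whether a log came from a 1.x or 2.x environment.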