pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0
3.49k stars 1.04k forks source link

Revert addition of fastpath causing incompatibilities with cftime #9192

Open hmaarrfk opened 3 days ago

hmaarrfk commented 3 days ago

See discussion in #9138 Fixes #9138

This commit and pull request mostly serves as a staging group for a potential fix.

Test with:

pytest xarray/tests/test_cftimeindex.py::test_cftime_noleap_with_str
hmaarrfk commented 3 days ago

The comment https://github.com/pydata/xarray/pull/9002/files#r1591214299 had the right intuition. The fastpath was the source of problems.

Empirically, i tested to see if it had any effects on the internal benchmarks that brought me to introducing it and it seems it does not (we avoid pandas quite heavily due to slowdowns it introduces, our data access is quite regular).

hmaarrfk commented 3 days ago

ps. i have visitors this week, so feel free to push to adjust anything you feel the need to change.

hmaarrfk commented 1 day ago

Not saying anything should change here, just giving the results of the benchmark

asv continuous main calendar_noleap --bench indexing

| Change   | Before [42ed6d30] <main>   | After [822c0b8e] <calendar_noleap>   |   Ratio | Benchmark (Parameter)                                      |
|----------|----------------------------|--------------------------------------|---------|------------------------------------------------------------|
| +        | 337±1μs                    | 618±1μs                              |    1.83 | indexing.AssignmentOptimized.time_assign_identical_indexes |
| +        | 67.4±0.6μs                 | 77.7±0.5μs                           |    1.15 | indexing.HugeAxisSmallSliceIndexing.time_indexing          |
| +        | 139±0.7μs                  | 159±0.5μs                            |    1.14 | indexing.Assignment.time_assignment_basic('1slice')        |
max-sixty commented 1 day ago

@pydata/xarray who is the best person to review this?

dcherian commented 1 day ago

I believe the solution in https://github.com/pydata/xarray/issues/9138#issuecomment-2198145554 is a better path