xarray-contrib / flox

Fast & furious GroupBy operations for dask.array
https://flox.readthedocs.io
Apache License 2.0
124 stars 18 forks source link

Optimize for-loop merging of cohorts. #378

Closed dcherian closed 3 months ago

dcherian commented 3 months ago

Do this by skipping perfect cohorts that we already know about.

For the watershed problem, the hotspot is now the sparse matrix multiply to calculate containment. Down to 1.18s from 7s!


Unsure why the number of cohorts has gone up for the OISST benchmark

| +        | 243                        | 299                                            |    1.23 | cohorts.OISST.track_num_cohorts                        |

Other benchmarks don't capture the improvement.

| Change   | Before [8e522b5a] <main>   | After [e8b488d1] <optimize-cohorts-for-loop>   |   Ratio | Benchmark (Parameter)                                  |
|----------|----------------------------|------------------------------------------------|---------|--------------------------------------------------------|
| +        | 978                        | 1202                                           |    1.23 | cohorts.OISST.track_num_layers                         |
| +        | 146±0.4ms                  | 178±3ms                                        |    1.21 | cohorts.OISST.time_graph_construct                     |
| +        | 16858                      | 19633                                          |    1.16 | cohorts.OISST.track_num_tasks_optimized                |
| +        | 18798                      | 21684                                          |    1.15 | cohorts.OISST.track_num_tasks                          |
| -        | 201±0.4μs                  | 181±0.7μs                                      |    0.9  | cohorts.ERA5Google.time_find_group_cohorts             |
| -        | 7.96±0.06ms                | 6.96±0.1ms                                     |    0.87 | cohorts.RandomBigArray.time_graph_construct            |
| -        | 5.91±0.08ms                | 4.89±0.02ms                                    |    0.83 | cohorts.RandomBigArray.time_find_group_cohorts         |
| -        | 3.70±0.02ms                | 2.87±0.02ms                                    |    0.78 | cohorts.ERA5MonthHourRechunked.time_find_group_cohorts |
| -        | 3.46±0.01ms                | 2.67±0.02ms                                    |    0.77 | cohorts.ERA5MonthHour.time_find_group_cohorts          |
dcherian commented 3 months ago

Ah found an independent change that reduces number of tiny cohorts:

  | Before [8e522b5a] <main> | After [c9aacf55] | Ratio | Benchmark (Parameter)                                  |
  |--------------------------+------------------+-------+--------------------------------------------------------|
  | 6.52±0.1ms               | 5.88±0.2ms       |   0.9 | cohorts.ERA5MonthHourRechunked.time_graph_construct    |
  | 208±5μs                  | 183±4μs          |  0.88 | cohorts.ERA5Google.time_find_group_cohorts             |
  | 6.55±0.04ms              | 5.74±0.09ms      |  0.88 | cohorts.OISST.time_find_group_cohorts                  |
  | 8.39±0.3ms               | 7.11±0.06ms      |  0.85 | cohorts.RandomBigArray.time_graph_construct            |
  | 5.94±0.06ms              | 4.90±0.02ms      |  0.82 | cohorts.RandomBigArray.time_find_group_cohorts         |
  | 3.74±0.03ms              | 2.97±0.2ms       |  0.79 | cohorts.ERA5MonthHourRechunked.time_find_group_cohorts |
  | 3.52±0.05ms              | 2.69±0.01ms      |  0.76 | cohorts.ERA5MonthHour.time_find_group_cohorts          |
  | 18798                    | 12258            |  0.65 | cohorts.OISST.track_num_tasks                          |
  | 16858                    | 10571            |  0.63 | cohorts.OISST.track_num_tasks_optimized                |
  | 146±0.8ms                | 78.5±2ms         |  0.54 | cohorts.OISST.time_graph_construct                     |
  | 243                      | 117              |  0.48 | cohorts.OISST.track_num_cohorts                        |
  | 978                      | 474              |  0.48 | cohorts.OISST.track_num_layers                         |