Optimize for-loop merging of cohorts.

xarray-contrib / flox

Fast & furious GroupBy operations for dask.array

Apache License 2.0

124 stars 18 forks source link

[x] delete print statements

Do this by skipping perfect cohorts that we already know about.

For the watershed problem, the hotspot is now the sparse matrix multiply to calculate containment. Down to 1.18s from 7s!

Unsure why the number of cohorts has gone up for the OISST benchmark

| +        | 243                        | 299                                            |    1.23 | cohorts.OISST.track_num_cohorts                        |

Other benchmarks don't capture the improvement.

| Change   | Before [8e522b5a] <main>   | After [e8b488d1] <optimize-cohorts-for-loop>   |   Ratio | Benchmark (Parameter)                                  |
|----------|----------------------------|------------------------------------------------|---------|--------------------------------------------------------|
| +        | 978                        | 1202                                           |    1.23 | cohorts.OISST.track_num_layers                         |
| +        | 146±0.4ms                  | 178±3ms                                        |    1.21 | cohorts.OISST.time_graph_construct                     |
| +        | 16858                      | 19633                                          |    1.16 | cohorts.OISST.track_num_tasks_optimized                |
| +        | 18798                      | 21684                                          |    1.15 | cohorts.OISST.track_num_tasks                          |
| -        | 201±0.4μs                  | 181±0.7μs                                      |    0.9  | cohorts.ERA5Google.time_find_group_cohorts             |
| -        | 7.96±0.06ms                | 6.96±0.1ms                                     |    0.87 | cohorts.RandomBigArray.time_graph_construct            |
| -        | 5.91±0.08ms                | 4.89±0.02ms                                    |    0.83 | cohorts.RandomBigArray.time_find_group_cohorts         |
| -        | 3.70±0.02ms                | 2.87±0.02ms                                    |    0.78 | cohorts.ERA5MonthHourRechunked.time_find_group_cohorts |
| -        | 3.46±0.01ms                | 2.67±0.02ms                                    |    0.77 | cohorts.ERA5MonthHour.time_find_group_cohorts          |

| Before [8e522b5a] <main> | After [c9aacf55] | Ratio | Benchmark (Parameter) | |--------------------------+------------------+-------+--------------------------------------------------------| | 6.52±0.1ms | 5.88±0.2ms | 0.9 | cohorts.ERA5MonthHourRechunked.time_graph_construct | | 208±5μs | 183±4μs | 0.88 | cohorts.ERA5Google.time_find_group_cohorts | | 6.55±0.04ms | 5.74±0.09ms | 0.88 | cohorts.OISST.time_find_group_cohorts | | 8.39±0.3ms | 7.11±0.06ms | 0.85 | cohorts.RandomBigArray.time_graph_construct | | 5.94±0.06ms | 4.90±0.02ms | 0.82 | cohorts.RandomBigArray.time_find_group_cohorts | | 3.74±0.03ms | 2.97±0.2ms | 0.79 | cohorts.ERA5MonthHourRechunked.time_find_group_cohorts | | 3.52±0.05ms | 2.69±0.01ms | 0.76 | cohorts.ERA5MonthHour.time_find_group_cohorts | | 18798 | 12258 | 0.65 | cohorts.OISST.track_num_tasks | | 16858 | 10571 | 0.63 | cohorts.OISST.track_num_tasks_optimized | | 146±0.8ms | 78.5±2ms | 0.54 | cohorts.OISST.time_graph_construct | | 243 | 117 | 0.48 | cohorts.OISST.track_num_cohorts | | 978 | 474 | 0.48 | cohorts.OISST.track_num_layers |

xarray-contrib / flox

Optimize for-loop merging of cohorts. #378