Closed lgray closed 1 year ago
@lgray is this duration (~4s) representative of the typical time for a full analysis optimisation?
That's for one dataset so nominally multiply that 4s by 30-40.
Drat, I lost my saved reply!
The short answer to this issue is that the return-on-investment for optimising Awkward here is highly dependent upon the runtime of the true computation. If the computation takes ~hrs, then a two-minute optimisation is not significant in comparison, especially if single-partition workflows can be used to prototype things before the final computation.
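To put rough numbers on that trade-off, here is a small sketch. The 4 s per dataset and the 30-40 datasets come from the comments above; the two compute times (5 minutes and 2 hours) are assumptions picked only to show the two regimes.

```python
def optimisation_overhead(per_dataset_s, n_datasets, compute_s):
    """Return total optimisation time and its share of the wall time."""
    total = per_dataset_s * n_datasets
    return total, total / (total + compute_s)


# 4 s per dataset and 30-40 datasets are quoted in this thread;
# the compute times below are illustrative assumptions.
for n_datasets in (30, 40):
    for compute_s in (5 * 60, 2 * 3600):
        total, share = optimisation_overhead(4.0, n_datasets, compute_s)
        print(
            f"{n_datasets} datasets, {compute_s / 60:.0f} min compute: "
            f"{total:.0f} s optimising ({share:.0%} of wall time)"
        )
```

With hours of compute the optimisation time is noise; with a reduced-file run that finishes in minutes it is a large share of the turn-around time.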
The problem here is really that we're trying to optimise in a naive way, which we touched on briefly in Slack a few weeks back:
In today's meeting, we were talking about possible paths for optimising optimisation. Some ideas:
- we've done pretty well here, in making fast layer fusions and column selection! Now the chief problem is cull
- critical to quantify scale of the problem. Can we do without extra work? Maybe the "interactive" rather than batch approach can be done without distributed and not benefit from (some) optimisations due to lower task overhead
- my suggestion, somewhere above, of little class hacks to avoid materialising big dicts and sets would save on memory but not time
- Other potential data types like enums, dataclasses, string conventions, numerical hashes instead of python tuples
- @lgray's attempts at compiling inner loops for speed (in numba, cython, rust, etc.). Anything other than numba would greatly complicate build and distribution. I have my doubts about how much mole-whacking can or should be done here.
- @lgray just made a way to select only one partition throughout the graph (if it is all blockwise or tree reduction). Can we use something based on that as our cull() and ignore dask's version?
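As a rough illustration of the last idea, here is a toy sketch of selecting a single partition from a blockwise-style graph. It uses a plain dict with `(name, partition)` keys and `(func, *args)` tasks; `select_partition` and the layer names are hypothetical, and this is neither dask's `cull` nor the dask-awkward implementation.

```python
def select_partition(graph, outputs, partition):
    """Keep only the tasks needed to compute `outputs` for one partition."""
    wanted = [(name, partition) for name in outputs]
    kept = {}
    while wanted:
        key = wanted.pop()
        if key in kept:
            continue
        task = graph[key]
        kept[key] = task
        _func, *args = task
        # Arguments that are themselves keys of the graph are dependencies.
        wanted.extend(a for a in args if isinstance(a, tuple) and a in graph)
    return kept


# Toy blockwise graph: every layer has one task per partition.
npartitions = 4
graph = {}
for i in range(npartitions):
    graph[("read", i)] = (list, [i, i + 1])
    graph[("pt", i)] = (sorted, ("read", i))
    graph[("unused", i)] = (len, ("read", i))
    graph[("total", i)] = (sum, ("pt", i))

one = select_partition(graph, ["total"], 0)
print(sorted(one))
```

The traversal visits only the tasks of one partition, so its cost does not grow with the number of partitions, and layers that do not feed the requested output are dropped along the way.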
Ultimately, Dask needs to do less here, in my view. In Awkward, I don't see any hotspots, so shaving this down by e.g. 50% is going to be a lot of work that will probably have an impact on readability.
That said, could you give https://github.com/scikit-hep/awkward/pull/2464 a try? There is some low hanging fruit in ak._behavior.
I will try it soon! Thanks!
And yeah I understand the trade-offs and investments. This is why I marked this lower priority, since it's kinda thorny, and not immediately pressing, and the payoff is somewhat unclear.
My one worry is that people do spend a fair amount of time running over all datasets but with a reduced number of files, where the turn-around time for an answer is minutes. Optimization then becomes a sticking point: they are still looping over all datasets without processing them completely, yet still paying full "taxes" for compiling the graphs.
Re-reading @martindurant's notes - I understand what he's talking about with the cull now! We can do a bit of a switch-a-roo, getting the graph topology from 'faking' one partition of data and then insert the necessary keys.
I can see how to do that! I'll try it soon.
That may be a better solution than numba (but I would still like to pursue that).
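A toy sketch of that switch-a-roo, under the assumption that every layer is blockwise: optimise a one-partition template, then stamp it out for each real partition and swap in the per-partition inputs. The names `expand_template`, `load`, and `get` are hypothetical, and the graph is a plain dict rather than a dask `HighLevelGraph`.

```python
def expand_template(template, input_name, load, partition_inputs):
    """Replicate a one-partition graph across all partitions."""
    full = {}
    for i, source in enumerate(partition_inputs):
        for (name, _), task in template.items():
            if name == input_name:
                # The input layer gets its real per-partition argument.
                full[(name, i)] = (load, source)
                continue
            func, *args = task
            new_args = [
                (a[0], i) if isinstance(a, tuple) and a in template else a
                for a in args
            ]
            full[(name, i)] = (func, *new_args)
    return full


def get(graph, key):
    """Tiny recursive executor, only here to check the expanded graph."""
    func, *args = graph[key]
    return func(*[get(graph, a) if isinstance(a, tuple) and a in graph else a
                  for a in args])


def double(values):
    return [2 * v for v in values]


# Template built once from a single "fake" partition.
template = {
    ("read", 0): (list, None),
    ("double", 0): (double, ("read", 0)),
    ("total", 0): (sum, ("double", 0)),
}

full = expand_template(template, "read", list, [[1, 2], [3], [4, 5, 6]])
print(len(full), get(full, ("total", 2)))
```

The expensive work (tracing, fusion, column selection) would happen once on the template; the expansion itself is only key rewriting.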
https://github.com/scikit-hep/awkward/pull/2464 shaves nearly a second off of dak.necessary_columns! (On a Mac laptop; someone should check on a standard Linux setup.)
I'm closing this for now; Dask's just doing quite a lot of (redundant, in this case) work. From looking at this, I didn't see any more low hanging fruit, though it may be possible to revisit the problem in future.
OK - we should give a tiny peek to the churn that was introduced with awkward 2.2.2 since that started setting off warnings from a time-spent-in-gc threshold in dask. (unless that's what you meant you looked at)
Do you have a reproducer for that? It would be good to profile it, if we can reproduce on a single-threaded cluster.
Using: https://github.com/dask-contrib/dask-awkward/files/11710067/complex_repro.zip
coffea 2023.6.0rc0 + awkward 2.2.1, pyinstrument for dak.necessary_columns (which is single-threaded execution on empty data):
time col: 3.519811334
Recorded: 16:00:47  Samples: 3369
Duration: 3.520     CPU time: 3.535
pyinstrument v4.3.0
Program: yimu_test5.py
3.519 <module> yimu_test5.py:1
└─ 3.519 necessary_columns dask_awkward/lib/inspect.py:11
└─ 3.519 _necessary_columns dask_awkward/lib/optimize.py:390
└─ 3.516 _get_column_reports dask_awkward/lib/optimize.py:333
├─ 3.436 get_sync dask/local.py:551
│ └─ 3.435 get_async dask/local.py:350
│ ├─ 3.020 fire_tasks dask/local.py:452
│ │ └─ 2.956 submit dask/local.py:539
│ │ └─ 2.910 batch_execute_tasks dask/local.py:234
│ │ └─ 2.908 <listcomp> dask/local.py:238
│ │ └─ 2.908 execute_task dask/local.py:214
│ │ └─ 2.898 _execute_task dask/core.py:82
│ │ └─ 2.866 __call__ dask/optimization.py:987
│ │ └─ 2.855 get dask/core.py:128
│ │ └─ 2.825 _execute_task dask/core.py:82
│ │ ├─ 0.884 __array_ufunc__ awkward/highlevel.py:1291
│ │ │ └─ 0.843 array_ufunc awkward/_connect/numpy.py:213
│ │ │ ├─ 0.548 broadcast_and_apply awkward/_broadcasting.py:1063
│ │ │ │ └─ 0.523 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.498 continuation awkward/_broadcasting.py:1008
│ │ │ │ └─ 0.491 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ └─ 0.441 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.412 continuation awkward/_broadcasting.py:1008
│ │ │ │ ├─ 0.370 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ │ └─ 0.338 apply_step awkward/_broadcasting.py:355
│ │ │ │ │ ├─ 0.144 continuation awkward/_broadcasting.py:1008
│ │ │ │ │ │ └─ 0.119 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ │ │ └─ 0.100 apply_step awkward/_broadcasting.py:355
│ │ │ │ │ │ └─ 0.073 action awkward/_connect/numpy.py:222
│ │ │ │ │ │ └─ 0.063 find_ufunc awkward/_behavior.py:97
│ │ │ │ │ ├─ 0.122 [self]
│ │ │ │ │ └─ 0.066 action awkward/_connect/numpy.py:222
│ │ │ │ │ └─ 0.052 find_ufunc awkward/_behavior.py:97
│ │ │ │ └─ 0.036 broadcast_any_indexed awkward/_broadcasting.py:964
│ │ │ ├─ 0.184 recursively_apply awkward/_do.py:20
│ │ │ │ └─ 0.141 _recursively_apply awkward/contents/listoffsetarray.py:2071
│ │ │ │ └─ 0.135 continuation awkward/contents/listoffsetarray.py:2086
│ │ │ │ └─ 0.095 _recursively_apply awkward/contents/numpyarray.py:1261
│ │ │ │ └─ 0.095 unary_action awkward/_connect/numpy.py:303
│ │ │ │ └─ 0.095 action awkward/_connect/numpy.py:222
│ │ │ │ └─ 0.079 find_ufunc awkward/_behavior.py:97
│ │ │ │ └─ 0.038 __iter__ _collections_abc.py:742
│ │ │ ├─ 0.049 _array_ufunc_custom_cast awkward/_connect/numpy.py:141
│ │ │ └─ 0.036 wrap_layout awkward/_layout.py:19
│ │ ├─ 0.436 apply dask/utils.py:41
│ │ │ ├─ 0.218 sum awkward/operations/ak_sum.py:13
│ │ │ │ └─ 0.202 _impl awkward/operations/ak_sum.py:267
│ │ │ │ └─ 0.187 reduce awkward/_do.py:262
│ │ │ │ └─ 0.140 _reduce_next awkward/contents/listoffsetarray.py:1464
│ │ │ │ └─ 0.045 _reduce_next awkward/contents/listoffsetarray.py:1464
│ │ │ ├─ 0.065 count awkward/operations/ak_count.py:12
│ │ │ │ └─ 0.061 _impl awkward/operations/ak_count.py:110
│ │ │ │ └─ 0.058 reduce awkward/_do.py:262
│ │ │ │ └─ 0.037 _reduce_next awkward/contents/listoffsetarray.py:1464
│ │ │ ├─ 0.058 mean awkward/operations/ak_mean.py:13
│ │ │ │ └─ 0.057 _impl awkward/operations/ak_mean.py:152
│ │ │ └─ 0.040 min awkward/operations/ak_min.py:13
│ │ │ └─ 0.040 _impl awkward/operations/ak_min.py:148
│ │ │ └─ 0.038 reduce awkward/_do.py:262
│ │ ├─ 0.433 __call__ dask_awkward/lib/core.py:1895
│ │ │ ├─ 0.321 pt coffea/nanoevents/methods/vector.py:121
│ │ │ │ └─ 0.320 r coffea/nanoevents/methods/vector.py:85
│ │ │ │ └─ 0.293 r2 coffea/nanoevents/methods/vector.py:111
│ │ │ │ └─ 0.261 func numpy/lib/mixins.py:18
│ │ │ │ └─ 0.260 __array_ufunc__ awkward/highlevel.py:1291
│ │ │ │ └─ 0.249 array_ufunc awkward/_connect/numpy.py:213
│ │ │ │ └─ 0.223 broadcast_and_apply awkward/_broadcasting.py:1063
│ │ │ │ └─ 0.217 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.205 continuation awkward/_broadcasting.py:1008
│ │ │ │ └─ 0.203 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ └─ 0.192 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.188 continuation awkward/_broadcasting.py:1008
│ │ │ │ ├─ 0.120 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ │ └─ 0.106 apply_step awkward/_broadcasting.py:355
│ │ │ │ │ └─ 0.093 continuation awkward/_broadcasting.py:1008
│ │ │ │ │ └─ 0.074 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ │ └─ 0.063 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.065 broadcast_any_indexed awkward/_broadcasting.py:964
│ │ │ │ └─ 0.043 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.043 continuation awkward/_broadcasting.py:1008
│ │ │ │ └─ 0.043 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ └─ 0.040 apply_step awkward/_broadcasting.py:355
│ │ │ │ └─ 0.040 continuation awkward/_broadcasting.py:1008
│ │ │ │ └─ 0.040 broadcast_any_indexed awkward/_broadcasting.py:964
│ │ │ └─ 0.106 eta coffea/nanoevents/methods/vector.py:512
│ │ │ └─ 0.077 r coffea/nanoevents/methods/vector.py:85
│ │ │ └─ 0.069 r2 coffea/nanoevents/methods/vector.py:111
│ │ │ └─ 0.062 func numpy/lib/mixins.py:18
│ │ │ └─ 0.062 __array_ufunc__ awkward/highlevel.py:1291
│ │ │ └─ 0.057 array_ufunc awkward/_connect/numpy.py:213
│ │ │ └─ 0.052 broadcast_and_apply awkward/_broadcasting.py:1063
│ │ │ └─ 0.051 apply_step awkward/_broadcasting.py:355
│ │ │ └─ 0.051 continuation awkward/_broadcasting.py:1008
│ │ │ └─ 0.051 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ └─ 0.044 apply_step awkward/_broadcasting.py:355
│ │ │ └─ 0.041 continuation awkward/_broadcasting.py:1008
│ │ │ └─ 0.041 broadcast_any_list awkward/_broadcasting.py:485
│ │ ├─ 0.410 __call__ dask_awkward/lib/structure.py:860
│ │ │ └─ 0.410 with_field awkward/operations/ak_with_field.py:19
│ │ │ └─ 0.387 _impl awkward/operations/ak_with_field.py:55
│ │ │ └─ 0.361 broadcast_and_apply awkward/_broadcasting.py:1063
│ │ │ └─ 0.352 apply_step awkward/_broadcasting.py:355
│ │ │ └─ 0.350 continuation awkward/_broadcasting.py:1008
│ │ │ └─ 0.349 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ └─ 0.337 apply_step awkward/_broadcasting.py:355
│ │ │ ├─ 0.280 continuation awkward/_broadcasting.py:1008
│ │ │ │ └─ 0.266 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ └─ 0.247 apply_step awkward/_broadcasting.py:355
│ │ │ │ ├─ 0.152 continuation awkward/_broadcasting.py:1008
│ │ │ │ │ ├─ 0.093 broadcast_any_indexed awkward/_broadcasting.py:964
│ │ │ │ │ │ └─ 0.076 <listcomp> awkward/_broadcasting.py:965
│ │ │ │ │ │ └─ 0.076 project awkward/contents/indexedarray.py:408
│ │ │ │ │ │ └─ 0.074 _carry awkward/contents/recordarray.py:505
│ │ │ │ │ │ └─ 0.073 <listcomp> awkward/contents/recordarray.py:526
│ │ │ │ │ │ └─ 0.063 _carry awkward/contents/numpyarray.py:315
│ │ │ │ │ │ └─ 0.044 __getitem__ awkward/_nplikes/typetracer.py:312
│ │ │ │ │ └─ 0.058 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ │ │ └─ 0.058 apply_step awkward/_broadcasting.py:355
│ │ │ │ │ └─ 0.054 continuation awkward/_broadcasting.py:1008
│ │ │ │ │ └─ 0.054 broadcast_any_indexed awkward/_broadcasting.py:964
│ │ │ │ │ └─ 0.049 <listcomp> awkward/_broadcasting.py:965
│ │ │ │ │ └─ 0.049 project awkward/contents/indexedarray.py:408
│ │ │ │ │ └─ 0.047 _carry awkward/contents/recordarray.py:505
│ │ │ │ │ └─ 0.047 <listcomp> awkward/contents/recordarray.py:526
│ │ │ │ │ └─ 0.041 _carry awkward/contents/numpyarray.py:315
│ │ │ │ └─ 0.087 action awkward/operations/ak_with_field.py:123
│ │ │ │ └─ 0.078 <listcomp> awkward/operations/ak_with_field.py:156
│ │ │ │ └─ 0.075 __getitem__ awkward/contents/content.py:535
│ │ │ │ └─ 0.071 _getitem awkward/contents/content.py:538
│ │ │ │ └─ 0.046 _getitem_field awkward/contents/recordarray.py:467
│ │ │ │ └─ 0.043 content awkward/contents/recordarray.py:405
│ │ │ │ └─ 0.036 content awkward/forms/recordform.py:140
│ │ │ └─ 0.057 action awkward/operations/ak_with_field.py:123
│ │ │ └─ 0.051 <listcomp> awkward/operations/ak_with_field.py:156
│ │ │ └─ 0.051 __getitem__ awkward/contents/content.py:535
│ │ │ └─ 0.049 _getitem awkward/contents/content.py:538
│ │ ├─ 0.363 __getitem__ awkward/highlevel.py:520
│ │ │ ├─ 0.291 __getitem__ awkward/contents/content.py:535
│ │ │ │ └─ 0.291 _getitem awkward/contents/content.py:538
│ │ │ │ ├─ 0.173 _getitem awkward/contents/content.py:538
│ │ │ │ │ └─ 0.152 _getitem awkward/contents/content.py:538
│ │ │ │ │ └─ 0.116 _getitem_next awkward/contents/regulararray.py:451
│ │ │ │ │ └─ 0.101 _getitem_next_jagged awkward/contents/listoffsetarray.py:416
│ │ │ │ │ └─ 0.083 _getitem_next_jagged awkward/contents/listarray.py:389
│ │ │ │ │ └─ 0.043 _getitem_next_jagged awkward/contents/listoffsetarray.py:416
│ │ │ │ └─ 0.056 _getitem_next awkward/contents/regulararray.py:451
│ │ │ └─ 0.039 wrap_layout awkward/_layout.py:19
│ │ ├─ 0.147 __call__ dask_awkward/lib/core.py:1887
│ │ │ └─ 0.147 delta_r coffea/nanoevents/methods/vector.py:602
│ │ │ └─ 0.085 eta coffea/nanoevents/methods/vector.py:512
│ │ │ └─ 0.063 r coffea/nanoevents/methods/vector.py:85
│ │ │ └─ 0.057 r2 coffea/nanoevents/methods/vector.py:111
│ │ │ └─ 0.050 func numpy/lib/mixins.py:18
│ │ │ └─ 0.050 __array_ufunc__ awkward/highlevel.py:1291
│ │ │ └─ 0.046 array_ufunc awkward/_connect/numpy.py:213
│ │ │ └─ 0.041 broadcast_and_apply awkward/_broadcasting.py:1063
│ │ │ └─ 0.041 apply_step awkward/_broadcasting.py:355
│ │ │ └─ 0.041 continuation awkward/_broadcasting.py:1008
│ │ │ └─ 0.041 broadcast_any_list awkward/_broadcasting.py:485
│ │ │ └─ 0.038 apply_step awkward/_broadcasting.py:355
│ │ │ └─ 0.036 continuation awkward/_broadcasting.py:1008
│ │ │ └─ 0.036 broadcast_any_list awkward/_broadcasting.py:485
│ │ └─ 0.059 __call__ dask_awkward/lib/structure.py:806
│ │ └─ 0.059 where awkward/operations/ak_where.py:14
│ │ └─ 0.055 _impl3 awkward/operations/ak_where.py:97
│ │ └─ 0.050 broadcast_and_apply awkward/_broadcasting.py:1063
│ │ └─ 0.048 apply_step awkward/_broadcasting.py:355
│ │ └─ 0.044 continuation awkward/_broadcasting.py:1008
│ │ └─ 0.043 broadcast_any_list awkward/_broadcasting.py:485
│ │ └─ 0.041 apply_step awkward/_broadcasting.py:355
│ │ └─ 0.036 continuation awkward/_broadcasting.py:1008
│ ├─ 0.255 keys dask/highlevelgraph.py:530
│ │ └─ 0.255 to_dict dask/highlevelgraph.py:522
│ │ └─ 0.255 ensure_dict dask/utils.py:1236
│ │ └─ 0.238 __iter__ _collections_abc.py:719
│ │ └─ 0.235 __iter__ dask/blockwise.py:493
│ │ └─ 0.233 _dict dask/blockwise.py:452
│ │ ├─ 0.067 fuse dask/optimization.py:450
│ │ │ └─ 0.036 get dask/config.py:520
│ │ ├─ 0.060 dims dask/blockwise.py:440
│ │ │ └─ 0.058 _make_dims dask/blockwise.py:1480
│ │ │ └─ 0.057 broadcast_dimensions dask/blockwise.py:1420
│ │ ├─ 0.046 make_blockwise_graph dask/blockwise.py:759
│ │ └─ 0.038 __init__ dask/optimization.py:965
│ └─ 0.057 order dask/order.py:84
└─ 0.040 mock dask_awkward/layers/layers.py:94
coffea@main + awkward >= 2.2.2 pyinstrument for the same:
Recorded: 16:03:20  Samples: 4944
Duration: 5.126     CPU time: 5.142
pyinstrument v4.3.0
Program: yimu_test5.py
5.126 <module> yimu_test5.py:1
└─ 5.126 necessary_columns dask_awkward/lib/inspect.py:15
└─ 5.126 _necessary_columns dask_awkward/lib/optimize.py:402
└─ 5.122 _get_column_reports dask_awkward/lib/optimize.py:343
└─ 5.034 get_sync dask/local.py:551
└─ 5.032 get_async dask/local.py:350
├─ 4.467 fire_tasks dask/local.py:452
│ └─ 4.395 submit dask/local.py:539
│ └─ 4.354 batch_execute_tasks dask/local.py:234
│ └─ 4.351 <listcomp> dask/local.py:238
│ └─ 4.349 execute_task dask/local.py:214
│ └─ 4.343 _execute_task dask/core.py:82
│ └─ 4.312 __call__ dask/optimization.py:987
│ └─ 4.306 get dask/core.py:128
│ └─ 4.264 _execute_task dask/core.py:82
│ ├─ 1.262 __array_ufunc__ awkward/highlevel.py:1291
│ │ └─ 1.207 array_ufunc awkward/_connect/numpy.py:213
│ │ ├─ 0.820 broadcast_and_apply awkward/_broadcasting.py:1041
│ │ │ └─ 0.799 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.770 continuation awkward/_broadcasting.py:986
│ │ │ └─ 0.764 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ └─ 0.724 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.680 continuation awkward/_broadcasting.py:986
│ │ │ ├─ 0.568 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ │ ├─ 0.376 apply_step awkward/_broadcasting.py:362
│ │ │ │ │ └─ 0.323 continuation awkward/_broadcasting.py:986
│ │ │ │ │ ├─ 0.199 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ │ │ │ ├─ 0.087 apply_step awkward/_broadcasting.py:362
│ │ │ │ │ │ │ └─ 0.083 action awkward/_connect/numpy.py:222
│ │ │ │ │ │ │ └─ 0.060 find_ufunc awkward/_behavior.py:84
│ │ │ │ │ │ └─ 0.054 _broadcast_tooffsets64 awkward/contents/listoffsetarray.py:360
│ │ │ │ │ ├─ 0.065 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ │ │ └─ 0.058 broadcast_any_option awkward/_broadcasting.py:754
│ │ │ │ └─ 0.137 _broadcast_tooffsets64 awkward/contents/listoffsetarray.py:360
│ │ │ └─ 0.106 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ └─ 0.084 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.061 continuation awkward/_broadcasting.py:986
│ │ │ └─ 0.061 broadcast_any_list awkward/_broadcasting.py:488
│ │ ├─ 0.256 recursively_apply awkward/_do.py:20
│ │ │ ├─ 0.178 _recursively_apply awkward/contents/listoffsetarray.py:2053
│ │ │ │ └─ 0.174 continuation awkward/contents/listoffsetarray.py:2068
│ │ │ │ └─ 0.119 _recursively_apply awkward/contents/numpyarray.py:1235
│ │ │ │ └─ 0.118 unary_action awkward/_connect/numpy.py:307
│ │ │ │ └─ 0.118 action awkward/_connect/numpy.py:222
│ │ │ │ └─ 0.083 find_ufunc awkward/_behavior.py:84
│ │ │ └─ 0.059 _recursively_apply awkward/contents/indexedarray.py:1041
│ │ │ └─ 0.058 continuation awkward/contents/indexedarray.py:1066
│ │ └─ 0.073 _array_ufunc_custom_cast awkward/_connect/numpy.py:141
│ ├─ 0.783 __call__ dask_awkward/lib/core.py:1994
│ │ ├─ 0.572 pt coffea/nanoevents/methods/vector.py:121
│ │ │ └─ 0.572 r coffea/nanoevents/methods/vector.py:85
│ │ │ └─ 0.542 r2 coffea/nanoevents/methods/vector.py:111
│ │ │ └─ 0.498 func numpy/lib/mixins.py:18
│ │ │ └─ 0.497 __array_ufunc__ awkward/highlevel.py:1291
│ │ │ └─ 0.487 array_ufunc awkward/_connect/numpy.py:213
│ │ │ └─ 0.460 broadcast_and_apply awkward/_broadcasting.py:1041
│ │ │ └─ 0.454 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.439 continuation awkward/_broadcasting.py:986
│ │ │ └─ 0.439 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ └─ 0.418 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.412 continuation awkward/_broadcasting.py:986
│ │ │ ├─ 0.255 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ │ └─ 0.186 apply_step awkward/_broadcasting.py:362
│ │ │ │ └─ 0.178 continuation awkward/_broadcasting.py:986
│ │ │ │ ├─ 0.113 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ │ │ └─ 0.071 apply_step awkward/_broadcasting.py:362
│ │ │ │ │ └─ 0.066 continuation awkward/_broadcasting.py:986
│ │ │ │ │ └─ 0.066 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ │ │ └─ 0.054 apply_step awkward/_broadcasting.py:362
│ │ │ │ │ └─ 0.053 continuation awkward/_broadcasting.py:986
│ │ │ │ │ └─ 0.052 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ │ └─ 0.064 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ └─ 0.154 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ └─ 0.120 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.114 continuation awkward/_broadcasting.py:986
│ │ │ └─ 0.114 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ └─ 0.087 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.084 continuation awkward/_broadcasting.py:986
│ │ │ └─ 0.084 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ └─ 0.056 apply_step awkward/_broadcasting.py:362
│ │ └─ 0.191 eta coffea/nanoevents/methods/vector.py:512
│ │ └─ 0.139 r coffea/nanoevents/methods/vector.py:85
│ │ └─ 0.130 r2 coffea/nanoevents/methods/vector.py:111
│ │ └─ 0.120 func numpy/lib/mixins.py:18
│ │ └─ 0.120 __array_ufunc__ awkward/highlevel.py:1291
│ │ └─ 0.117 array_ufunc awkward/_connect/numpy.py:213
│ │ └─ 0.106 broadcast_and_apply awkward/_broadcasting.py:1041
│ │ └─ 0.103 apply_step awkward/_broadcasting.py:362
│ │ └─ 0.100 continuation awkward/_broadcasting.py:986
│ │ └─ 0.100 broadcast_any_list awkward/_broadcasting.py:488
│ │ └─ 0.096 apply_step awkward/_broadcasting.py:362
│ │ └─ 0.093 continuation awkward/_broadcasting.py:986
│ ├─ 0.748 __call__ dask_awkward/lib/structure.py:860
│ │ └─ 0.748 with_field awkward/operations/ak_with_field.py:19
│ │ └─ 0.726 _impl awkward/operations/ak_with_field.py:55
│ │ └─ 0.693 broadcast_and_apply awkward/_broadcasting.py:1041
│ │ └─ 0.640 apply_step awkward/_broadcasting.py:362
│ │ └─ 0.638 continuation awkward/_broadcasting.py:986
│ │ └─ 0.634 broadcast_any_list awkward/_broadcasting.py:488
│ │ ├─ 0.573 apply_step awkward/_broadcasting.py:362
│ │ │ ├─ 0.518 continuation awkward/_broadcasting.py:986
│ │ │ │ └─ 0.510 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ │ ├─ 0.251 apply_step awkward/_broadcasting.py:362
│ │ │ │ │ ├─ 0.169 continuation awkward/_broadcasting.py:986
│ │ │ │ │ │ ├─ 0.102 broadcast_any_indexed awkward/_broadcasting.py:936
│ │ │ │ │ │ │ └─ 0.080 <listcomp> awkward/_broadcasting.py:940
│ │ │ │ │ │ │ └─ 0.079 _push_inside_record_or_project awkward/contents/indexedarray.py:1140
│ │ │ │ │ │ │ └─ 0.075 <listcomp> awkward/contents/indexedarray.py:1143
│ │ │ │ │ │ │ └─ 0.074 simplified awkward/contents/indexedarray.py:152
│ │ │ │ │ │ └─ 0.067 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ │ │ │ └─ 0.054 apply_step awkward/_broadcasting.py:362
│ │ │ │ │ └─ 0.081 action awkward/operations/ak_with_field.py:123
│ │ │ │ │ └─ 0.072 <listcomp> awkward/operations/ak_with_field.py:156
│ │ │ │ │ └─ 0.070 __getitem__ awkward/contents/content.py:509
│ │ │ │ │ └─ 0.065 _getitem awkward/contents/content.py:512
│ │ │ │ └─ 0.224 _broadcast_tooffsets64 awkward/contents/listoffsetarray.py:360
│ │ │ │ └─ 0.175 __getitem__ awkward/contents/content.py:509
│ │ │ │ └─ 0.175 _getitem awkward/contents/content.py:512
│ │ │ │ └─ 0.147 _getitem_range awkward/contents/recordarray.py:444
│ │ │ │ └─ 0.121 <listcomp> awkward/contents/recordarray.py:462
│ │ │ │ └─ 0.079 _getitem_range awkward/contents/numpyarray.py:305
│ │ │ │ └─ 0.061 __getitem__ awkward/_nplikes/typetracer.py:304
│ │ │ └─ 0.052 action awkward/operations/ak_with_field.py:123
│ │ └─ 0.052 __getitem__ awkward/contents/content.py:509
│ ├─ 0.578 apply dask/utils.py:41
│ │ ├─ 0.266 sum awkward/operations/ak_sum.py:13
│ │ │ └─ 0.241 _impl awkward/operations/ak_sum.py:267
│ │ │ └─ 0.233 reduce awkward/_do.py:262
│ │ │ └─ 0.185 _reduce_next awkward/contents/listoffsetarray.py:1444
│ │ ├─ 0.103 mean awkward/operations/ak_mean.py:13
│ │ │ └─ 0.100 _impl awkward/operations/ak_mean.py:152
│ │ └─ 0.083 count awkward/operations/ak_count.py:12
│ │ └─ 0.080 _impl awkward/operations/ak_count.py:110
│ │ └─ 0.077 reduce awkward/_do.py:262
│ ├─ 0.413 __getitem__ awkward/highlevel.py:520
│ │ └─ 0.338 __getitem__ awkward/contents/content.py:509
│ │ └─ 0.338 _getitem awkward/contents/content.py:512
│ │ ├─ 0.209 _getitem awkward/contents/content.py:512
│ │ │ └─ 0.189 _getitem awkward/contents/content.py:512
│ │ │ └─ 0.140 _getitem_next awkward/contents/regulararray.py:467
│ │ │ └─ 0.122 _getitem_next_jagged awkward/contents/listoffsetarray.py:403
│ │ │ └─ 0.110 _getitem_next_jagged awkward/contents/listarray.py:443
│ │ └─ 0.056 _getitem_next awkward/contents/regulararray.py:467
│ ├─ 0.257 __call__ dask_awkward/lib/core.py:1986
│ │ └─ 0.257 delta_r coffea/nanoevents/methods/vector.py:602
│ │ ├─ 0.133 eta coffea/nanoevents/methods/vector.py:512
│ │ │ └─ 0.101 r coffea/nanoevents/methods/vector.py:85
│ │ │ └─ 0.092 r2 coffea/nanoevents/methods/vector.py:111
│ │ │ └─ 0.076 func numpy/lib/mixins.py:18
│ │ │ └─ 0.076 __array_ufunc__ awkward/highlevel.py:1291
│ │ │ └─ 0.073 array_ufunc awkward/_connect/numpy.py:213
│ │ │ └─ 0.065 broadcast_and_apply awkward/_broadcasting.py:1041
│ │ │ └─ 0.064 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.062 continuation awkward/_broadcasting.py:986
│ │ │ └─ 0.062 broadcast_any_list awkward/_broadcasting.py:488
│ │ │ └─ 0.057 apply_step awkward/_broadcasting.py:362
│ │ │ └─ 0.056 continuation awkward/_broadcasting.py:986
│ │ └─ 0.064 delta_phi coffea/nanoevents/methods/vector.py:199
│ └─ 0.081 __call__ dask_awkward/lib/structure.py:806
│ └─ 0.081 where awkward/operations/ak_where.py:14
│ └─ 0.078 _impl3 awkward/operations/ak_where.py:97
│ └─ 0.075 broadcast_and_apply awkward/_broadcasting.py:1041
│ └─ 0.072 apply_step awkward/_broadcasting.py:362
│ └─ 0.072 continuation awkward/_broadcasting.py:986
│ └─ 0.072 broadcast_any_list awkward/_broadcasting.py:488
│ └─ 0.068 apply_step awkward/_broadcasting.py:362
│ └─ 0.061 continuation awkward/_broadcasting.py:986
├─ 0.401 keys dask/highlevelgraph.py:530
│ └─ 0.401 to_dict dask/highlevelgraph.py:522
│ └─ 0.401 ensure_dict dask/utils.py:1236
│ └─ 0.388 __iter__ _collections_abc.py:719
│ └─ 0.386 __iter__ dask/blockwise.py:493
│ └─ 0.385 _dict dask/blockwise.py:452
│ ├─ 0.188 make_blockwise_graph dask/blockwise.py:759
│ │ └─ 0.173 _get_coord_mapping dask/blockwise.py:665
│ │ └─ 0.169 [self]
│ ├─ 0.073 fuse dask/optimization.py:450
│ └─ 0.058 dims dask/blockwise.py:440
│ └─ 0.055 _make_dims dask/blockwise.py:1480
│ └─ 0.052 broadcast_dimensions dask/blockwise.py:1420
└─ 0.062 order dask/order.py:84
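To make the two traces easier to compare, the sketch below tabulates the slowdown of the main branches. The timings are copied from the two pyinstrument outputs above (awkward 2.2.1 first, awkward >= 2.2.2 second); the short labels are my own shorthand for the corresponding frames.

```python
# (seconds with awkward 2.2.1, seconds with awkward >= 2.2.2)
timings = {
    "total": (3.519, 5.126),
    "__array_ufunc__": (0.884, 1.262),
    "pt/eta (dask_awkward core.py __call__)": (0.433, 0.783),
    "ak.with_field": (0.410, 0.748),
    "reducers (dask apply)": (0.436, 0.578),
    "__getitem__": (0.363, 0.413),
    "delta_r": (0.147, 0.257),
    "ak.where": (0.059, 0.081),
    "HighLevelGraph.keys": (0.255, 0.401),
}

# Sort branches by relative slowdown, largest first.
rows = sorted(
    ((name, old, new, new / old) for name, (old, new) in timings.items()),
    key=lambda row: row[3],
    reverse=True,
)

for name, old, new, ratio in rows:
    print(f"{name:<40} {old:6.3f} -> {new:6.3f}  x{ratio:.2f}")
```

Sorting by ratio rather than by absolute time separates the branches that regressed most from those that are merely large.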
The awkward >= 2.2.2 run also emits warnings like the following:
2023-06-16 16:02:27,243 - distributed.utils_perf - WARNING - full garbage collections took 12% CPU time recently (threshold: 10%)
2023-06-16 16:02:28,004 - distributed.utils_perf - WARNING - full garbage collections took 12% CPU time recently (threshold: 10%)
2023-06-16 16:02:28,544 - distributed.utils_perf - WARNING - full garbage collections took 13% CPU time recently (threshold: 10%)
2023-06-16 16:02:29,077 - distributed.utils_perf - WARNING - full garbage collections took 13% CPU time recently (threshold: 10%)
2023-06-16 16:02:29,592 - distributed.utils_perf - WARNING - full garbage collections took 14% CPU time recently (threshold: 10%)
2023-06-16 16:02:30,105 - distributed.utils_perf - WARNING - full garbage collections took 15% CPU time recently (threshold: 10%)
2023-06-16 16:02:30,635 - distributed.utils_perf - WARNING - full garbage collections took 16% CPU time recently (threshold: 10%)
2023-06-16 16:02:31,183 - distributed.utils_perf - WARNING - full garbage collections took 16% CPU time recently (threshold: 10%)
And again, that's OSX, so perhaps take it with a grain of salt, but it was a very noticeable slowdown compared to 2.2.1.
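For a reproducer that does not need a distributed cluster, the share of time spent in full collections can be estimated with the standard library alone. The sketch below uses `gc.callbacks` to time generation-2 collections; `FullGCTimer` and `churn` are hypothetical names, and this is not distributed's implementation of the warning above, only a similar measurement.

```python
import gc
import time


class FullGCTimer:
    """Context manager that accumulates time spent in generation-2 collections."""

    def __init__(self):
        self.gc_seconds = 0.0
        self.collections = 0
        self.elapsed = 0.0
        self._gc_start = None
        self._wall_start = None

    def _on_gc(self, phase, info):
        # Only count full collections, like the warning in the log above.
        if info.get("generation") != 2:
            return
        now = time.perf_counter()
        if phase == "start":
            self._gc_start = now
        elif phase == "stop" and self._gc_start is not None:
            self.gc_seconds += now - self._gc_start
            self.collections += 1
            self._gc_start = None

    def __enter__(self):
        self._wall_start = time.perf_counter()
        gc.callbacks.append(self._on_gc)
        return self

    def __exit__(self, *exc_info):
        gc.callbacks.remove(self._on_gc)
        self.elapsed = time.perf_counter() - self._wall_start
        return False

    @property
    def fraction(self):
        return self.gc_seconds / self.elapsed if self.elapsed > 0 else 0.0


def churn(n):
    """Stand-in workload: create many short-lived reference cycles."""
    for _ in range(n):
        a = {}
        b = {"a": a}
        a["b"] = b


with FullGCTimer() as timer:
    churn(20_000)
    gc.collect()  # force at least one full collection

print(f"{timer.collections} full collections, {timer.fraction:.1%} of wall time")
```

Wrapping the `dak.necessary_columns` call from the reproducer in place of `churn` would show whether the extra time in awkward >= 2.2.2 is spent in the collector.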
Version of Awkward Array
2.2.0
Description and code to reproduce
N.B.: This is lower priority, but it scales with analysis size/complexity!
When running column optimization in dask-awkward, a typetracer is passed through the whole analysis code. This triggers all relevant function calls for the analysis and produces an accurate accounting of data needs, which is great, but as analyses grow larger (especially with systematics overhead!) the tracing begins to take quite some time for processing no data.
I understand this is only so reducible, but responsiveness is definitely part of user experience and it would be sad if we lost users because of really excellent core functionality taking too long.
Here is a pyinstrument trace showing which function stacks take the most time in a mostly realistic analysis with full systematics:
I am happy to provide this test code if it is needed! (this cannot really be made minimal as it's ~emergent behavior)
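For readers unfamiliar with the mechanism described above, here is a toy sketch of tracing an analysis with a data-less stand-in to find the columns it reads. `Tracer`, `Leaf`, and `toy_necessary_columns` are hypothetical names; Awkward's typetracer and `dak.necessary_columns` are far more complete, which is exactly why tracing a large analysis costs real time.

```python
class Leaf:
    """Placeholder for a column of numbers; arithmetic yields another placeholder."""

    def _binary(self, other):
        return Leaf()

    __add__ = __radd__ = __sub__ = __rsub__ = __mul__ = __rmul__ = _binary


class Tracer:
    """Stand-in for a record array: carries a schema but no data."""

    def __init__(self, schema, touched, path=()):
        self._schema = schema
        self._touched = touched
        self._path = path

    def __getitem__(self, field):
        child = self._schema[field]
        path = self._path + (field,)
        if isinstance(child, dict):
            return Tracer(child, self._touched, path)
        # Reaching a leaf column counts as "this column is needed".
        self._touched.add(".".join(path))
        return Leaf()


def toy_necessary_columns(func, schema):
    """Run `func` on a tracer and report which leaf columns it touched."""
    touched = set()
    func(Tracer(schema, touched))
    return sorted(touched)


def analysis(events):
    muons = events["Muon"]
    pt2 = muons["px"] * muons["px"] + muons["py"] * muons["py"]
    return pt2 + events["MET"]["pt"]


schema = {
    "Muon": {"px": "float32", "py": "float32", "pz": "float32", "charge": "int32"},
    "MET": {"pt": "float32", "phi": "float32"},
}

print(toy_necessary_columns(analysis, schema))
```

Every operation in the analysis still executes on the stand-in, so the tracing cost scales with the amount of analysis code rather than with the amount of data.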