scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License
233 stars 74 forks source link

Performance regression (>10x) from 4.1.2 to 4.1.3 #572

Closed tamasgal closed 2 years ago

tamasgal commented 2 years ago

Our users observed a significant performance regression (10x and more) and it turns out to be introduced somewhere between 4.1.2 and 4.1.3 (release on 29th September 2021). The awkward version is the latest 1.8.0.

I went through the release notes in https://github.com/scikit-hep/uproot4/releases/tag/4.1.3 and I first reverted https://github.com/scikit-hep/uproot4/pull/441 which also needed the objectarray1d function from uproot._util (see https://github.com/scikit-hep/uproot4/pull/442/files) but it did not help.

I also went through the other PRs but they seemed unrelated, so I started a git bisect with

Good: 382bb62f47a05f6111d3787e3235f9e56d9e113e (4.1.2) Bad: 03cb08ccb1f47ce0b082117f41a112af1ab403ee (4.1.3)

and identified the https://github.com/scikit-hep/uproot4/commit/007b10014003a74d0e89e0166e21a6218de0d445 from @nsmith- which introduced the performance regression:

░ tamasgal@silentbox-(2):uproot4   (git)-[4.1.3~8|bisect]- py-system
░ 13:30:56 > git bisect good
007b10014003a74d0e89e0166e21a6218de0d445 is the first bad commit
commit 007b10014003a74d0e89e0166e21a6218de0d445
Author: Nicholas Smith <nick.smith@cern.ch>
Date:   Mon Sep 20 14:09:59 2021 -0500

    Implement regular array support for TClonesArray (#442)

    * Implement regular array support for TClonesArray

    * Add a small test

    * Fix issue with incidentally regular TRefArray lists

    * Cannot trust dict order

 src/uproot/_util.py                     | 13 +++++++++
 src/uproot/containers.py                | 45 ++++++++++++++++++++++---------
 src/uproot/interpretation/identify.py   |  4 +--
 src/uproot/interpretation/library.py    | 20 +++++++++++---
 src/uproot/interpretation/numerical.py  | 11 ++++++++
 src/uproot/interpretation/objects.py    | 48 +++++++++++++++++++++++++++------
 tests/test_0442-regular-TClonesArray.py | 40 +++++++++++++++++++++++++++
 7 files changed, 155 insertions(+), 26 deletions(-)
 create mode 100644 tests/test_0442-regular-TClonesArray.py

I investigated our codes and found the exact call which causes the problem: it's indeed the reading of doubly nested arrays.

With uproot 4.1.2, the following code is almost instantaneous whereas with uproot 4.1.3 it takes about 1.5 minutes (for a 1.5GB file):

>>> import uproot
>>> f = uproot.open("doubly_nested_regression.root")
>>> f["E/Evt/trks/trks.rec_stages"].array()
<Array [[[1, 2, 5, 3, 5, 4], ... 1], [1], [1]]] type='158262 * var * var * int32'>

Here is the file http://131.188.161.12:30002/doubly_nested_regression.root

@jpivarski you probably remember that you introduced some special hack to squeeze a lot out of doubly-nested structures, see https://github.com/scikit-hep/awkward-1.0/issues/968 and https://github.com/scikit-hep/awkward-1.0/blob/main/src/awkward/_connect/_uproot.py#L35 so my first thought was that these became ineffective with the changes introduced in https://github.com/scikit-hep/uproot4/commit/007b10014003a74d0e89e0166e21a6218de0d445 but I am just speculating ; )

jpivarski commented 2 years ago

If this is related to the hack for performance in doubly-nested structures, that hack will need to give way to AwkwardForth at some point. Adding an AwkwardForth implementation to all types is @aryan26roy's May‒July 2022 project, so... soon!

I actually hadn't realized that you were still relying on that hack. It doesn't even work for all doubly-nested structures, just a particular combination of vectors and pointers.

Is it the case that 007b10014003a74d0e89e0166e21a6218de0d445 is preventing the code flow from entering the gateway into that hack, which is this block:

https://github.com/scikit-hep/uproot4/blob/be92c96e9c3f73ee08277f20e68b81a489f8fdcb/src/uproot/interpretation/objects.py#L142-L154

?

I've started downloading the file to check.

tamasgal commented 2 years ago

Thanks for the quick answer! Indeed, we are heavily relying on these doubly-nested structures, unfortunately. It's roughly 20% of your MC and reconstruction data ;)

Yes I can confirm that awkward_can_optimize returns True for uproot 4.1.2 but False for uproot 4.1.3.

Edit: messed up False and True ;)

jpivarski commented 2 years ago

The backdoor in Awkward Array relies on the type matching exactly, and 007b10014003a74d0e89e0166e21a6218de0d445 changes the "uproot" parameter by one key-value pair. I think that including the "inner_shape" at that point was a mistake, so if @nsmith- actually needs it, I'll show him how to get it through the NumpyForm instead (and then merging this would require coordination with Coffea, so I hope it's simpler than that).

But this is a good reminder of how much of a difference this makes. It will be great to get this all converted to AwkwardForth, and then I won't have to keep calling it a "hack." (Maybe I could call it a "prototype that we didn't go with, in favor of AwkwardForth.")

tamasgal commented 2 years ago

Many thanks for the quick fix :)