scikit-hep / awkward-0.x

Manipulate arrays of complex data structures as easily as Numpy.
BSD 3-Clause "New" or "Revised" License

Can't import awkward.numba #226

Open beojan opened 4 years ago

beojan commented 4 years ago

I get the error:

/usr/lib/python3.8/site-packages/awkward/numba/__init__.py in <module>
      5 import numpy
      6 
----> 7 from awkward.numba.array.base import NumbaMethods
      8 from awkward.numba.array.chunked import ChunkedArrayNumba as ChunkedArray
      9 from awkward.numba.array.chunked import AppendableArrayNumba as AppendableArray

/usr/lib/python3.8/site-packages/awkward/numba/array/base.py in <module>
    161 @numba.extending.lower_builtin("iternext", AwkwardArrayIteratorType)
    162 @numba.targets.imputils.iternext_impl
--> 163 def _AwkwardArray_lower_iternext(context, builder, sig, args, result):
    164     iteratortype, = sig.args
    165     iteratorval, = args

/usr/lib/python3.8/site-packages/numba/targets/imputils.py in iternext_impl(ref_type)
    325     """
    326     if ref_type not in [x for x in RefType]:
--> 327         raise ValueError("ref_type must be an enum member of imputils.RefType")
    328 
    329     def outer(func):

ValueError: ref_type must be an enum member of imputils.RefType
jpivarski commented 4 years ago

Don't import awkward.numba. Its development stalled in February 2019 as it became clear that the only way to implement it was to duplicate all of the main library in lowered Numba extensions, and it was hard enough to maintain consistency in one library.

The Awkward 1.0 project addresses this duplication by putting a common layer under both, so that each algorithm only needs to be written one time. This rewrite is not complete, but it is already more complete than the original awkward.numba was. (Original awkward.numba had only JaggedArray; the new implementation already has ListArray and ListOffsetArray, which replace JaggedArray, as well as RecordArray, which replaces Table, and RegularArray and EmptyArray, which are needed for technical reasons.)
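To make the layout concrete: both awkward0's JaggedArray and the new list types boil down to a flat content array plus per-sublist boundary indices, which can be mimicked in plain NumPy. This is a sketch of the data model discussed in this thread, not awkward's actual classes; the values are made up for illustration.

```python
import numpy as np

# Sketch of the JaggedArray layout referenced throughout this thread:
# a flat "content" array plus per-sublist "starts"/"stops" indices.
content = np.array([3.5, 2.2, 6.4, 8.3, 1.5])
starts = np.array([0, 3, 3])
stops = np.array([3, 3, 5])

def sublist(i):
    # The i-th jagged sublist is just a view into the flat content.
    return content[starts[i]:stops[i]]

print(sublist(0))       # [3.5 2.2 6.4]
print(len(sublist(1)))  # 0 (an empty event)
```

Because each sublist is a slice of one flat buffer, whole-array operations stay vectorized even though the lists have different lengths.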

I didn't know that awkward.numba was non-importable. But as you can see, it wouldn't make any sense to devote resources to fix it, given that the new version is already ahead of it.

chrisburr commented 4 years ago

To add: if you need to use the legacy awkward.numba for an old script, you can do so by downgrading numba to version 0.45.

(I did this yesterday for the conda package: https://github.com/conda-forge/awkward-feedstock/pull/43/commits/0fdf93bc031dc68793a4492182a5216ab91e0ad7)

beojan commented 4 years ago

I assume there's no way to use awkward 1.0 with uproot now (pre uproot 4.0), then?

I wanted to use numba to vectorize a function over numpy arrays to make a sliding cut on some JaggedArrays. That is, numba doesn't need to loop over the JaggedArrays. Do I even need to import anything different for this?

jpivarski commented 4 years ago

Extract the NumPy arrays from your JaggedArrays, write Numba algorithms on those, and then wrap up the results in new JaggedArrays (if needed). That's what Coffea people are doing while waiting for Awkward 1 to be finished.

HDembinski commented 4 years ago

I am also doing this. Could you please add an example on how to do this to the docs?

An example: sum the energy of all particles in an event. The particle energies are passed as a JaggedArray with the event index as the first dimension and the particle index within each event as the second. (This can probably be done with the awkward API; I chose this simple example for illustration.)

import numpy as np
from numba import njit

@njit
def _total_energy_per_event(starts, stops, content):
    result = np.empty(len(starts))
    k = 0
    for a, b in zip(starts, stops):
        sub = content[a:b]
        result[k] = np.sum(sub)
        k += 1
    return result

# wrapper
def total_energy_per_event(energies):
    return _total_energy_per_event(energies.starts, energies.stops, energies.content)
jpivarski commented 4 years ago

Things are far enough along now that this is what I'd recommend:

Step 1: Install Awkward1.

pip install awkward1

Step 2: Convert your Awkward0 array into an Awkward1 array.

>>> import awkward1 as ak
>>> energies = ak.fromawkward0(from_uproot)

See ak.fromawkward0 and ak.toawkward0.

For the sake of argument, let's say that we have an array like this:

>>> energies = ak.Array([[3.5, 2.2, 6.4],
...                      [],
...                      [8.3, 1.5],
...                      [2.8],
...                      [1.9, 4.3, 6.5, 9.0]])
>>> energies
<Array [[3.5, 2.2, 6.4], ... 1.9, 4.3, 6.5, 9]] type='5 * var * float64'>

Step 3: Just pass it into Numba as an ordinary variable. This works for all data types.

I'm not sure whether Numba supports zip or unpacking tuples in the for statement; that might have been what was causing trouble in your example. Numba code should generally be written in a very imperative style; that's what it exists for: imperative Python without the performance penalty. Although I think np.sum works in Numba, keep in mind that all NumPy functions in Numba are reimplementations and not necessarily as good as NumPy's own, both in performance and in coverage; in some cases, arguments aren't handled. For instance, I find that I'm forced to express the dtype of an array differently than I normally would outside of Numba (as a positional argument rather than a keyword, if I remember right).

>>> import numpy as np
>>> import numba as nb
>>> @nb.njit
... def total_energy_per_event(energies):
...     output = np.empty(len(energies))
...     k = 0
...     for event_energies in energies:
...         output[k] = 0.0
...         for particle_energy in event_energies:
...             output[k] += particle_energy
...         k += 1
...     return output
... 
>>> total_energy_per_event(energies)
array([12.1,  0. ,  9.8,  2.8, 21.7])

Tracking that extra index k for the output array is fragile; perhaps it should be part of the loop.

>>> @nb.njit
... def total_energy_per_event(energies):
...     output = np.empty(len(energies))
...     for k in range(len(energies)):
...         output[k] = 0.0
...         for particle_energy in energies[k]:
...             output[k] += particle_energy
...     return output
... 
>>> total_energy_per_event(energies)
array([12.1,  0. ,  9.8,  2.8, 21.7])
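As an aside on the dtype quirk mentioned above: passing the dtype as a positional argument is also legal in plain NumPy, so code written that way runs both inside and outside Numba. A small sketch, not tied to any particular Numba version:

```python
import numpy as np

# np.empty and np.zeros accept the dtype as the second positional
# argument, the form that (per the comment above) Numba sometimes prefers:
a = np.empty(4, np.float64)   # instead of np.empty(4, dtype=np.float64)
b = np.zeros(4, np.int32)
print(a.dtype, b.dtype)       # float64 int32
```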

More generally, we'll sometimes want array output that is more complex than a NumPy array can express, so I'd recommend using Awkward's ak.ArrayBuilder, which fills arrays with append-only semantics.

>>> @nb.njit
... def total_energy_per_event(energies, output):
...     for event_energies in energies:
...         total = 0.0
...         for particle_energy in event_energies:
...             total += particle_energy
...         output.append(total)
... 
>>> output = ak.ArrayBuilder()
>>> total_energy_per_event(energies, output)
>>> output.snapshot()
<Array [12.1, 0, 9.8, 2.8, 21.7] type='5 * float64'>

In this case, it's the same as the flat array, but if you wanted some nested structure:

>>> @nb.njit
... def square_each_energy(energies, output):
...     for event_energies in energies:
...         output.beginlist()     # like print "["
...         for particle_energy in event_energies:
...             output.append(particle_energy**2)
...         output.endlist()       # like print "]"
... 
>>> output = ak.ArrayBuilder()
>>> square_each_energy(energies, output)
>>> output.snapshot()
<Array [[12.2, 4.84, 41], ... 18.5, 42.2, 81]] type='5 * var * float64'>

The data type of the output array is determined by the sequence in which the beginlist/endlist/etc. methods are called, so you can make arbitrary data types. (You can also make a mess by not closing lists with paired begin and end calls: just a heads-up.) See the unfinished documentation for a list of these methods.

The basic constraint in using Awkward1 with Numba is that ak.Array and ak.ArrayBuilder can't be created inside a Numba-compiled function. ak.Array is read-only and ak.ArrayBuilder is write-only. Also, ak.ArrayBuilder works entirely by side-effect.

I'm thinking of wrapping this in a functional interface to avoid pitfalls like forgetting to pair each begin with an end. There are also cases like filter, in which it would be more efficient to create a normal NumPy array of booleans and mask the original array, rather than constructing a new one from scratch with ak.ArrayBuilder. Such a layer would know when to use which low-level trick.
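A sketch of that mask-based filter idea, using the starts/stops/content decomposition in plain NumPy. The names and the threshold are illustrative, not an awkward API:

```python
import numpy as np

# Filter a jagged array by masking the flat content instead of
# rebuilding it element by element with ArrayBuilder.
content = np.array([3.5, 2.2, 6.4, 8.3, 1.5])
starts = np.array([0, 3, 3])
stops = np.array([3, 3, 5])

mask = content > 3.0            # keep energies above a threshold
new_content = content[mask]

# Rebuild starts/stops from how many elements survive in each sublist:
kept = np.array([np.count_nonzero(mask[a:b]) for a, b in zip(starts, stops)])
new_stops = np.cumsum(kept)
new_starts = new_stops - kept

print(new_content)              # [3.5 6.4 8.3]
print(new_starts, new_stops)    # [0 2 2] [2 2 3]
```

The per-element work is a single vectorized comparison plus one boolean-mask copy, which is why it can beat append-one-value-at-a-time construction.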

The ak.ArrayBuilder might then become a low-level interface, a back door to do some things that would be difficult any other way. For example, we can build nested lists without even knowing in advance how deep they'll be:

>>> @nb.njit
... def deepnesting(builder, probability):
...     if np.random.uniform(0, 1) > probability:
...         builder.append(np.random.normal())
...     else:
...         builder.beginlist()
...         for i in range(np.random.poisson(3)):
...             deepnesting(builder, probability**2)
...         builder.endlist()
... 
>>> builder = ak.ArrayBuilder()
>>> deepnesting(builder, 0.9)
>>> builder.snapshot()
<Array [... 1.23, -0.498, 0.272], -0.0519]]]] type='1 * var * var * union[var * ...'>

>>> ak.tolist(builder.snapshot())
[[[[2.052949634260401, 0.9522057655747124], [[[0.2560810133948006], 1.8668954120287653, 0.8933700720920406, 0.31709173110067773], 0.38515995466456676, -1.6259655150460695, [[0.18211022402412927], 0.46592679548320143, 0.39275072293709223], [-0.572569956850481, 1.3991748897028693, -0.15414122174138611, -0.20008742443379549]], [[[-0.7410750761192828, -0.34455689325781347], -0.8446675414135969], [-0.8139112572198548, -0.7250728258598154, -0.42851563653684244, [1.0498296931855706, 1.6969612860075955, -0.18093559189614564, 1.078608791657082]]], [[0.5172670690419124]]], [[-1.9731106633939228, 0.5778640337060391], [-1.2488533773832633, -2.1458066486349434, -0.5439318468515132, [[0.2419441207503176, -2.313974422156488, [-0.6811651539055098, 0.08323572953509818], 1.801261721511669, 0.16653718365329456], -0.6348811801078983, [0.016350096268563003, [-1.2867920376687112, 0.38205295881313484, 1.4093210810506318, -0.2698869943849985, -0.48804922126979045]]], -0.6297773736098737, -2.5333506573111424], [-1.6680144776019314, 0.5862818687707498]], [0.6266171347177766, [[-0.7660737060966999, -0.677432480564727, -1.1527197837522167], -0.5025371508398492, [0.3610998752041169, 0.4811870365139723, -0.8030689233086394, [1.1538103888031122, -1.0955905747145644], -1.3980944016010062, 1.2822990047991039]], 0.939566155023095, [1.3581048298505891, [0.36949478822799947, 1.096666130135532, -0.2769024331557954, -0.7993215902675834], [-0.4103823967097248], [0.6789480075462166, 0.8991579880810466, 0.7900472554969632]], [], [0.6772644918729233, [-0.48385354748861575, -0.39154812719778437], 1.069329510451712, 0.8057750827838897, -0.3440192823735095], [[1.5687828887524105, -1.6086288847970498, [-0.6907842744344904], -0.42627155869364414], 0.33605387861917574, -0.7329513818714791, 0.5040026160756554, -1.2529377572694538, -1.1566264096307166], [[0.6407540268295862], [-0.017540252205401917], -0.9530971110439417], [[0.41643810453893765, -0.682997865214066, 0.7930286671567052], 0.5142103949393788]], 
[[0.6271004836147108, [0.5895664560584991, -0.7563863809912544]], [1.6176958047983054, 0.5226854288884638, 0.24149248202497436], -1.0912185170716135, [-1.1122535648683918], 0.22727974012353094], [-0.4161362684360263, [[0.4234696267033054], 0.7866791657813567, [1.225201951430818, -0.49790730839958713, 0.2715010029532568], -0.051866117232298316]]]]

>>> ak.typeof(builder.snapshot())
1 * var * var * union[var * union[float64, var * union[var * union[float64, var * float64], float64]], float64]

I'm working on documentation now. At least the ak.Array class is fully documented, which can get you started. Also, the doxygen for C++ is more complete.

HDembinski commented 4 years ago

Thank you for this great answer! I will try this, I hope it is faster than my current workaround. It certainly simplifies the code a lot.

To make blocks with the ArrayBuilder, I would offer a context manager, like so:

>>> @nb.njit
... def square_each_energy(energies, output):
...     for event_energies in energies:
...         with output.nested():
...             for particle_energy in event_energies:
...                 output.append(particle_energy**2)
... 
>>> output = ak.ArrayBuilder()
>>> square_each_energy(energies, output)
>>> output.snapshot()
<Array [[12.2, 4.84, 41], ... 18.5, 42.2, 81]] type='5 * var * float64'>

nested is probably not the best possible name.

jpivarski commented 4 years ago

A context manager is a great idea! I think list would be a good name (i.e. remove the begin and end from beginlist, begintuple, and beginrecord).

However, it would only work outside of Numba. Numba doesn't support with statements inside lowered code: https://numba.pydata.org/numba-doc/dev/reference/pysupported.html (It's supported in object mode, but that doesn't help if you want to njit.)

jpivarski commented 4 years ago

Oh, one warning: Awkward1's minimum version of Numba will be 0.49 (see PR scikit-hep/awkward-1.0#189) as soon as Numba 0.49 becomes a non-release candidate on PyPI. The first release candidate for Numba 0.49 was released seconds ago:

https://numba.pydata.org/numba-doc/dev/release-notes.html

So if you start to use this, you'll have to adjust your versions of both Awkward1 and Numba in a couple of weeks, however long it takes for 0.49 to be a non-release candidate. (Numba 0.49 introduces major breaking changes in its extension interface, partly motivated by Awkward's heavy use of it and need for a well-defined API, and I'd rather make a fresh start at 0.49 than support both.)

HDembinski commented 4 years ago

Ah, I didn't know about this numba limitation (one of many...). Too bad :( Context managers are great.

This perhaps becomes a bit off-topic, but I did some benchmarking of my workaround code against awkward1. I find that my workaround is a factor of 6 faster than the version using ArrayBuilder. :(

In the following x is some JaggedArray.

@njit
def _fun(starts, stops, content):
    result = np.empty(1) # let's pretend we don't know size of result in advance
    for i, (a, b) in enumerate(zip(starts, stops)):
        sub = content[a:b]
        if i == len(result): # amortized growth, like std::vector
            m = int(len(result) * 0.5 + 1)
            result = np.append(result, np.empty(m))
        xsum = 0.0
        for xi in sub:
            xsum += xi
        result[i] = xsum
    return result[:i + 1] # i + 1, not i: keep the last filled entry

def fun(x):
    return _fun(x.starts, x.stops, x.content)

This takes 18 microseconds.

Now the version that is using an awkward1 array:

@njit
def _fun(x, builder):
    for sub in x:
        xsum = 0.0
        for xi in sub:
            xsum += xi
        builder.append(xsum)

def fun(x):
    builder = ak.ArrayBuilder()
    _fun(x, builder)
    return builder.snapshot()

This takes 115 microseconds.

jpivarski commented 4 years ago

ArrayBuilder is dynamically typed, so if you know the type of the thing you want to build, you should (always?) be able to do it faster by building up starts and stops than with ArrayBuilder. (There's a nascent ROOT std::vector^N<T> → Array function that avoids ArrayBuilder because the type is partially known.)

I mentioned ArrayBuilder as a general case: it's something that you usually wouldn't be able to do at all in a Numbafied function, and it's faster than doing it in interpreted Python and converting the result to an array.

That's also why I'm thinking about a functional interface to choose between ArrayBuilder and a faster method when one is available (like the filter case I mentioned). Knowing all these details about which approach is faster is not something we want users to have to deal with, at least not all the time.

HDembinski commented 4 years ago

It would be great if there was a faster way offered by the awkward1 API. filter does not work for a general use case where you need to build a jagged array inside the numba-accelerated function.

I have a general use case. I have a tree that holds a variable number of particles per event. I need to loop over these events and find all track pairs per event, which have a small distance of closest approach (these are candidates for a particle decay where the original particle is neutral and invisible to the detector). It is not possible to predict how many pairs per event there are, and the number of pairs per event is variable. So the right data structure for that would be a jaggedarray that I can grow dynamically. Computing something for all unique pairs of tracks is very time-consuming and has to be done inside numba. It would be great to have a fast solution for that (other than my manual workaround).

HDembinski commented 4 years ago

https://numba.pydata.org/numba-doc/latest/extending/high-level.html#implementing-methods

It looks like you could provide an overload for ArrayBuilder.append that picks a non-virtual implementation based on the type that is passed in append.

jpivarski commented 4 years ago

It looks like you could provide an overload for ArrayBuilder.append that picks a non-virtual implementation based on the type that is passed in append.

I already do:

https://github.com/scikit-hep/awkward-1.0/blob/51e3de54f1a565c7ed628d99b70ddbc32ea5903f/src/awkward1/_connect/_numba/builder.py#L250-L296

This logic picks a specialized method during the typing pass.

However, passing a non-virtual method here doesn't matter much because ArrayBuilder is virtual at every level: when you call a method, it might need to refine its type, which means walking down a tree of class instances with virtual functions. If I had known that your application prioritized performance this much, I would not have recommended ArrayBuilder: it's for generality.

Nevertheless, walking over the read-only ak.Array in Numba ought to be fast. It's the append-only ak.ArrayBuilder that is slow, because it's general and doesn't know what type it's going to get. ak.Array already knows what type it is and generates specialized code for that. (Yet another example of read/write asymmetry.)

If you want to write arrays quickly, then we have to consider things on a case-by-case basis. If your application requires a flat array output with length known ahead of time, output a NumPy array. (I think that was the first example code.)

jpivarski commented 4 years ago

I have a general use case. I have a tree that holds a variable number of particles per event. I need to loop over these events and find all track pairs per event, which have a small distance of closest approach (these are candidates for a particle decay where the original particle is neutral and invisible to the detector). It is not possible to predict how many pairs per event there are, and the number of pairs per event is variable. So the right data structure for that would be a jaggedarray that I can grow dynamically. Computing something for all unique pairs of tracks is very time-consuming and has to be done inside numba. It would be great to have a fast solution for that (other than my manual workaround).

This application doesn't sound like it even needs Numba: there's ak.choose for creating combinations of arbitrarily sized subcollections. As you can see, I haven't written the documentation for that yet, but maybe if you show an imperative example, I can suggest the vectorized translation.
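For concreteness, an imperative version of the per-event unique-pairs loop over a starts/stops/content decomposition might look like this (purely illustrative values; this inner double loop is what a vectorized combinations function would replace):

```python
import numpy as np

# All unique (i, j) pairs within each event, over a jagged structure
# decomposed into starts/stops/content.
content = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # one quantity per track
starts = np.array([0, 3, 3])                   # 3 events: 3, 0, 2 tracks
stops = np.array([3, 3, 5])

pairs_per_event = []
for a, b in zip(starts, stops):
    event = content[a:b]
    pairs = [(event[i], event[j])
             for i in range(len(event))
             for j in range(i + 1, len(event))]
    pairs_per_event.append(pairs)

print(pairs_per_event)
# [[(1.0, 2.0), (1.0, 3.0), (2.0, 3.0)], [], [(4.0, 5.0)]]
```

Note that the number of pairs per event (n choose 2) is computable from the counts alone, which is how a vectorized implementation can preallocate its output without dynamic growth.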

jpivarski commented 4 years ago

Or maybe we should move to Gitter while we iterate and just post the solution here.

nsmith- commented 4 years ago

@HDembinski was energies.sum() really so slow even in awkward0 that a numba function was necessary? I've never found it to be too bad. Also, regarding the pairs of tracks, was it memory-prohibitive or otherwise intractable to use the awkward0-style tracks.pairs()? I realize that clearly these are not optimal, but I'm wondering how big of a performance hit you saw with your use case.