oschulz / PropertyFunctions.jl

Julia package for easy access to and efficient broadcasting over properties

Generalizations and extensions #4

Open aplavin opened 9 months ago

aplavin commented 9 months ago

I think the approach taken in this package is a great idea! Let me start some discussion about a somewhat wider picture.

This was a piece I was missing for quite some time. There has already been a way to define "introspectable" property-getter functions that can in principle be optimally applied to columnar arrays – Accessors.PropertyLens(:propname). With them, stuff like map((@optic _.a.b + 1), structarray) could be made performant in a trivial way. However, this is not that useful really, so such map methods weren't even implemented.

PropertyFunctions.jl took it to an actually useful level, making it so that applying more complex functions (like @pf $a + $c^2) to structarrays only accesses relevant data. I recently noticed this package, and took the liberty to generalize/adapt its features – for now they are in AccessorsExtra.
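To make the core mechanics concrete, here is a minimal Base-only sketch (all names hypothetical; not the actual PropertyFunctions or AccessorsExtra API) of a property function that knows which properties it reads, so a columnar table can extract just those columns before applying it:

```julia
# Sketch of an "introspectable" property function (hypothetical names).
struct PropFunc{F}
    props::Tuple{Vararg{Symbol}}  # properties the function reads
    f::F                          # takes the extracted values positionally
end

# A "columnar table" here is just a NamedTuple of vectors.
function pmap(pf::PropFunc, table::NamedTuple)
    cols = map(p -> getproperty(table, p), pf.props)  # touch only the needed columns
    return map(pf.f, cols...)
end

table = (a = [1, 2, 3], b = [10, 20, 30], c = [2, 3, 4])
pf = PropFunc((:a, :c), (a, c) -> a + c^2)
pmap(pf, table)  # == [5, 11, 19]; column :b is never accessed
```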

These features are quite new and not mature yet: eg filter should also be defined, similar to how you do it here; FlexiJoins support would also be nice. But I quite like how it plays out already. Please let me know what you think about these and other potential generalizations, or about relevant design decisions! Potentially, it would be nice to have PropertyFunction itself defined here with all relevant functions/@pf/overloads and with no dependencies, while AccessorsExtra would just handle its construction in the @optic macro in a consistent way.

oschulz commented 9 months ago

Hi @aplavin , thanks for your kind words!

PropertyFunctions actually grew out of data-processing use cases I have that involve tables with columns which are expensive to access, so I wanted to make it easy for users to broadcast simple functions that touch only what's necessary (I actually don't have it so bad, ask @Moelf for data that can have thousands of columns, only a small subset of which may need to be accessed at a time). The fun thing is that it was quite easy to make PropertyFunctions work on both CPU and GPU, at least with StructArrays. The reason for using "$a" instead of something like "_.a" was to make the expression as compact as possible for the user ($a + $c^2 may perhaps be more readable than `_.a + _.c^2`, but the Accessors syntax is of course more flexible in general).

I didn't have any major ambitions beyond the current state originally, but I like your ideas very much. And nested property access is definitely great to have, as well as having more advanced operations (grouping, etc.). I'd be very open to extend/generalize in conjunction with the "Accessor-verse". Beyond adding this kind of functionality to Accessors(-Extra), do you think we could provide some integration? For example, could we make Accessors use/convert PropertyFunctions generated with @pf, e.g. via a package extension here or there?

Moelf commented 9 months ago

I forgot about this package sorry Oliver, but I like the discussion I'm seeing here, specifically two things I resonate with a lot:

the first is important because our columns can be disk-backed, and loading every column is silly when users may only need 10% of them. The second I recently started looking into because there is interest in having a query-like way of doing things as opposed to loop-style, and in the process I realized Query.jl is "lazy", but you're limited to a handful of operations and the ergonomics aren't the best either.

I have a concrete example for people who are interested: https://github.com/Moelf/UnROOT_RDataFrame_MiniBenchmark/blob/8492a186d16504b5a510ad1ef007c68a4f11e004/composite_benchmarks/UnROOT_loop.jl#L13

and the corresponding Query.jl style looks like this:

julia> function query_main(t)
           @from evt in t begin
               @where evt.nMuon == 4
               @let pts = evt.Muon_pt
               @let etas = evt.Muon_eta
               @where all(pts .> 5) && all(abs.(etas) .< 2.4)
               @let phis = evt.Muon_phi
               @let masses = evt.Muon_mass
               @let charges = evt.Muon_charge
               @where sum(charges) == 0
               @let Z_idx = reco_zz_to_4l(pts, etas, phis, masses, charges)
               @where filter_z_dr(Z_idx, etas, phis)
               @let  Z_mass = compute_z_masses_4l(Z_idx, pts, etas, phis, masses)
               @where filter_z_candidates(Z_mass)
               @let h_mass = compute_higgs_mass_4l(Z_idx, pts, etas, phis, masses)

               @select h_mass
               @collect
           end
       end

and the run time is similar. I'm sure we could try to use PF.jl too, but my point is that I'm generally interested in this (design) space of optimal access & optimal execution for basic-ish user-defined data crunching

oschulz commented 9 months ago

I forgot about this package sorry Oliver, but I like the discussion I'm seeing here, specifically two things I resonate with a lot

My fault, Jerry, I forgot to talk to you about it at the JuliaHEP meeting. I always wanted to try out PropertyFunctions with UnROOT, just hadn't gotten around to it yet ... It might just work out of the box, and if not it probably won't require much to make it work.

Moelf commented 9 months ago

it probably does, but it alone doesn't provide a lot for users. Since Julia added the destructuring syntax, the user-written event loops probably look like:

for evt in tree
    (; col1, col2, col3) = evt
    # use col1, col2, col3
end

which is quite compact already.

oschulz commented 9 months ago

the user-written event loops probably look like:

Oh, sure - I thought more in the context of broadcast syntax, like

@pf($col1 + $col2).(tree)

or

filterby(view, @pf $col1 + $col2 < $col10)(tree)
Moelf commented 9 months ago

right, these are nice but don't fit into the workflow -- would users really pick up @pf and drop it later when the analysis needs evolve? And learning @pf, as opposed to just sticking with Base Julia, also adds mental burden for users.

These are the design-space problems I'm struggling with, I guess. I don't want users to have to rewrite using a different Julia package/style when going from "simple inspection" to "full analysis", and I don't see @pf being part of the recipe right now

oschulz commented 9 months ago

Well, I was thinking more about casual use cases ("let's quickly plot that for this tree") than larger-scale analysis workflows.

Moelf commented 9 months ago

for that I can see users opting for the vanilla Julia stuff just because it doesn't require learning a new package:

@pf($col1 + $col2).(tree)
# 
(x -> x.col1 + x.col2).(tree)
filterby(view, @pf $col1 + $col2 < $col10)(tree)
# 
filter(tree) do evt
    evt.col1 + evt.col2 < evt.col10
end

I'm not saying the perfect use case doesn't exist, but it seems to be a small phase space to me that's both:

oschulz commented 9 months ago

Yes, but (x -> ...).(tree) will unfortunately (except in very simple cases where the compiler can optimize it away) load all columns. ;-)

aplavin commented 9 months ago

PropertyFunctions actually grew out of data-processing use cases I have that involve tables with columns which are expensive to access

In my use cases, columns typically aren't that expensive – but when there are a lot of them it adds up. Columns can be e.g. memory-mapped or lazy arrays.

Another significant area where this is useful is very wide tables that are better as type-unstable containers. I made the DictArrays.jl package, but before encountering PropertyFunctions I wasn't sure it could be worked with conveniently and performantly. DictArray is basically a StructArray but with a dictionary instead of a namedtuple, so fundamentally type-unstable – but feasible for hundreds+ of columns. Before PropertyFunctions, I wrote a hacky fast map implementation for DictArrays: a neat hack btw, it traces all property accesses and makes later calls type-stable. It does work, but isn't really composable.
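The tracing trick mentioned above can be sketched in plain Base Julia (hypothetical names; not the actual DictArrays implementation): run the user's function once on a recording proxy to learn which properties it touches:

```julia
# Sketch: record which properties a function accesses (hypothetical helper).
struct Recorder
    accessed::Vector{Symbol}
end
function Base.getproperty(r::Recorder, name::Symbol)
    name === :accessed && return getfield(r, :accessed)
    push!(getfield(r, :accessed), name)
    return 1  # dummy value; assumes the traced function tolerates it
end

f = row -> row.a + row.c
rec = Recorder(Symbol[])
f(rec)          # run once on the proxy
rec.accessed    # == [:a, :c] -- only these columns need to be extracted
```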

Now, utilizing the approach pioneered in this package, stuff like map((@o _.a.b + 1 + _.c), StructArray(a=DictArray(b=...), c=...)) works fast, and for arbitrary nesting of these array types! It is crucial for being able to use DictArrays reliably, at least for me.

ask @Moelf for data that can have thousands of columns

Interesting, do you use type-unstable containers for that? If so, what functions do you use to work with them performantly? I don't remember seeing tables with more than a few hundred columns myself.

This scenario sounds like a great use case for "propertyfunctions", similar to the DictArrays integration.

would users really pick up @pf and drop it later when the analysis needs evolve?

IMO it seems a bit strange to use a heavy macro-based package like Query.jl, but be hesitant to adopt a small self-contained @pf or @o macro. They work with basic data manipulation functions like map and filter that everyone is familiar with (I mean, they can work – not everything is implemented). And the change from pure Base Julia is just

# from:
map(x -> (a=x.a, c=x.b + f(x.c)), X)
# to (Accessors syntax):
map((@o (a=_.a, c=_.b + f(_.c))), X)

I'm biased by being also a heavy DataPipes.jl user :), so for me it's just

# from:
@p X |> map((a=_.a, c=_.b + f(_.c)))
# to – just add @o:
@p X |> map(@o (a=_.a, c=_.b + f(_.c)))

The reason for using "$a" instead of something like "_.a" was to make the expression as compact as possible for the user ($a + $c^2 may perhaps be more readable than _.a + _.c^2, but the Accessors syntax is of course more flexible in general).

Yeah, by far the main reason for the _ syntax in this case is generality and that it already exists.

For example, could we make Accessors use/convert PropertyFunctions generated with @pf, e.g. via a package extension here or there?

First I thought about just using the type from PropertyFunctions.jl in AccessorsExtra, but it didn't work out for two reasons:

aplavin commented 9 months ago

Regarding interfacing with other columnar collection types: I like how easy these integrations can be made! This is the only definition I needed to support StructArrays:

extract_properties_recursive(x::StructArray, props_nt::NamedTuple) =
    StructArray(extract_properties_recursive(StructArrays.components(x), props_nt))

same with DictArrays:

extract_properties_recursive(x::DictArray, props_nt::NamedTuple{KS}) where {KS} =
    extract_properties_recursive(x[Cols(KS)], props_nt)
oschulz commented 9 months ago

Really not looking forward to add StructArrays and Tables as dependencies there, wish they weren't here either.

Me too! :-) I didn't find a way to get what I wanted (GPU support and output of StructArrays even when broadcasting over something like DataFrame columns) without a direct dependency, but it certainly makes PropertyFunctions heavier than I'd like.

oschulz commented 9 months ago

@aplavin Maybe we can have a Base.convert between PropertyFunction and the Accessors equivalent? It could live in a package extension in PropertyFunctions (since Accessors is the more central and widely used package). Could PropertyFunctions be convertible to Accessors lenses, would that make sense?

oschulz commented 9 months ago

@aplavin: sortby and filterby – I added these here because we don't seem to have quite such a sort- and filter-API anywhere without heavy dependencies. Would this be something you'd like to have in Accessors? If so, we could move them to a more central place, or even move them into Accessors and have PropertyFunctions depend on Accessors?

aplavin commented 9 months ago

It could live in a package extension in PropertyFunctions (since Accessors is the more central and widely used package).

I can understand the confusion :) but none of this functionality is in Accessors proper. Quite some time ago, I created the AccessorsExtra.jl package with functionality that is more experimental/opinionated, or uses Julia internals. Accessors.jl is widely used, so better to keep it stable in these regards. Some parts can (and should!) be upstreamed at some point...

Conversion can be done in both directions, with the inner (opaque) function getting more nested at each conversion: in this package, pf.sel_prop_func takes a Tuple of extracted values, while in AccessorsExtra pf.func takes an object (NamedTuple) with requested properties. Makes nested properties handling easier.
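The two calling conventions can be bridged mechanically; a hedged Base-only sketch (illustrative names and signatures, not the packages' actual fields – here the tuple-style function is assumed to take the extracted values positionally):

```julia
props = (:a, :c)

tuple_func = (a, c) -> a + c^2    # tuple-of-values style (PropertyFunctions-like)
nt_func    = nt -> nt.a + nt.c^2  # NamedTuple-of-properties style (AccessorsExtra-like)

# Tuple-style -> NamedTuple-style, and back:
to_nt_style(f, props)    = nt -> f((getproperty(nt, p) for p in props)...)
to_tuple_style(f, props) = (vals...) -> f(NamedTuple{props}(vals))

to_nt_style(tuple_func, props)((a = 1, c = 2))  # == 5
to_tuple_style(nt_func, props)(1, 2)            # == 5
```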

Could PropertyFunctions be convertible to Accessors lenses, would that make sense?

This functionality in AccessorsExtra mostly just piggybacks on the existing syntax, PropertyFunction can only be called and it doesn't support other functionality of lenses – so it's not really a lens. For example, PF cannot be used to modify a part of an object:

julia> obj = (a=1, b=2)
(a = 1, b = 2)

# add 1 to log10(obj.a):
julia> modify(x -> x + 1, obj, (@o log10(_.a)))
(a = 10.0, b = 2)

# clearly not possible with propertyfunction:
julia> modify(x -> x + 1, obj, (@o log10(_.a) + _.b))
ERROR

sortby and filterby – I added these here because we don't seem to have quite such a sort- and filter-API anywhere without heavy dependencies.

Hmm, what's wrong with simply defining Base.filter(::PropertyFunction, X)? That's what I'm planning to do, and that's how it works with map. sort is a bit more involved because of the kwarg – one cannot just define Base.sort(X; by::PropertyFunction). But I think it should be possible to support it using some Ordering API in Base... Didn't try though.

oschulz commented 9 months ago

Hmm, what's wrong with simply defining Base.filter(::PropertyFunction, X)

I thought about that, but I wanted to also allow the curried form filterby(pf)(x) to make operations easily chainable. I thought maybe it would be better not to bend the Base.filter API that far, especially since it might then be confusing why filter(f) works with some functions but not others.
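A minimal Base-only sketch of that curried pattern, including the view mode (not PropertyFunctions' actual implementation):

```julia
# filterby(pred) returns a function, so it chains nicely; passing `view`
# switches to a non-copying view of the matching elements.
filterby(pred) = X -> filter(pred, X)
filterby(::typeof(view), pred) = X -> view(X, findall(pred, X))

X = [1, 5, 2, 8, 3]
filterby(x -> x > 2)(X)        # == [5, 8, 3] (a copy)
filterby(view, x -> x > 2)(X)  # == [5, 8, 3], but as a view into X
```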

oschulz commented 9 months ago

I can understand the confusion :) but none of this functionality is in Accessors proper.

Oh, right! :-)

Conversion can be done in both directions

Great - I'd be happy to make AccessorsExtra a weak-dep of PropertyFunctions to host such conversions.

in this package, pf.sel_prop_func takes a Tuple of extracted values, while in AccessorsExtra pf.func takes an object (NamedTuple) with requested properties. Makes nested properties handling easier.

Yes, I like that approach. One reason why I went for the "flat tuples" was that I hope it will introduce less overhead during automatic differentiation (esp. with Zygote), because things are "unpacked" only once, so no intermediate tangents of NamedTuples with only a few "active fields" (looks like I still need a custom rrule or two to make it actually work, though). But I would like to support $a.b.c in the future ...

so it's not really a lens. For example, PF cannot be used to modify a part of an object:

Oh, of course!

aplavin commented 9 months ago

I wanted to also allow the curried form filterby(pf)(x) to make operations easily chainable

filter(f)(X) works with all functions automatically, there's this definition in Base:

function filter(f)
    Fix1(filter, f)
end

One reason why I went for the "flat tuples" was that I hope it will introduce less overhead during automatic differentiation (esp. with Zygote)

Oh this part I basically know nothing about. I sometimes use ForwardDiff and everything works with it, but reverse autodiff is more involved.

I didn't find a way to get what I wanted (GPU support

Any specific test you have in mind to ensure PF works on GPU? I can try adding it as well, just not familiar with this area.

and output of StructArrays even when broadcasting over something like DataFrame columns

This is clearly not possible without a StructArrays dep. But is such behavior needed/expected? The alternative is to support input collection types in an opt-in manner (falling back to full materialization by default), and return the most natural output type for a given input: for StructArrays and DictArrays it is StructArray, for those HEP types maybe the same type with new columns (?), for DataFrames – a DataFrame. Interesting that you mentioned DataFrames; my feeling (as a non-user) has always been that they are very distinct from the more regular Julia collections/tables, with their own syntax for most operations – they don't even support map.

oschulz commented 9 months ago

filter(f)(X) works with all functions automatically, there's this definition in Base:

Yes, unfortunately not on older Julia versions though, for example not on v1.6-LTS, and Compat doesn't backport it either. :-( Not sure when it was added, might have been after I designed PropertyFunctions. Oh and filterby also supports "view-mode", which filter doesn't.

oschulz commented 9 months ago

Any specific test you have in mind to ensure PF works on GPU?

It "just works" as PropertyFunction is written, because broadcasting a function over several flat GPU arrays just works. And StructArrays does a little BroadcastStyle thing to be GPU-broadcast compatible.
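The "just works" part comes down to broadcasting an n-ary function over several flat column arrays in one fused pass; with GPU arrays the same expression compiles to a single kernel. A toy CPU sketch:

```julia
# Each column is a flat array; one fused broadcast touches only these columns.
a = [1.0, 2.0, 3.0]
c = [4.0, 5.0, 6.0]
f(a, c) = a + c^2
broadcast(f, a, c)  # same as f.(a, c); == [17.0, 27.0, 39.0]
```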

and output of StructArrays even when broadcasting over something like DataFrame columns

This is clearly not possible without a StructArrays dep. But is such behavior needed/expected?

I like it because I think the output should be columnar for both @pf((a = ..., b = ...)).(some_table) and @pf((a = ..., b = ...)).(simple_vector). It makes things consistent, and StructArrays is the best solution we currently have, I think. When broadcasting something that produces NamedTuples over a plain vector of numbers, there's nothing in the input that we can dispatch on to decide what output type we should use. So I just go for StructArray in all cases where the output of the function is a NamedTuple. :-)

StructArrays is a bit like StaticArrays I think - the best representation we have for each use case, and therefore hard to avoid sometimes. Fortunately StructArrays has become more lightweight (though not super-lightweight) over time, in contrast to StaticArrays. :-)

aplavin commented 9 months ago

Yes, unfortunately not on older Julia versions though, for example not on v1.6-LTS, and Compat doesn't backport it either.

Sure then, for earlier Julia versions this filterby stuff is necessary for convenient piping out of the box.

Oh and filterby also supports "view-mode", which filter doesn't.

Yeah, I also reach for filter-but-view sometimes. Didn't know where best to put such a function – for now it exists in Skipper.jl, called filterview. Generally useful when working with collections, but doesn't seem that good a fit for Base IMO...

I like it because I think the output should be columnar for both @pf((a = ..., b = ...)).(some_table) and @pf((a = ..., b = ...)).(simple_vector)

Interesting, I took the opposite approach, and also for consistency :) Namely, the goal is for map((@o ...), X) to return a result that is the same (or as close as possible) as map(..., X) for the corresponding anonymous function. Just with better performance on columnar collections. Then, arrays result in arrays, StructArrays in StructArrays, etc. As a bonus – everything can be done in extensions.

oschulz commented 9 months ago

for now it exists in Skipper.jl, called filterview. Generally useful when working with collections, but doesn't seem that good of a fit to Base IMO...

Yes, I don't think we should stretch the API of Base.filter that far.

oschulz commented 9 months ago

Then, arrays result in arrays, StructArrays in StructArrays, etc

Yes, that works too :-)

Moelf commented 9 months ago

Interesting, do you use type-unstable containers for that? If so, what functions do you use to work with them performantly? I don't remember seeing tables with more than a few hundred columns myself.

we use type-stable containers; basically we have our own TypedTables.Table, and we roll our own StructArrays.LazyRow to delay access inside the loop until the user calls getproperty(row, ...). Luckily, building the table for 1500 columns only takes ~1s on 1.10.
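The LazyRow idea described here can be sketched in Base Julia (hypothetical names, not UnROOT's or StructArrays' actual types): a row proxy holding the columns plus an index, fetching a column element only on property access:

```julia
struct LazyishRow{C<:NamedTuple}
    cols::C
    i::Int
end
# Only the accessed column is read, and only at this row's index.
Base.getproperty(r::LazyishRow, name::Symbol) =
    getproperty(getfield(r, :cols), name)[getfield(r, :i)]

cols = (pt = [50.0, 20.0], eta = [0.1, 2.5])
row = LazyishRow(cols, 2)
row.pt  # == 20.0 -- the :eta column is never touched
```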

IMO it seems a bit strange to use a heavy macro-based package like Query.jl, but be hesitant to adopt a small self-contained @pf or @o macro.

users may want to learn some ecosystem similar to Query.jl with the hope they can do everything there, like SQL. While @pf/@o do not add enough value, and are still macros (not composable).

oschulz commented 9 months ago

while @pf/@o do not add enough value, and are still macros (not composable).

I would say it depends very much on the use case. :-)

Moelf commented 9 months ago

yeah, I'm just speaking from the very narrow, LHC-centric analysis use case, given that our LazyTree already makes broadcast as lazy as possible

aplavin commented 9 months ago

users may want to learn some ecosystem similar to Query.jl with the hope they can do everything there, like SQL

Sounds like the "ecosystem" of Julia collections and related data manipulation functions :) They can do everything there, no need to make every function a macro as Query.jl does. And that's not possible anyway, btw: there are lots of other data manipulation packages that provide other functionality in a manner consistent with Julia itself but not with Query.

While @pf/@o do not add enough value, and are still macros (not composable)

"Not enough value" is subjective – sure, if one doesn't care about the new features/optimizations they provide, there's no point in using them. But "not composable"?! That's very surprising to hear. Do you have any specific examples of that?

Moelf commented 9 months ago

"not composable" just because it's a macro – by definition less composable

knock yourself out with an example:

it seems to me either I just write a for-loop or I'm doing a column-at-a-time quick study. For the former I can't use PF.jl; for the latter PF.jl is not so much better than just vanilla Julia functions

Moelf commented 9 months ago

I want to stress that by no means is this a problem or deficiency of PF.jl -- this is just generally a hard problem.

There doesn't exist a design pattern or technology that can organically combine loopy code and columnar kernels anywhere. One can try to start from either side: e.g. Python people start from the columnar side, and when they run into hard-to-express computation, they may write a Numba/Jax kernel, but that kernel may be as complex as your loop body.

In Julia we naturally start from the loopy code, but then one just never bothers with PF.jl

aplavin commented 9 months ago

"not composable" just because it's a macro – by definition less composable

"Less" composable – maybe; "not" composable – clearly not. IMO this approach fares much better in the composability regard than for loops do.

https://github.com/Moelf/ADLBenchmark.jl

Ok I took a random short example from there:

using ...

h = Hist1D(; bins = 0:2:200)
for evt in tree
    count(>(40), evt.Jet_pt) < 2 && continue
    push!(h, evt.MET_pt)
end

and it's a good one to use the PropertyFunction approach on. The code could look like

using DataPipes, PropertyFunctions

h = @p let
    tree
    Iterators.filter(@pf count(>(40), $Jet_pt) ≥ 2)
    Iterators.map(@pf $MET_pt)
    Hist1D(bins=0:2:200)
end

and be efficient with some interop support between PF and your data container package.

This reads much cleaner than a for loop with push!es somewhere within.
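For readers without DataPipes/FHist at hand, the same single-pass selection can be sketched with Base iterators only (toy data, and a plain Vector instead of Hist1D):

```julia
tree = [
    (Jet_pt = [50.0, 45.0], MET_pt = 30.0),  # kept: two jets above 40
    (Jet_pt = [10.0],       MET_pt = 80.0),  # dropped: no jets above 40
]
selected = Iterators.filter(evt -> count(>(40), evt.Jet_pt) >= 2, tree)
met = collect(Iterators.map(evt -> evt.MET_pt, selected))  # == [30.0]
```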

Moelf commented 9 months ago

it's good to have examples, so this is what I mean: users start out looking at one histogram. Eventually it will evolve into looking at O(10)×O(100) histograms, where the first O(10) are different "signal regions" and the second O(100) is how many different systematic uncertainties you deal with: https://github.com/Moelf/WVZAnalysis.jl/blob/master/src/mainlooper.jl#L251-L297

for maximal efficiency, we want to loop over the data only once of course, so you need to have push!() in the middle of things, which breaks the DataPipes way of thinking about things.


Anyway, this is not a problem of PF.jl; it's a hard engineering/design issue that nobody has a good answer for. Here are three languages (frameworks) doing the same semi-realistic analysis, trying to figure out what some good patterns are:

aplavin commented 9 months ago

we want to loop over the data only once of course, so you need to have push!() in the middle of things, which breaks the DataPipes way of thinking about things

Ehm, why does it "break" anything? Iterators.map/filter are specifically designed for single-pass processing, and I'm pretty sure they can be made to work efficiently with PF.

That linked code is way too involved for me to understand and translate, but it should be a pretty safe bet that going from for-loops to higher-level composable functions would make it cleaner :)

Anyway, this is not a problem of PF.jl; it's a hard engineering/design issue that nobody has a good answer for

Sounds like a great case for Julia and packages, solve a design issue nobody had good answers for.

Moelf commented 9 months ago

Ehm, why does it "break" anything? Iterators.map/filter are specifically designed for single-pass processing, and I'm pretty sure they can be made to work efficiently with PF.

my understanding is that DataPipes.jl assumes push-down all the way, in the same way. You're looking at branching push!(), and it's not ergonomic to accumulate 1000 numbers every iteration and push everything at the end. To put it another way, our event loops are not (in the general sense) SIMD-friendly: you accumulate into a different subset of histograms based on the content of each iteration.


Sounds like a great case for Julia and packages,

that's the goal

oschulz commented 9 months ago

In Julia we naturally start from the loopy code, but then one just never bothers with PF.jl

I wouldn't subscribe to that as a given. :-) Sure, in Julia we can and do write fast loops, but that doesn't mean we have to write loops all the time, e.g. when broadcasts and similar will do the trick. ;-)

Moelf commented 9 months ago

again, if you look at the analysis loop you will know why...

Unless someone comes up with a coffea or RDataFrame, we're not going to do analysis in pure columnar style. And given you know you will write a loop in the end, why bother starting with columnar style, knowing you will delete that code later?

idk, LHC analysis sucks I guess? I wouldn't oppose that conclusion

aplavin commented 9 months ago

Higher-level code is so much easier to understand (compared to for loops) in other fields of data analysis that one can be quite confident it is also the case in HEP. The issue may just be that there is no existing implementation that would allow you to write such code performantly.

Setting performance and laziness aside for a moment, can you imagine and show how the ideal (non-for-loop) code would look for some challenging but short example analysis? As in, "oh, that's how I'd like to write my code, if it ran fast".

Moelf commented 9 months ago

Higher-level code is so much easier to understand (compared to for loops) in other fields of data analysis that one can be quite confident it is also the case in HEP.

https://github.com/CoffeaTeam/coffea/discussions/830#discussion-5268678 – idk, as a reader I have no idea what those gymnastics are doing; as a writer, this post is a postdoc struggling to come up with the correct gymnastics for what they want.

You're basically saying Numpy > Julia for loops, but not everything is easy to express in "columnar style", mind you, in HEP your columns are often semantically Vector{Vector{some_struct}}

aplavin commented 9 months ago

You're basically saying Numpy > Julia for loops

Sorry, wat? I'm saying that map/filter/... + their lazy counterparts are better than for loops in most regards. No praising of the rigid Python vectorized/numpy style here at all – I was glad to switch to Julia and leave all of that behind.

Moelf commented 9 months ago

For example, I think these both suck, but at least in the loopy version it's easy for a student to go in and change something about this algorithm:

        events["Electron", "pdgId"] = -11 * events.Electron.charge
        events["Muon", "pdgId"] = -13 * events.Muon.charge
        events["leptons"] = ak.concatenate([events.Electron, events.Muon], axis=1,)
        events = events[ak.num(events.leptons) >= 3]

        pair = ak.argcombinations(events.leptons, 2, fields=["l1", "l2"])
        pair = pair[(events.leptons[pair.l1].pdgId == -events.leptons[pair.l2].pdgId)]
        with np.errstate(invalid="ignore"):
            pair = pair[
                ak.singletons(
                    ak.argmin(
                        abs(
                            (events.leptons[pair.l1] + events.leptons[pair.l2]).mass
                            - 91.2
                        ),
                        axis=1,
                    )
                )
            ]
        events = events[ak.num(pair) > 0]
        pair = pair[ak.num(pair) > 0][:, 0]

        l3 = ak.local_index(events.leptons)
        l3 = l3[(l3 != pair.l1) & (l3 != pair.l2)]
        l3 = l3[ak.argmax(events.leptons[l3].pt, axis=1, keepdims=True)]
        l3 = events.leptons[l3][:, 0]

        mt = np.sqrt(2 * l3.pt * events.MET.pt * (1 - np.cos(events.MET.delta_phi(l3))))

and the loopy Julia version:

        leptons = Vcat(
            LorentzVectorCyl.(evt.Electron_pt, evt.Electron_eta, evt.Electron_phi, evt.Electron_mass),
            LorentzVectorCyl.(evt.Muon_pt, evt.Muon_eta, evt.Muon_phi, evt.Muon_mass)
        )
        nLep = length(leptons)
        nLep < 3 && continue
        lep_pids = Vcat(evt.Electron_charge .* 11, evt.Muon_charge .* 13)

        sel_combo = [1, 1]
        best_diff = Inf

        for i in 1:nLep
            i_pid = lep_pids[i]
            i_lep = leptons[i]
            for j in 1+i:nLep
                i_pid != -lep_pids[j] && continue
                temp_mass = fast_mass(i_lep, leptons[j])
                temp_diff = abs(temp_mass - Z_m)
                if temp_diff < best_diff
                    best_diff = temp_diff
                    sel_combo .= i, j
                end
            end
        end

        isinf(best_diff) && continue # no Z candidate found

        i_lep3 = argmax(eachindex(leptons)) do i
            i ∈ sel_combo && return -Inf32
            return pt(leptons[i])
        end
        l3 = leptons[i_lep3]

        mt = sqrt(2 * l3.pt * evt.MET_pt * (1.0 - cos(evt.MET_phi - l3.phi)))
Moelf commented 9 months ago

Sorry, wat?

saying high-level code is easier to understand is like saying Numpy functions are easier to understand. When we're talking about a function like sum(), of course it's better than writing a for loop. But here you don't have much choice: if you want to use Iterators.map() or whatever, you're enforcing a SIMD paradigm where your entire loop body is a "kernel", and that doesn't actually make anything easier.

aplavin commented 9 months ago

in HEP your columns are often semantically Vector{Vector{some_struct}}

These are quite nice to work with in Julia actually! When your data is "small" (so that all operations can materialize their results), all the functionality to do that already exists. This is a major area where I feel much better off than when using Python. And no for-loops either :)

Probably it's not fully ready for huge/lazy datasets, but that's not a fundamental limitation.
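For instance, a toy Vector{Vector{...}}-style column is already workable with plain map (illustrative data only):

```julia
events = [
    [(pt = 50.0,), (pt = 20.0,)],  # event 1: two leptons
    [(pt = 5.0,)],                 # event 2: one lepton
]
# leading lepton pt per event, no for-loop needed
leading_pt = map(evt -> maximum(l -> l.pt, evt), events)  # == [50.0, 5.0]
```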

Moelf commented 9 months ago

These are quite nice to work with in Julia actually! When your data is "small" (so that all operations can materialize their results), all the functionality to do that already exists. This is a major area where I feel much better off than when using Python. And no for-loops either :)

sorry, maybe you're not understanding the big picture: I'm saying that's 1 column, and your table has 1000 columns like that, and your "analysis" is to loop over all these columns once (because you want to minimize passes over the data), while doing row-level computation

aplavin commented 9 months ago

saying high-level code is easier to understand is like saying Numpy functions are easier to understand

FYI, it becomes very hard to follow when you start with these questionable, and I would say clearly incorrect, assertions.

Moelf commented 9 months ago

Higher-level code is so much easier to understand (compared to for loops) in other fields of data analysis

I'm saying this sentence is not super meaningful: by definition high-level code is easier to understand, but if you actually try to do a "HEP data analysis", you will find you can't express your problem in the nice paradigms we're familiar with.

The "data analysis" is fundamentally "row-level computation", and your "row" is hundreds of vectors of structs

aplavin commented 9 months ago

sorry, maybe you're not understanding the big picture: I'm saying that's 1 column, and your table has 1000 columns like that

Sure, that part I also encounter quite often – and Julia makes it convenient without for loops.

your "analysis" is to loop over all these columns once (because you want to minimize things), while doing row-level computation

Please fully read the message you are responding to. There is probably no readily implemented stuff for lazy/single-pass operations like these, but there are no fundamental limitations. That it already works great for "small"/materialized datasets is a great indication.

Moelf commented 9 months ago

There is probably no readily implemented stuff for lazy/single-pass operations like these, but there are no fundamental limitations.

there is: Query.jl does everything in a single pass whenever possible, but it's quite limited for many reasons, as we already knew. So again, unless we have Coffea or RDataFrame re-implemented in Julia, it's just not practical to do anything other than a for-loop – given you know you're doing a full, serious HEP analysis at the end of the day

Moelf commented 9 months ago

sorry, I'm not being helpful nor contributing to the original topic constructively, we should move on.

oschulz commented 9 months ago

I think this has gone a bit beyond the scope of this issue. @Moelf does indeed have use cases where it's kind of hard to achieve the required "loop inversion" (doing many different things with the same data, with different outputs, for each bit of data) automatically. But on the other hand, in many use cases (I'd say also quite a few in HEP, depending on the level of the analysis), columnar/broadcast-style will do fine. The nice thing in Julia is that we can do whatever fits the use case best, without worrying too much. :-)