Major performance loss by indexing using symbols.

Deduction42 commented 4 years ago

I tried using this package to make my code cleaner when indexing arrays, but as soon as I started using it, performance dropped considerably. I ran a few benchmarks, which I found disappointing, particularly given the claim "The LArrays are fully mutable arrays with labels. There is no performance loss by using the labelled instead of indexing."

v = zeros(3)
@time for ii in 1:100000
    n = v[1]*1
end
>> 0.004284 seconds (200.00 k allocations: 3.052 MiB)

v = @LArray zeros(3) (:a,:b,:c)
@time for ii in 1:100000
    n = v[1]*1
end
>> 0.004083 seconds (200.00 k allocations: 3.052 MiB)

v = @LArray zeros(3) (:a,:b,:c)
@time for ii in 1:100000
    n = v[:a]*1
end
>> 0.344170 seconds (300.00 k allocations: 4.578 MiB)

v = @LArray zeros(3) (:a,:b,:c)
@time for ii in 1:100000
    n = v.a*1
end
>> 0.345015 seconds (300.00 k allocations: 4.578 MiB)

These timings were taken after multiple trials (to ensure we're not including JIT compile time). It looks like indexing by label is about 100x slower than indexing by number (though indexing an LArray by number is just as fast as indexing a regular array). Is this a bug, am I using this wrong, or is zero performance loss from labelled indexing no longer an objective?

ChrisRackauckas commented 4 years ago

You're timing at the top level scope. If you put it into a function it would compile away.
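
As a rough illustration of the point above, the same labelled access can be moved into a function so the compiler sees a concrete argument type and a literal field name. This is only a sketch: `sum_labelled` is a hypothetical helper name, and no timing is claimed here since results depend on the machine and Julia version.

```julia
using LabelledArrays

# Hypothetical helper: the loop lives inside a function, so `v` has a
# concrete type and `.a` is a literal the compiler can resolve.
function sum_labelled(v, n)
    acc = 0.0
    for _ in 1:n
        acc += v.a * 1
    end
    return acc
end

v = @LArray zeros(3) (:a, :b, :c)
v.a = 2.0

sum_labelled(v, 1)               # warm-up call so compilation is not timed
@time sum_labelled(v, 100_000)   # compare against the top-level loop
```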

Deduction42 commented 4 years ago

Well, it appears that it only compiles away when you actually use the .field notation. I can't seem to iterate through these objects over the field names (such as with "getproperty/setproperty") without incurring heavy performance losses.

function f()
    v = @LArray zeros(3) (:a,:b,:c)
    for ii in 1:100000
        for k in (:a, :b, :c)
            v[k] = v[k] + 1
        end
    end
    return v
end
@time f()
>> 1.851394 seconds (1.20 M allocations: 18.311 MiB, 0.13% gc time)

function f()
    v = @LArray zeros(3) (:a,:b,:c)
    for ii in 1:100000
        for k in (:a, :b, :c)
            setproperty!(v, k, getproperty(v, k) + 1)
        end
    end
    return v
end
@time f()
>> 1.862027 seconds (1.20 M allocations: 18.311 MiB, 0.11% gc time)

function f()
    v = @LArray zeros(3) (:a,:b,:c)
    for ii in 1:100000
        v.a = v.a + 1
        v.b = v.b + 1
        v.c = v.c + 1
    end
    return v
end
@time f()
>> 0.000206 seconds (1 allocation: 112 bytes)

ChrisRackauckas commented 4 years ago

> I can't seem to iterate through these objects over the fieldnames (such as with "getproperty/setproperty") without incurring heavy performance losses.

Yes, that is expected. It can only compile away if it's a compile-time constant. But of course, if you're iterating then why use the symbols instead of the numbers?
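
For the case where the loop does not actually need the names, a minimal sketch of iterating by position instead might look like the following. `bump_by_index!` is a hypothetical name; the only assumption is that an LArray accepts ordinary integer indexing for reads and writes, in line with the integer-indexed benchmark earlier in the thread.

```julia
using LabelledArrays

# Hypothetical helper: iterate over positions rather than labels, so no
# symbol has to be resolved at run time.
function bump_by_index!(v, n)
    for _ in 1:n
        for i in eachindex(v)
            v[i] += 1
        end
    end
    return v
end

v = @LArray zeros(3) (:a, :b, :c)
bump_by_index!(v, 100_000)
v.b   # labels still work for the final, literal access
```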

Deduction42 commented 4 years ago

Because I have a big vector that contains labels to a set of smaller objects. It's basically an unscented kalman filter on a hierarchical model. I am trying to map the values of the smaller objects to the larger vector using the smaller object field names. It's just cleaner to do it this way than trying to keep track of where all these object fields sit in the vector.

Deduction42 commented 4 years ago

Anyhow, I can probably just write something myself for this. A quick solution is to create a named tuple that indexes fields to the vector or to build a new object myself that incorporates all this. I don't think it would be too hard. By the way, thanks for getting back to me so quickly; especially since you're THE ChrisRackauckas. I'm a big fan of your work.
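
The named-tuple idea mentioned above could be sketched roughly as follows, using only Base Julia. The `layout` tuple and `bump_all!` are hypothetical names for illustration; the labels map to plain integer positions, so the inner loop iterates over `Int`s rather than symbols.

```julia
# Hypothetical layout: label => position in the flat vector.
layout = (a = 1, b = 2, c = 3)

# Passing the layout as an argument keeps its type known inside the function.
function bump_all!(x, layout, n)
    for _ in 1:n
        for i in values(layout)   # a tuple of Ints, no symbol lookup
            x[i] += 1
        end
    end
    return x
end

x = bump_all!(zeros(3), layout, 100_000)
x[layout.b]   # labelled read through the index map
```

The trade-off is that the mapping lives next to the vector instead of inside it, so the two have to be kept in sync by hand.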

ChrisRackauckas commented 4 years ago

> Because I have a big vector that contains labels to a set of smaller objects.

I see. Well, you can read about what's going on here: https://www.stochasticlifestyle.com/zero-cost-abstractions-in-julia-indexing-vectors-by-name-with-labelledarrays/. You can see that the function generation needs to know the name at compile time, so if you do it from a list, well, it's doomed to not optimize. But what you can do is make that list of symbols a tuple instead and use https://github.com/cstjean/Unrolled.jl to build flat code, which would then optimize. Note that you don't want to go overboard with that: if you unroll a 1000-variable loop you can get gigantic functions and a ton of compile time. But if it's unrolling something like 20 things, this might be the right way to do it.
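
The flat-code idea can be sketched without any extra package. The block below does not use Unrolled.jl; it swaps in a Base `@generated` function to show the same principle. The labels are passed as `Val((:a, :b, :c))` because a plain tuple of symbols has type `Tuple{Symbol,Symbol,Symbol}`, which does not carry the names themselves. `increment_all!` and `Point3` are hypothetical names, and `Point3` is only a stand-in so the sketch runs on its own.

```julia
# Hypothetical helper (not part of LabelledArrays or Unrolled.jl).
# The label tuple is a type parameter, so the generated body contains one
# statement per label with the symbol written as a literal.
@generated function increment_all!(v, ::Val{names}) where {names}
    body = Expr(:block)
    for n in names
        sym = QuoteNode(n)
        push!(body.args, :(setproperty!(v, $sym, getproperty(v, $sym) + 1)))
    end
    push!(body.args, :(return v))
    return body
end

# Stand-in for a labelled container, so no package is required here.
mutable struct Point3
    a::Float64
    b::Float64
    c::Float64
end

function run_flat(n)
    p = Point3(0.0, 0.0, 0.0)
    for _ in 1:n
        increment_all!(p, Val((:a, :b, :c)))
    end
    return p
end

p = run_flat(100_000)
```

Since the earlier snippets in this thread call `getproperty` and `setproperty!` on an LArray directly, the same helper should in principle accept one in place of `Point3`, though that is not tested here. As noted above, each distinct label tuple generates its own method body, so this is best kept to small label sets.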

Deduction42 commented 4 years ago

Riiiighht! That makes sense, and I think that would actually be really useful. I could see the vector having about 1000 elements in it, but the roll-out would be for a bunch of smaller objects with about 10 elements in them, so I guess it would yield a bunch of slightly larger functions. I'd only need the roll-out functionality when transitioning between vector (in the optimization codebase) and the large hierarchy (in the prediction codebase).

SciML / LabelledArrays.jl

Major performance loss by indexing using symbols. #84