Closed Deduction42 closed 4 years ago
You're timing in top-level (global) scope. If you put it into a function, it would compile away.
Well, it appears that it only compiles away when you actually use the .field notation. I can't seem to iterate over these objects' field names (e.g. with getproperty/setproperty!) without incurring heavy performance losses.
using LabelledArrays

function f()
    v = @LArray zeros(3) (:a, :b, :c)
    for ii in 1:100000
        for k in (:a, :b, :c)
            v[k] = v[k] + 1
        end
    end
    return v
end

@time f()
>> 1.851394 seconds (1.20 M allocations: 18.311 MiB, 0.13% gc time)
function f()
    v = @LArray zeros(3) (:a, :b, :c)
    for ii in 1:100000
        for k in (:a, :b, :c)
            setproperty!(v, k, getproperty(v, k) + 1)
        end
    end
    return v
end

@time f()
>> 1.862027 seconds (1.20 M allocations: 18.311 MiB, 0.11% gc time)
function f()
    v = @LArray zeros(3) (:a, :b, :c)
    for ii in 1:100000
        v.a = v.a + 1
        v.b = v.b + 1
        v.c = v.c + 1
    end
    return v
end

@time f()
>> 0.000206 seconds (1 allocation: 112 bytes)
I can't seem to iterate over these objects' field names (e.g. with getproperty/setproperty!) without incurring heavy performance losses.
Yes, that is expected. It can only compile away if the symbol is a compile-time constant. But then, if you're iterating, why use the symbols instead of the numbers?
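The distinction can be seen with plain Base Julia, no LabelledArrays needed; a minimal sketch (the names here are made up for illustration):

```julia
# A NamedTuple stands in for a labelled array: the field names live in the type.
nt = (a = 1.0, b = 2.0, c = 3.0)

# Here :a is a literal, so the access constant-folds into a direct field
# load; the compiler knows the name statically.
get_const(x) = x.a

# Here the symbol is only known at run time, so every call must do a
# dynamic name lookup, and with heterogeneous field types the result
# type cannot be inferred.
get_dyn(x, k::Symbol) = getproperty(x, k)
```

Comparing `@code_warntype get_const(nt)` against `@code_warntype get_dyn(nt, :a)` makes the inference difference visible.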
Because I have a big vector that contains labels to a set of smaller objects. It's basically an unscented Kalman filter on a hierarchical model. I am trying to map the values of the smaller objects into the larger vector using the smaller objects' field names. It's just cleaner than trying to keep track of where all these object fields sit in the vector.
Anyhow, I can probably just write something myself for this. A quick solution is to create a named tuple that maps field names to vector indices, or to build a new object myself that incorporates all this. I don't think it would be too hard. By the way, thanks for getting back to me so quickly; especially since you're THE ChrisRackauckas. I'm a big fan of your work.
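That named-tuple workaround might look like this (the layout and names below are hypothetical, just to illustrate the idea):

```julia
# Hypothetical layout: map each field name to its position in the big vector.
const idx = (a = 1, b = 2, c = 3)

# idx.a is a compile-time constant, so this lowers to a plain integer
# index with no runtime symbol lookup.
function bump_a!(v)
    v[idx.a] += 1
    return v
end

v = zeros(3)
bump_a!(v)
```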
Because I have a big vector that contains labels to a set of smaller objects.
I see. Well, you can read about what's going on here: https://www.stochasticlifestyle.com/zero-cost-abstractions-in-julia-indexing-vectors-by-name-with-labelledarrays/. The function generation needs to know the name at compile time, so if you do it from a runtime list it's doomed not to optimize. But what you can do is make that list of symbols a tuple and use https://github.com/cstjean/Unrolled.jl to build flat code, which would then optimize. Note that you don't want to go overboard with that: if you unroll a 1000-variable loop you get gigantic functions and a ton of compile time, but if it's unrolling something like 20 things, this might be the right way to do it.
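Unrolled.jl's @unroll macro does this flattening automatically; the same effect can be sketched in Base Julia by recursing on a tuple of Val-wrapped names, so each symbol becomes a compile-time constant (the struct and field names below are made up for illustration):

```julia
# A small stand-in for one of the labelled objects.
mutable struct Small
    a::Float64
    b::Float64
    c::Float64
end

# Peel one Val-wrapped name off the tuple per call. Because the symbol k
# lives in the type Val{k}, each getfield/setfield! pair specializes to a
# direct field access, and the recursion flattens at compile time.
bump!(x, ::Tuple{}) = x
function bump!(x, syms::Tuple)
    bump_one!(x, first(syms))
    return bump!(x, Base.tail(syms))
end
bump_one!(x, ::Val{k}) where {k} = setfield!(x, k, getfield(x, k) + 1.0)

s = Small(0.0, 0.0, 0.0)
bump!(s, (Val(:a), Val(:b), Val(:c)))
```

The same caveat applies as with @unroll: this is only worth it for short, fixed name lists.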
Riiiighht! That makes sense, and I think that would actually be really useful. I could see the vector having about 1000 elements, but the unrolling would be over a bunch of smaller objects with about 10 elements each, so it would yield a bunch of slightly larger functions. I'd only need the unrolling when transitioning between the vector (in the optimization codebase) and the large hierarchy (in the prediction codebase).
I tried using this package to make my code cleaner when indexing arrays, but as soon as I started using it, performance dropped considerably. I ran a few benchmarks and found the results disappointing, particularly given the claim: "The LArrays are fully mutable arrays with labels. There is no performance loss by using the labelled instead of indexing."
This was measured after multiple trials (to ensure we're not including JIT compile time). It looks like indexing by label is about 100x slower than indexing by number (though indexing an LArray by number is just as fast as a regular array). Is this a bug, am I using it wrong, or is zero-cost labelled indexing no longer a goal?