Slow lenses for nested named tuples

cscherrer commented 3 years ago

Here's an interesting benchmark:

x = (a=1,b=(c=2,d=(e=3,f=(g=4,h=(i=5,j=(k=6,l=(m=7,n=(o=8,p=(q=9,r=(s=10,t=(u=11,v=(w=12,x=(y=13,z=14)))))))))))))

using BenchmarkTools
using Setfield

ℓ1  = @lens _.a
ℓ2  = @lens _.b.c
ℓ3  = @lens _.b.d.e
ℓ4  = @lens _.b.d.f.g
ℓ5  = @lens _.b.d.f.h.i
ℓ6  = @lens _.b.d.f.h.j.k
ℓ7  = @lens _.b.d.f.h.j.l.m
ℓ8  = @lens _.b.d.f.h.j.l.n.o
ℓ9  = @lens _.b.d.f.h.j.l.n.p.q
ℓ10 = @lens _.b.d.f.h.j.l.n.p.r.s
ℓ11 = @lens _.b.d.f.h.j.l.n.p.r.t.u
ℓ12 = @lens _.b.d.f.h.j.l.n.p.r.t.v.w
ℓ13 = @lens _.b.d.f.h.j.l.n.p.r.t.v.x.y
ℓ14 = @lens _.b.d.f.h.j.l.n.p.r.t.v.x.z

t = []
for ℓ in [ℓ1,ℓ2,ℓ3,ℓ4,ℓ5,ℓ6,ℓ7,ℓ8,ℓ9,ℓ10,ℓ11,ℓ12,ℓ13,ℓ14]
    [push!(t, @belapsed get($x, $ℓ))]
end

Then compare:

julia> t
14-element Array{Any,1}:
 1.0000000000000001e-11
 1.0000000000000001e-11
 1.0000000000000001e-11
 1.152e-9
 1.152e-9
 1.152e-9
 1.152e-9
 1.152e-9
 1.162e-9
 1.162e-9
 1.162e-9
 1.162e-9
 1.162e-9
 1.162e-9

julia> @belapsed $x.b.d.f.h.j.l.n.p.r.t.v.x.z
2.0000000000000002e-11

My big worry here was that time might be linear with depth. That doesn't seem to be the case, which is great! But Setfield still takes 50x the time of the "dotted path" approach. Do you think it's possible to close this gap?

jw3126 commented 3 years ago

Thanks for reporting. It is interesting indeed. I never used such deep compositions. Here are a couple of thoughts:

Overall I am confident that this overhead could be eliminated. However, I think more interesting and harder than get performance is set performance.
There are a couple of places where an @inline could be added. That would be the first thing to try: https://github.com/jw3126/Setfield.jl/blob/master/src/lens.jl#L162 https://github.com/jw3126/Setfield.jl/blob/master/src/lens.jl#L104 https://github.com/jw3126/Setfield.jl/blob/master/src/lens.jl#L128-L137
Note that while mathematically composition is associative, the type system remembers the composition order:
```
julia> l1 = (@lens(_.a) ∘ @lens(_.b)) ∘ @lens(_.c)
(@lens _.a.b.c)
```

julia> l2 = @lens(.a) ∘ (@lens(.b) ∘ @lens(.c)) (@lens .a.b.c)

julia> l1 === l2 false

julia> typeof(l1) Setfield.ComposedLens{Setfield.ComposedLens{Setfield.PropertyLens{:a},Setfield.PropertyLens{:b}},Setfield.PropertyLens{:c}}

julia> typeof(l2) Setfield.ComposedLens{Setfield.PropertyLens{:a},Setfield.ComposedLens{Setfield.PropertyLens{:b},Setfield.PropertyLens{:c}}}

The semantics of a composed lens is independent of composition order, but performance is not. I expect in a practical use case, you won't build such a huge lens in one step
```julia
@lens _.a ... z

but combine it out of smaller lenses:

(@lens _a....m) ∘(@lens _n... z)

Is this issue inspired by an existing or planned use case? I would be happy to learn about it. Might be worth benchmarking with a more practical composition order.

If you want to tackle this issue I am happy to provide guidance. You should target Accessors.jl instead of Setfield.jl.

cscherrer commented 3 years ago

Thanks for the response. I'll give some more context on the problem...

A sample in Soss.jl is a named tuple. You can have models within models, or some parameters could themselves be structs or named tuples, so in large models I'd expect this much nesting could come up.

That's for a single sample, but we usually want many samples. But that gets inefficient, so I was looking into StructArrays. As it turns out, these already allow nesting, but that's not well-documented, so I started trying to implement it. My work-in-progress PR is here: https://github.com/JuliaArrays/StructArrays.jl/pull/160

In this case set performance is an easier problem, because we'll just be mutating arrays elements.

I'm not too worried about the possibility of matching performance. All the information is in the type, so a generated function should do it. The @code_typed is exactly the same between the two (which... weird, right?), but

julia> @code_lowered get(x,ℓ14)
CodeInfo(
1 ─ %1 = Base.getproperty(l, :outer)
│        inner_obj = Setfield.get(obj, %1)
│   %3 = inner_obj
│   %4 = Base.getproperty(l, :inner)
│   %5 = Setfield.get(%3, %4)
└──      return %5
)

julia> @code_lowered f(x)
CodeInfo(
1 ─ %1  = Base.getproperty(x, :b)
│   %2  = Base.getproperty(%1, :d)
│   %3  = Base.getproperty(%2, :f)
│   %4  = Base.getproperty(%3, :h)
│   %5  = Base.getproperty(%4, :j)
│   %6  = Base.getproperty(%5, :l)
│   %7  = Base.getproperty(%6, :n)
│   %8  = Base.getproperty(%7, :p)
│   %9  = Base.getproperty(%8, :r)
│   %10 = Base.getproperty(%9, :t)
│   %11 = Base.getproperty(%10, :v)
│   %12 = Base.getproperty(%11, :x)
│   %13 = Base.getproperty(%12, :z)
└──       return %13
)

Anyway, I could call getproperty directly, but I think Setfield or Accessors would be more elegant. I mostly wanted to be sure you're aware of this, not sure yet how deeply I'll dig in.

cscherrer commented 3 years ago

@jw3126 I asked about this on Discourse, seems it might be a non-issue: https://discourse.julialang.org/t/code-lowered-vs-code-typed/51346?u=cscherrer

jw3126 commented 3 years ago

Ahh good catch. Yeah 1e-11 thats a fast CPU... If you want you can add benchmarks here so deep compositions will stay fast. I will close the issue, feel free to reopen if you think the issue is real.

jw3126 / Setfield.jl

Slow lenses for nested named tuples #151