peremato / Geom4hep

4 stars 4 forks source link

Hunting allocation #1

Open Moelf opened 1 year ago

Moelf commented 1 year ago

if I run through --track-allocation=user, the only allocation I see is:


julia> generateXRay(nav2, world2, 1000, 1);

julia> Profile.clear_malloc_data()

julia> generateXRay(nav2, world2, 1000, 1);

        - function getClosestDaughter(nav::NAV, volume::Volume{T}, point::Point3{T}, dir::Vector3{T}, step_limit::T ) where {T,NAV}
        -     step = step_limit
        -     candidate = 0
        -     #---Linear loop over the daughters-------------------------------------------------
        0     for pvol in intersectedDaughters(nav, volume, point, dir)
        -         #---Assuming that it is not yet inside the daughter (otherwise it returns -1.)
 59378816         dist = distanceToIn(pvol, point, dir)
        0         dist < 0. && return (zero(T), pvol.idx ) 
        0         if dist > 0. && dist != Inf && dist < step
        -             step = dist
        0             candidate = pvol.idx
        -         end
        0     end
        0     return step, candidate
        - end
        - 
        - #---Update the NavigatorState and get the step (distance)
        - function computeStep!(state::NavigatorState{T,NAV}, gpoint::Point3{T}, gdir::Vector3{T}, step_limit::T) where {T,NAV}
        0     lpoint, ldir = transform(state.tolocal, gpoint, gdir) 
        0     volume = currentVolume(state)
        0     step, idx = getClosestDaughter(state.navigator, volume, lpoint, ldir, step_limit)
        -     #---If didn't hit any daughter return distance to out
        0     if idx == 0
  3914320         step = distanceToOut(volume.shape, lpoint, ldir)
        0         if step > -kTolerance(T)
        0             popOut!(state)
        0         elseif step < 0
        0             shape = volume.shape
        0             volstack = state.volstack
        0             error(lazy"Negative distanceToOut. \nstep = $step \nshape = $shape \nvolstack = $volstack \npoint = $lpoint \ndir =   $ldir" )
        -         end
        -     else
        -     #---We hit a daughter, push it into the stack
        0         step += kPushTolerance(T)  # to ensure that do not stay in the surface of the daughter
        0         pvol = volume.daughters[idx]
        0         pushIn!(state, pvol)
        -     end
        0     return step
        - end
Moelf commented 1 year ago

one general comment I have is don't over use typing system for invariance enforcement, for example:

struct VolumeP{T<:AbstractFloat,PV}
    label::String
    shape::Shape{T}                     # Reference to the actual shape
    material::Material{T}               # Reference to material
    daughters::Vector{PV}
end

#---PlacedVolume-------------------------------------------------------------------
struct PlacedVolume{T<:AbstractFloat}
    idx::Int64
    transformation::Transformation3D{T}
    volume::VolumeP{T,PlacedVolume{T}}
end

I wouldn't put volume::VolumeP{T,PlacedVolume{T}}, but rather just leave it as another floating parameter in PlacedVolume. I guess my point being I don't want to program in typing lattice world and I don't trust the performance of inference, I'd rather give compiler more freedom (i.e. give PlacedVolumn another parametric type), not sure if it helps here, we will see

Moelf commented 1 year ago

Curiously, Cthulhu.jl only has static dispatch

%704  = invoke computeStep!(::NavigatorState{…},::Point{…},::SArray{…},::Float64)::Float64
----
^  %17  = invoke distanceToOut(::Box{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %22  = invoke distanceToOut(::Cone{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %40  = invoke distanceToOut(::Plane{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %41  = invoke distanceToOut(::Plane{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %59  = invoke distanceToOut(::Tube{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %69  = invoke distanceToOut(::Trap{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %74  = invoke distanceToOut(::Trd{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %79  = invoke distanceToOut(::Tube{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
 • %84  = invoke distanceToOut(::NoShape{…},::Point{…},::SArray{…})::Float64
v  %120  = invoke throw_inexacterror(::Symbol,::Type{UInt64},::Int64)::Union{}
--- #inside computeStep!
   %1  = invoke intersectedDaughters(::BVHNavigator{…},::VolumeP{…},::Point{…},::SArray{…})::…
   %157  = invoke distanceToIn(::Box{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %162  = invoke distanceToIn(::Cone{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %167  = invoke distanceToIn(::CutTube{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %172  = invoke distanceToIn(::Trap{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %177  = invoke distanceToIn(::Trd{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
 • %182  = invoke distanceToIn(::Tube{Float64},::Point3{Float64},::SVector{3, Float64})::Float64
   %187  = invoke distanceToIn(::NoShape{…},::Point{…},::SArray{…})::Float64
Moelf commented 1 year ago

JET.jl has a ton of stuff

julia> @report_opt generateXRay(nav2, world2, 1000, 1)
═════ 113 possible errors found ═════
┌ @ /home/akako/Documents/github/Geom4hep/examples/XRay.jl:11 world = getWorld(vol)
│┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:79 extent(vol.shape)
││┌ @ /home/akako/Documents/github/Geom4hep/src/Boolean.jl:43 OP Geom4hep.:(==) :Union
│││┌ @ gcutils.jl:4 isequal(%1, v)
││││ runtime dispatch detected: isequal(%1::Any, v::Symbol)::Bool
│││└────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:195 Geom4hep.extent(::NoShape{Float64, PlacedVolume{Float64}})
│││ failed to optimize: Geom4hep.extent(::NoShape{Float64, PlacedVolume{Float64}})
││└───────────────────────────────────────────────────────────
││┌ @ none:0 extent(pvol.volume.shape)
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Polycone.jl:72  = iterate(pcone.sections)
││││┌ @ abstractarray.jl:1165 eachindex(A)
│││││┌ @ abstractarray.jl:285 Base.axes1(A)
││││││┌ @ abstractarray.jl:116 axes(A)
│││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/abstractarray.jl:11 Size(s)
││││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/traits.jl:88 Size(T)
│││││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/traits.jl:90 StaticArrays.missing_size_error(SA)
││││││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/traits.jl:73 string("The size of type `", SA, "` is not known.\n\nIf you were trying to construct (or `convert` to) a `StaticArray` you\nmay need to add the size explicitly as a type parameter so its size is\ninferrable to the Julia compiler (or performance would be terrible). For\nexample, you might try\n\n    m = zeros(3,3)\n    SMatrix(m)            # this error\n    SMatrix{3,3}(m)       # correct - size is inferrable\n    SArray{Tuple{3,3}}(m) # correct, note Tuple{3,3}\n")
│││││││││││┌ @ strings/io.jl:185 Base.print_to_string(xs...)
││││││││││││┌ @ strings/io.jl:144 print(s, x)
│││││││││││││┌ @ strings/io.jl:35 show(io, x)
││││││││││││││┌ @ show.jl:881 Base._show_type(io, Base.inferencebarrier(x))
│││││││││││││││┌ @ show.jl:884 Base.show_type_name(io, typeassert(Base.unwrap_unionall(x), Base.DataType).name)
││││││││││││││││┌ @ show.jl:971 show(io, tn.module)
│││││││││││││││││┌ @ show.jl:1103 Base.is_root_module(m)
││││││││││││││││││┌ @ lock.jl:221 lock(temp)
│││││││││││││││││││┌ @ lock.jl:103 slowlock(rl)
││││││││││││││││││││┌ @ lock.jl:112 wait(c)
│││││││││││││││││││││┌ @ condition.jl:126 Base.list_deletefirst!(ct.queue, ct)
││││││││││││││││││││││┌ @ linked_list.jl:145 isequal(h.value, val)
│││││││││││││││││││││││┌ @ gcutils.jl:4 isequal(%1, v)
││││││││││││││││││││││││ runtime dispatch detected: isequal(%1::Any, v::Task)::Bool
│││││││││││││││││││││││└────────────────
│││││││││││││││││││││┌ @ condition.jl:126 Base.list_deletefirst!(%45, %39)
││││││││││││││││││││││ runtime dispatch detected: Base.list_deletefirst!(%45::Any, %39::Task)::Any
│││││││││││││││││││││└────────────────────
││││││││││││││││││┌ @ lock.jl:225 unlock(temp)
│││││││││││││││││││┌ @ lock.jl:133 _unlock(rl)
││││││││││││││││││││┌ @ lock.jl:139 notifywaiters(rl)
│││││││││││││││││││││┌ @ lock.jl:143  = notify(cond_wait)
││││││││││││││││││││││┌ @ condition.jl:142 #self#(c, Base.nothing)
│││││││││││││││││││││││┌ @ condition.jl:142 Base.:(var"#notify#585")(true, false, #self#, c, arg)
││││││││││││││││││││││││┌ @ condition.jl:142 notify(c, arg, all, error)
│││││││││││││││││││││││││┌ @ condition.jl:148 Core.kwfunc(schedule)(NamedTuple{(:error,)}(tuple(error)), schedule, t, arg)
││││││││││││││││││││││││││┌ @ task.jl:789 Base.:(var"#schedule#612")(error, _3, t, arg)
│││││││││││││││││││││││││││┌ @ task.jl:793 %10(%11, t)
││││││││││││││││││││││││││││ runtime dispatch detected: %10::typeof(Base.list_deletefirst!)(%11::Any, t::Task)::Any
│││││││││││││││││││││││││││└───────────────
│││││││││││││││┌ @ show.jl:886 Base.show_typealias(io, x)
││││││││││││││││┌ @ show.jl:721 alias = Base.make_typealias(properx)
│││││││││││││││││┌ @ show.jl:531 mods = Base.modulesof!(Set{Module}(), x)
││││││││││││││││││┌ @ show.jl:506 Base.modulesof!(s, %20)
│││││││││││││││││││ runtime dispatch detected: Base.modulesof!(s::Set{Module}, %20::Any)::Any
││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:507 Base.modulesof!(s, %30)
│││││││││││││││││││ runtime dispatch detected: Base.modulesof!(s::Set{Module}, %30::Any)::Any
││││││││││││││││││└───────────────
│││││││││││││││││┌ @ show.jl:535 Base.uniontypes(Base.unwrap_unionall(x))
││││││││││││││││││┌ @ reflection.jl:931 Base._uniontypes(x, Base.Any[])
│││││││││││││││││││┌ @ reflection.jl:929 Base._uniontypes(%1, ts)
││││││││││││││││││││ runtime dispatch detected: Base._uniontypes(%1::Any, ts::Vector{Any})::Any
│││││││││││││││││││└─────────────────────
│││││││││││││││││││┌ @ reflection.jl:929 Base._uniontypes(%3, ts)
││││││││││││││││││││ runtime dispatch detected: Base._uniontypes(%3::Any, ts::Vector{Any})::Any
│││││││││││││││││││└─────────────────────
││││││││││││││││││┌ @ reflection.jl:931 Base._uniontypes(x, %1)
│││││││││││││││││││ runtime dispatch detected: Base._uniontypes(x::Any, %1::Vector{Any})::Vector{Any}
││││││││││││││││││└─────────────────────
│││││││││││││││││┌ @ show.jl:566 applied = Base.rewrap_unionall(applied, p)
││││││││││││││││││┌ @ essentials.jl:273 Base.rewrap_unionall(%5, u)
│││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(%5::Any, u::UnionAll)::Any
││││││││││││││││││└─────────────────────
││││││││││││││││││┌ @ essentials.jl:273 Base.rewrap_unionall(%8, %9)
│││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(%8::Any, %9::UnionAll)::Any
││││││││││││││││││└─────────────────────
││││││││││││││││││┌ @ essentials.jl:265 Base.rewrap_unionall(t, %3)
│││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(t::Any, %3::Any)::Any
││││││││││││││││││└─────────────────────
││││││││││││││││││┌ @ essentials.jl:265 Base.rewrap_unionall(t, %8)
│││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(t::Any, %8::Any)::Any
││││││││││││││││││└─────────────────────
│││││││││││││││││┌ @ show.jl:566 Base.rewrap_unionall(%373, %371)
││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(%373::Any, %371::UnionAll)::Any
│││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:724 Base.show_typealias(io, alias[1], x, alias[2], wheres)
│││││││││││││││││┌ @ show.jl:671 Base.show_typeparams(io, env, Core.svec(vars...), wheres)
││││││││││││││││││┌ @ show.jl:635 show(io, p)
│││││││││││││││││││┌ @ show.jl:2458 show_bound(io, lb)
││││││││││││││││││││┌ @ show.jl:2450 show(io, b)
│││││││││││││││││││││ runtime dispatch detected: show(io::IOContext{IOBuffer}, b::Any)::Any
││││││││││││││││││││└────────────────
││││││││││││││││││┌ @ show.jl:630 show(io, %262)
│││││││││││││││││││ runtime dispatch detected: show(io::IOContext{IOBuffer}, %262::Any)::Any
││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:633 show(io, %319)
│││││││││││││││││││ runtime dispatch detected: show(io::IOContext{IOBuffer}, %319::Any)::Any
││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:638 show(io, %204)
│││││││││││││││││││ runtime dispatch detected: show(io::IOContext{IOBuffer}, %204::Any)::Any
││││││││││││││││││└───────────────
│││││││││││││││┌ @ show.jl:889 Base.show_datatype(io, x)
││││││││││││││││┌ @ show.jl:989 #self#(io, x, Base.TypeVar[])
│││││││││││││││││┌ @ show.jl:1011 Base.show_typeparams(io, parameters, typeassert(Base.unwrap_unionall(x.name.wrapper), Base.DataType).parameters, wheres)
││││││││││││││││││┌ @ show.jl:635 show(io, p)
│││││││││││││││││││┌ @ show.jl:2458 show_bound(io, lb)
││││││││││││││││││││┌ @ show.jl:2450 show(io, b)
│││││││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, b::Any)::Any
││││││││││││││││││││└────────────────
││││││││││││││││││┌ @ show.jl:630 show(io, %260)
│││││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, %260::Any)::Any
││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:633 show(io, %316)
│││││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, %316::Any)::Any
││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:638 show(io, %203)
│││││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, %203::Any)::Any
││││││││││││││││││└───────────────
│││││││││││││││││┌ @ show.jl:997 show(io, %22)
││││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, %22::Any)::Any
│││││││││││││││││└───────────────
│││││││││││││││││┌ @ show.jl:1005 show(io, %64)
││││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, %64::Any)::Any
│││││││││││││││││└────────────────
│││││││││││││││┌ @ show.jl:892 Base.show_unionaliases(io, x)
││││││││││││││││┌ @ show.jl:812 Base.make_typealiases(properx)
│││││││││││││││││┌ @ show.jl:733 mods = Base.modulesof!(Set{Module}(), x)
││││││││││││││││││┌ @ show.jl:506 Base.modulesof!(s, x.a)
│││││││││││││││││││┌ @ show.jl:500 Base.modulesof!(s, %1)
││││││││││││││││││││ runtime dispatch detected: Base.modulesof!(s::Set{Module}, %1::Any)::Set{Module}
│││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:506 Base.modulesof!(s, %3)
│││││││││││││││││││ runtime dispatch detected: Base.modulesof!(s::Set{Module}, %3::Any)::Any
││││││││││││││││││└───────────────
││││││││││││││││││┌ @ show.jl:507 Base.modulesof!(s, %12)
│││││││││││││││││││ runtime dispatch detected: Base.modulesof!(s::Set{Module}, %12::Any)::Any
││││││││││││││││││└───────────────
│││││││││││││││││┌ @ show.jl:768 ul = Base.unionlen(applied)
││││││││││││││││││┌ @ reflection.jl:926 Base.unionlen(%1)
│││││││││││││││││││ runtime dispatch detected: Base.unionlen(%1::Any)::Int64
││││││││││││││││││└─────────────────────
││││││││││││││││││┌ @ reflection.jl:926 Base.unionlen(%3)
│││││││││││││││││││ runtime dispatch detected: Base.unionlen(%3::Any)::Int64
││││││││││││││││││└─────────────────────
│││││││││││││││││┌ @ show.jl:740 Base.rewrap_unionall(%42, x)
││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(%42::Any, x::Union)::Any
│││││││││││││││││└───────────────
│││││││││││││││││┌ @ show.jl:768 Base.unionlen(%390)
││││││││││││││││││ runtime dispatch detected: Base.unionlen(%390::Any)::Int64
│││││││││││││││││└───────────────
│││││││││││││││││┌ @ show.jl:770 Base.rewrap_unionall(%414, %412)
││││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(%414::Any, %412::UnionAll)::Any
│││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:812 Base.make_typealiases(%10)
│││││││││││││││││ runtime dispatch detected: Base.make_typealiases(%10::Union{Type{Any}, Union})::Tuple{Vector{Core.SimpleVector}, Type}
││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:820 Base.rewrap_unionall(%47, %10)
│││││││││││││││││ runtime dispatch detected: Base.rewrap_unionall(%47::Any, %10::Union{Type{Any}, Union})::Any
││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:825 show(io, %47)
│││││││││││││││││ runtime dispatch detected: show(io::IOBuffer, %47::Any)::Any
││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:831 Base.show_typealias(io, %137, x, %127, %128)
│││││││││││││││││ runtime dispatch detected: Base.show_typealias(io::IOBuffer, %137::Any, x::Union, %127::Core.SimpleVector, %128::Vector{TypeVar})::Any
││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:839 Base.show_typealias(io, %205, x, %195, %196)
│││││││││││││││││ runtime dispatch detected: Base.show_typealias(io::IOBuffer, %205::Any, x::Union, %195::Core.SimpleVector, %196::Vector{TypeVar})::Any
││││││││││││││││└───────────────
│││││││││││││││┌ @ show.jl:896 Base.show_delim_array(io, Base.uniontypes(x), {, ,, }, false)
││││││││││││││││┌ @ show.jl:1201 #self#(io, itr, op, delim, cl, delim_one, first(LinearIndices(itr)), last(LinearIndices(itr)))
│││││││││││││││││┌ @ show.jl:1212 show(%5, %42)
││││││││││││││││││ runtime dispatch detected: show(%5::IOContext{IOBuffer}, %42::Any)::Any
│││││││││││││││││└────────────────
│││││││││││││││┌ @ show.jl:923 Base.show_datatype(io, x, wheres)
││││││││││││││││┌ @ show.jl:1010 Base.show_type_name(io, x.name)
│││││││││││││││││┌ @ show.jl:970 Base.isvisible(%36, %84, %78)
││││││││││││││││││ runtime dispatch detected: Base.isvisible(%36::Symbol, %84::Module, %78::Any)::Bool
│││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:997 show(io, %22)
│││││││││││││││││ runtime dispatch detected: show(io::IOContext{IOBuffer}, %22::Any)::Any
││││││││││││││││└───────────────
││││││││││││││││┌ @ show.jl:1005 show(io, %66)
│││││││││││││││││ runtime dispatch detected: show(io::IOContext{IOBuffer}, %66::Any)::Any
││││││││││││││││└────────────────
│││││││││││││││┌ @ show.jl:925 show(%254, %80)
││││││││││││││││ runtime dispatch detected: show(%254::IOContext{IOBuffer}, %80::Any)::Any
│││││││││││││││└───────────────
││││││││││││││┌ @ show.jl:881 Base._show_type(io, x)
│││││││││││││││ runtime dispatch detected: Base._show_type(io::IOBuffer, x::Type)::Nothing
││││││││││││││└───────────────
│││││││││││││┌ @ strings/io.jl:35 show(io, x)
││││││││││││││ runtime dispatch detected: show(io::IOBuffer, x::Type)::Any
│││││││││││││└────────────────────
││││││││││││┌ @ strings/io.jl:139 Base._str_sizehint(%4)
│││││││││││││ runtime dispatch detected: Base._str_sizehint(%4::Union{String, Type})::Int64
││││││││││││└─────────────────────
││││││││││││┌ @ strings/io.jl:144 print(%39, %42)
│││││││││││││ runtime dispatch detected: print(%39::IOBuffer, %42::Union{String, Type})::Any
││││││││││││└─────────────────────
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Polycone.jl:80 SVector{3, Float64}(0, 0, pcone.zᵢ[N] Geom4hep.:+ pcone.sections[N].z)
││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:160 StaticArrays.construct_type(SA, StaticArrays.Args(x))
│││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:87 StaticArrays.adapt_size(SA, x)
││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:94 StaticArrays.length_match_size(SA, x)
│││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:125 StaticArrays._no_precise_size(SA, x)
││││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:155 StaticArrays._no_precise_size(SA, x.args)
│││││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:156 string("No precise constructor for ", SA, " found. Length of input was ", StaticArrays.length(x), ".")
││││││││││┌ @ strings/io.jl:185 Base.print_to_string(xs...)
│││││││││││┌ @ strings/io.jl:144 print(s, x)
││││││││││││┌ @ strings/io.jl:35 show(io, x)
│││││││││││││┌ @ show.jl:881 Base._show_type(io, x)
││││││││││││││ runtime dispatch detected: Base._show_type(io::IOBuffer, x::DataType)::Nothing
│││││││││││││└───────────────
│││││││││││┌ @ strings/io.jl:144 print(%83, %86)
││││││││││││ runtime dispatch detected: print(%83::IOBuffer, %86::Any)::Any
│││││││││││└─────────────────────
││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:97 Length(x)
│││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:9 Length(StaticArrays.length(x.args))
││││││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/traits.jl:116 %1()
│││││││││ runtime dispatch detected: %1::Type{Length{_A}} where _A()::Length
││││││││└────────────────────────────────────────────────────────────────────
││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:160 StaticArrays.Args(x)
│││││ runtime dispatch detected: StaticArrays.Args(x::Tuple{Int64, Int64, Any})::StaticArrays.Args
││││└─────────────────────────────────────────────────────────────────────
││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:160 StaticArrays.construct_type(%1, %2)
│││││ runtime dispatch detected: StaticArrays.construct_type(%1::Type{SVector{3, Float64}}, %2::StaticArrays.Args)::Any
││││└─────────────────────────────────────────────────────────────────────
││││┌ @ /home/akako/.julia/packages/StaticArrays/6QFsp/src/convert.jl:160 %3(x)
│││││ runtime dispatch detected: %3::Any(x::Tuple{Int64, Int64, Any})::Any
││││└─────────────────────────────────────────────────────────────────────
││┌ @ none:0 extent(%2)
│││ runtime dispatch detected: extent(%2::Shape{Float64})::Tuple{Point3{Float64}, Point3{Float64}}
││└──────────
│┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:82 mat = Core.kwfunc(Material{Float64})(NamedTuple{(:density,)}(tuple(0)), Material{Float64}, "vacuum")
││┌ @ /home/akako/Documents/github/Geom4hep/src/Materials.jl:21 [quote](state, density, temperature, mass, composition, _3, name)
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Materials.jl:23 convert(fieldtype(Material{Float64}, 6), composition)
││││┌ @ array.jl:617 T(a)
│││││┌ @ array.jl:626 Base.copyto_axcheck!(Vector{NamedTuple{(:fraction, :element), Tuple{Float64, Element{Float64}}}}(Base.undef, size(x)), x)
││││││┌ @ abstractarray.jl:1127 copyto!(dest, src)
│││││││┌ @ array.jl:343 copyto!(dest, 1, src, 1, length(src))
││││││││┌ @ array.jl:317 Base._copyto_impl!(dest, doffs, src, soffs, n)
│││││││││┌ @ array.jl:331 unsafe_copyto!(dest, doffs, src, soffs, n)
││││││││││┌ @ array.jl:307 Base._unsafe_copyto!(dest, doffs, src, soffs, n)
│││││││││││┌ @ array.jl:253 dest[doffs + i - 1] = src[soffs + i - 1]
││││││││││││┌ @ array.jl:966 convert(T, x)
│││││││││││││┌ @ namedtuple.jl:151 T(nt)
││││││││││││││┌ @ namedtuple.jl:156 convert(T, Base.Tuple(nt))
│││││││││││││││┌ @ essentials.jl:339 Val(N)
││││││││││││││││┌ @ essentials.jl:714 %1()
│││││││││││││││││ runtime dispatch detected: %1::Type{Val{_A}} where _A()::Val
││││││││││││││││└─────────────────────
│││││││││││┌ @ array.jl:261 dest[%73] = %71
││││││││││││ runtime dispatch detected: (dest::Vector{NamedTuple{(:fraction, :element), Tuple{Float64, Element{Float64}}}})[%73::Int64] = %71::Any::Any
│││││││││││└────────────────
│││││││││││┌ @ array.jl:253 dest[%182] = %180
││││││││││││ runtime dispatch detected: (dest::Vector{NamedTuple{(:fraction, :element), Tuple{Float64, Element{Float64}}}})[%182::Int64] = %180::Any::Any
│││││││││││└────────────────
│┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:79 extent(%5)
││ runtime dispatch detected: extent(%5::Shape{Float64})::Tuple{Point3{Float64}, Point3{Float64}}
│└──────────────────────────────────────────────────────────
┌ @ /home/akako/Documents/github/Geom4hep/examples/XRay.jl:30 locateGlobalPoint!(state, point)
│┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:131 Geom4hep.contains(vol, point)
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:60 inside(vol.shape, p)
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:202 inout = inside(pvol.volume.shape, pvol.transformation Geom4hep.:* point)
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Polycone.jl:85 indexLow = getSectionIndex(pcone, z Geom4hep.:- kTolerance(T))
│││││┌ @ /home/akako/Documents/github/Geom4hep/src/Polycone.jl:38 1 Geom4hep.:(:) N
││││││┌ @ range.jl:3 promote(a, b)
│││││││┌ @ promotion.jl:359 Base._promote(x, y)
││││││││┌ @ promotion.jl:335 R = promote_type(T, S)
│││││││││┌ @ promotion.jl:298 Base.promote_result(T, S, promote_rule(T, S), promote_rule(S, T))
││││││││││┌ @ promotion.jl:312 promote_type(T, S)
│││││││││││┌ @ promotion.jl:298 Base.promote_result(T, S, promote_rule(T, S), promote_rule(S, T))
││││││││││││┌ @ promotion.jl:315 typejoin(T, S)
│││││││││││││┌ @ promotion.jl:27 typejoin(a, b.body)
││││││││││││││┌ @ promotion.jl:31 typejoin(b.a, b.b)
│││││││││││││││┌ @ promotion.jl:54 Base.unwrapva(%125)
││││││││││││││││ runtime dispatch detected: Base.unwrapva(%125::Any)::Any
│││││││││││││││└───────────────────
│││││││││││││││┌ @ promotion.jl:64 Base.unwrapva(%195)
││││││││││││││││ runtime dispatch detected: Base.unwrapva(%195::Any)::Any
│││││││││││││││└───────────────────
│││││││││││││││┌ @ promotion.jl:77 Base.unwrapva(%266)
││││││││││││││││ runtime dispatch detected: Base.unwrapva(%266::Any)::Any
│││││││││││││││└───────────────────
│││││││││││││││┌ @ promotion.jl:77 Base.unwrapva(%285)
││││││││││││││││ runtime dispatch detected: Base.unwrapva(%285::Any)::Any
│││││││││││││││└───────────────────
│││││││││││││││┌ @ promotion.jl:112 %453.var
││││││││││││││││ runtime dispatch detected: (%453::Any).var::Any
│││││││││││││││└────────────────────
│││││││││││││││┌ @ promotion.jl:113 %453.body
││││││││││││││││ runtime dispatch detected: (%453::Any).body::Any
│││││││││││││││└────────────────────
│││││││││││││││┌ @ promotion.jl:117 Base.UnionAll(%567, %569)
││││││││││││││││ runtime dispatch detected: Base.UnionAll(%567::Any, %569::Any)::Any
│││││││││││││││└────────────────────
│││││││││││││││┌ @ promotion.jl:229 Base.unwrapva(%19)
││││││││││││││││ runtime dispatch detected: Base.unwrapva(%19::Any)::Any
│││││││││││││││└────────────────────
│││││││││││││││┌ @ promotion.jl:233 Base.unwrapva(%73)
││││││││││││││││ runtime dispatch detected: Base.unwrapva(%73::Any)::Any
│││││││││││││││└────────────────────
││││││││││││││┌ @ promotion.jl:54 Base.unwrapva(%102)
│││││││││││││││ runtime dispatch detected: Base.unwrapva(%102::Any)::Any
││││││││││││││└───────────────────
││││││││││││││┌ @ promotion.jl:64 Base.unwrapva(%172)
│││││││││││││││ runtime dispatch detected: Base.unwrapva(%172::Any)::Any
││││││││││││││└───────────────────
││││││││││││││┌ @ promotion.jl:77 Base.unwrapva(%243)
│││││││││││││││ runtime dispatch detected: Base.unwrapva(%243::Any)::Any
││││││││││││││└───────────────────
││││││││││││││┌ @ promotion.jl:77 Base.unwrapva(%262)
│││││││││││││││ runtime dispatch detected: Base.unwrapva(%262::Any)::Any
││││││││││││││└───────────────────
││││││││││││││┌ @ promotion.jl:112 %430.var
│││││││││││││││ runtime dispatch detected: (%430::Any).var::Any
││││││││││││││└────────────────────
││││││││││││││┌ @ promotion.jl:113 %430.body
│││││││││││││││ runtime dispatch detected: (%430::Any).body::Any
││││││││││││││└────────────────────
││││││││││││││┌ @ promotion.jl:117 Base.UnionAll(%544, %546)
│││││││││││││││ runtime dispatch detected: Base.UnionAll(%544::Any, %546::Any)::Any
││││││││││││││└────────────────────
│││││││││││││┌ @ promotion.jl:54 Base.unwrapva(%97)
││││││││││││││ runtime dispatch detected: Base.unwrapva(%97::Any)::Any
│││││││││││││└───────────────────
│││││││││││││┌ @ promotion.jl:64 Base.unwrapva(%167)
││││││││││││││ runtime dispatch detected: Base.unwrapva(%167::Any)::Any
│││││││││││││└───────────────────
│││││││││││││┌ @ promotion.jl:77 Base.unwrapva(%238)
││││││││││││││ runtime dispatch detected: Base.unwrapva(%238::Any)::Any
│││││││││││││└───────────────────
│││││││││││││┌ @ promotion.jl:77 Base.unwrapva(%257)
││││││││││││││ runtime dispatch detected: Base.unwrapva(%257::Any)::Any
│││││││││││││└───────────────────
│││││││││││││┌ @ promotion.jl:112 %425.var
││││││││││││││ runtime dispatch detected: (%425::Any).var::Any
│││││││││││││└────────────────────
│││││││││││││┌ @ promotion.jl:113 %425.body
││││││││││││││ runtime dispatch detected: (%425::Any).body::Any
│││││││││││││└────────────────────
│││││││││││││┌ @ promotion.jl:117 Base.UnionAll(%539, %541)
││││││││││││││ runtime dispatch detected: Base.UnionAll(%539::Any, %541::Any)::Any
│││││││││││││└────────────────────
│││││││┌ @ promotion.jl:360 Base.not_sametype(tuple(x, y), tuple(px, py))
││││││││┌ @ promotion.jl:377 Base.sametype_error(x)
│││││││││┌ @ promotion.jl:383 map(#43, input)
││││││││││┌ @ tuple.jl:222 f(t[1])
│││││││││││┌ @ promotion.jl:384 string(Base.typeof(x))
││││││││││││┌ @ strings/io.jl:185 Base.print_to_string(xs...)
│││││││││││││┌ @ strings/io.jl:144 print(s, x)
││││││││││││││┌ @ strings/io.jl:35 show(io, x)
│││││││││││││││┌ @ show.jl:881 Base._show_type(io, )
││││││││││││││││ runtime dispatch detected: Base._show_type(io::IOBuffer, ::Int64)::Nothing
│││││││││││││││└───────────────
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:202 inside(%25, %81)
││││ runtime dispatch detected: inside(%25::Shape{Float64}, %81::Point3{Float64})::Any
│││└───────────────────────────────────────────────────────────
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:203 %210 Geom4hep.:(==) kInside
││││ runtime dispatch detected: (%210::Any Geom4hep.:(==) kInside)::Any
│││└───────────────────────────────────────────────────────────
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:203 %210 Geom4hep.:(==) kSurface
││││ runtime dispatch detected: (%210::Any Geom4hep.:(==) kSurface)::Any
│││└───────────────────────────────────────────────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:60 inside(%1, p)
│││ runtime dispatch detected: inside(%1::Shape{Float64}, p::Point3{Float64})::Any
││└──────────────────────────────────────────────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:60 %137 Geom4hep.:(==) kInside
│││ runtime dispatch detected: (%137::Any Geom4hep.:(==) kInside)::Any
││└──────────────────────────────────────────────────────────
│┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:140 Geom4hep.contains(pvol, point)
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:59 inside(%2, %58)
│││ runtime dispatch detected: inside(%2::Shape{Float64}, %58::Point3{Float64})::Any
││└──────────────────────────────────────────────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:59 %187 Geom4hep.:(==) kInside
│││ runtime dispatch detected: (%187::Any Geom4hep.:(==) kInside)::Any
││└──────────────────────────────────────────────────────────
┌ @ /home/akako/Documents/github/Geom4hep/examples/XRay.jl:37 step = computeStep!(state, point, dir, T(1000))
│┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:176 getClosestDaughter(state.navigator, volume, lpoint, ldir, step_limit)
││┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:162 dist = distanceToIn(pvol, point, dir)
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:61 distanceToIn(pvol.volume.shape, pvol.transformation Geom4hep.:* p, pvol.transformation Geom4hep.:* d)
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:231 normal(pvol.volume.shape, lpoint) Geom4hep.:⋅ ldir
│││││┌ @ /cache/build/default-amdci4-3/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:843 LinearAlgebra.iterate(x)
││││││ runtime dispatch detected: LinearAlgebra.iterate(x::Nothing)::Union{}
│││││└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:228 inside(%82, %80)
│││││ runtime dispatch detected: inside(%82::Shape{Float64}, %80::Point3{Float64})::Any
││││└───────────────────────────────────────────────────────────
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:229 %211 Geom4hep.:(==) kInside
│││││ runtime dispatch detected: (%211::Any Geom4hep.:(==) kInside)::Any
││││└───────────────────────────────────────────────────────────
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:231 %211 Geom4hep.:(==) kSurface
│││││ runtime dispatch detected: (%211::Any Geom4hep.:(==) kSurface)::Any
││││└───────────────────────────────────────────────────────────
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:231 normal(%276, %80)
│││││ runtime dispatch detected: normal(%276::Shape{Float64}, %80::Point3{Float64})::Union{Nothing, GeometryBasics.Vec3{Float64}, SVector{3, Float64}}
││││└───────────────────────────────────────────────────────────
││││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:232 distanceToIn(%413, %80, %272)
│││││ runtime dispatch detected: distanceToIn(%413::Shape{Float64}, %80::Point3{Float64}, %272::SVector{3, Float64})::Float64
││││└───────────────────────────────────────────────────────────
│││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:61 distanceToIn(%2, %58, %116)
││││ runtime dispatch detected: distanceToIn(%2::Shape{Float64}, %58::Point3{Float64}, %116::SVector{3, Float64})::Float64
│││└──────────────────────────────────────────────────────────
│┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:179 step = distanceToOut(volume.shape, lpoint, ldir)
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:217 inside(%81, %79)
│││ runtime dispatch detected: inside(%81::Shape{Float64}, %79::Point3{Float64})::Any
││└───────────────────────────────────────────────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:218 %210 Geom4hep.:(==) kSurface
│││ runtime dispatch detected: (%210::Any Geom4hep.:(==) kSurface)::Any
││└───────────────────────────────────────────────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:219 %210 Geom4hep.:(==) kInside
│││ runtime dispatch detected: (%210::Any Geom4hep.:(==) kInside)::Any
││└───────────────────────────────────────────────────────────
││┌ @ /home/akako/Documents/github/Geom4hep/src/Volume.jl:219 distanceToOut(%217, %79, %275)
│││ runtime dispatch detected: distanceToOut(%217::Shape{Float64}, %79::Point3{Float64}, %275::SVector{3, Float64})::Float64
││└───────────────────────────────────────────────────────────
│┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:179 distanceToOut(%12, %3, %4)
││ runtime dispatch detected: distanceToOut(%12::Shape{Float64}, %3::Point3{Float64}, %4::SVector{3, Float64})::Float64
│└───────────────────────────────────────────────────────────────
│┌ @ /home/akako/Documents/github/Geom4hep/src/Navigators.jl:185 string("Negative distanceToOut. \nstep = ", %80, " \nshape = ", %204, " \nvolstack = ", %205, " \npoint = ", %3, " \ndir =   ", %4)
││ runtime dispatch detected: string("Negative distanceToOut. \nstep = ", %80::Float64, " \nshape = ", %204::Shape{Float64}, " \nvolstack = ", %205::Vector{Int64}, " \npoint = ", %3::Point3{Float64}, " \ndir =   ", %4::SVector{3, Float64})::String
│└───────────────────────────────────────────────────────────────
┌ @ /home/akako/Documents/github/Geom4hep/examples/XRay.jl:39 string("step == -1 may indicate a navigation error. \nstate = ", state, " \npoint = ", point, " \ndir =   ", dir)
│┌ @ strings/io.jl:185 Base.print_to_string(xs...)
││┌ @ strings/io.jl:144 print(s, x)
│││┌ @ strings/io.jl:35 show(io, x)
││││┌ @ show.jl:391 Base.show_default(io, x)
│││││┌ @ show.jl:396 Base._show_default(io, Base.inferencebarrier(x))
││││││┌ @ show.jl:409 f = fieldname(t, i)
│││││││┌ @ reflection.jl:160 throw_field_access(t, i, n_fields)
││││││││┌ @ reflection.jl:153 string("Cannot access field ", i, " since type ", t, " only has ", n_fields, " ", field_label, ".")
│││││││││┌ @ strings/io.jl:185 Base.print_to_string(xs...)
││││││││││┌ @ strings/io.jl:144 print(%95, %109)
│││││││││││ runtime dispatch detected: print(%95::IOBuffer, %109::Any)::Any
││││││││││└─────────────────────
│││││││││┌ @ strings/io.jl:185 string(::String, ::Int64, ::String, ::DataType, ::String, ::Int64, ::String, ::String, ::String)
││││││││││ failed to optimize: string(::String, ::Int64, ::String, ::DataType, ::String, ::Int64, ::String, ::String, ::String)
│││││││││└─────────────────────
││││││││┌ @ reflection.jl:151 (::Base.var"#throw_field_access#11")(::DataType, ::Int64, ::Int64)
│││││││││ failed to optimize: (::Base.var"#throw_field_access#11")(::DataType, ::Int64, ::Int64)
││││││││└─────────────────────
│││││││┌ @ reflection.jl:161 throw_need_pos_int(i)
││││││││┌ @ reflection.jl:155 string("Field numbers must be positive integers. ", i, " is invalid.")
│││││││││┌ @ strings/io.jl:185 string(::String, ::Int64, ::String)
││││││││││ failed to optimize: string(::String, ::Int64, ::String)
│││││││││└─────────────────────
││││││││┌ @ reflection.jl:155 (::Base.var"#throw_need_pos_int#12")(::Int64)
│││││││││ failed to optimize: (::Base.var"#throw_need_pos_int#12")(::Int64)
││││││││└─────────────────────
│││││││┌ @ reflection.jl:159 length(%11)
││││││││ runtime dispatch detected: length(%11::Any)::Any
│││││││└─────────────────────
│││││││┌ @ reflection.jl:162 %11[i]
││││││││ runtime dispatch detected: (%11::Any)[i::Int64]::Any
│││││││└─────────────────────
││││││┌ @ show.jl:403 sizeof(x)
│││││││ runtime dispatch detected: sizeof(x::Any)::Any
││││││└───────────────
││││││┌ @ show.jl:409 fieldname(%1, %100)
│││││││ runtime dispatch detected: fieldname(%1::DataType, %100::Int64)::Union{Int64, Symbol}
││││││└───────────────
││││││┌ @ show.jl:413 show(%82, %116)
│││││││ runtime dispatch detected: show(%82::IOContext{IOBuffer}, %116::Any)::Any
││││││└───────────────
┌ @ /home/akako/Documents/github/Geom4hep/examples/XRay.jl:13 extent(%2)
│ runtime dispatch detected: extent(%2::Shape{Float64})::Tuple{Point3{Float64}, Point3{Float64}}
└─────────────────────────────────────────────────────────────
Moelf commented 1 year ago

maybe it's worth to try to re-write the algorithm so it doesn't recurse across 13 different methods, taking distanceToIn for example, I think it has 13 different methods and each can be potentially called by another (and also cross to / from distanceToOut), this may be causing problems for us.

Instead, it should be possible to write the program such that:

function distanceToIn
   while true
        keep_looking_...
        call _impl_distanceToIn
   end
   return dist
end

where:

  1. there would be 13 different methods for _impl_distanceToIn
  2. BUT, they don't call each other, instead, we look at when to stop from top level while loop
peremato commented 1 year ago

I do not see where do you see the recursion. I know that it must be in the case of Boolean shapes since the algorithm to distanceToIn/Out is made from calling the distanceToIn/Out methods for the left and right shapes. But I guess this does not contribute to bulk of the calls.

Moelf commented 1 year ago

https://github.com/peremato/Geom4hep/blob/a2d0e99af8eac9818dda9b7345e3924948d2d4da/src/CutTube.jl#L122-L124

https://github.com/peremato/Geom4hep/blob/a2d0e99af8eac9818dda9b7345e3924948d2d4da/src/Polycone.jl#L112

https://github.com/peremato/Geom4hep/blob/a2d0e99af8eac9818dda9b7345e3924948d2d4da/src/Polycone.jl#L189

https://github.com/peremato/Geom4hep/blob/a2d0e99af8eac9818dda9b7345e3924948d2d4da/src/Boolean.jl#L128

https://github.com/peremato/Geom4hep/blob/a2d0e99af8eac9818dda9b7345e3924948d2d4da/src/Volume.jl#L219

Moelf commented 1 year ago

my guess is that when the geometry parent&daughter becomes deep, some constant prop & inlining stop going through all of the calls (understandable) and allocation occurs.

ideally we can try to never call distanceToIn/Out from within distanceToIn/Out and just use a top-down loop

peremato commented 1 year ago

ideally we can try to never call distanceToIn/Out from within distanceToIn/Out and just use a top-down loop

I really do to know how to do this. A boolean shape is made of a boolean operation of two shapes of any kind, including boolean shapes. If Julia cannot handle (in an optimal way this case is really a problem).

The low performance and allocations is due, indeed to Boolean shapes. If I ignore them from the CMS geometry I come back to reasonable number of allocations.

I tried to re-organize the code differently:

I am running out of ideas.

Moelf commented 1 year ago

What if you make the Boolean three different types instead of using the symbol inside? It shouldn't matter but idk

peremato commented 1 year ago

Splitting Boolean into BooleanUnion, BooleanSubtraction and BooleanIntersection does not help. The performance numbers are identical.

I wanted to try to see what can be done with the JET.jl report, but I have a problem running it. I get the flowing error:

julia> using JET
julia> report_and_watch_file("examples/XRay.jl"; annotate_types = true)
[toplevel-info] virtualized the context of Main (took 0.137 sec)
[toplevel-info] entered into examples/XRay.jl
ERROR: AssertionError: `get_binding_type` should resolve the already-defined variable type

Any idea?

Moelf commented 1 year ago

Are you using 1.8.0?

peremato commented 1 year ago

Yes

Moelf commented 1 year ago

hmm, not sure, I'm just using

@report_opt generateXRay(nav2, world2, 1000, 1)

maybe you can try to change the file content a bit?

peremato commented 1 year ago

Yes, this way of invoking JET works. I added a few missing return types, way go looping (e.g. using eachindex(...)) but nothing makes any change.

Moelf commented 1 year ago

that's what I observed as well

charleskawczynski commented 1 year ago

Is this with Geom4hep/examples/XRay.jl example? rather, can you post an mwe (even of just the representative work)?

Moelf commented 1 year ago

@charleskawczynski

it's basically XRay.jl keep the definition and starting line 56, compare these two:

1

    const full = processGDML("examples/trackML.gdml", Float64)
    const volume = full[2,1]
    const world = getWorld(volume)
    const nav = BVHNavigator(world)
 generateXRay(nav, world, 1e4, 1)

2


    const full2 = processGDML("examples/cms2018.gdml", Float64)
    const volume2 = full2[1,7,1]
    const world2 = getWorld(volume2)
    const nav2 = BVHNavigator(world2)
generateXRay(nav2, world2, 1e4, 1)
charleskawczynski commented 1 year ago

Looks like kOutside is undefined locally. Does this take the value 2 from Geom4hep/src/BasicTypes.jl? If so, then I think this function (below) looks type-unstable. Sometimes it returns a Bool, here:

inside(bb::AABB{T}, point::Point3{T}) where T =  all(bb.min .< point .< bb.max)

and (most) others an Int. Maybe this is making inference / inlining fail (resulting in runtime allocations)?

function inside(agg::Aggregate{T}, point::Point3{T}) where T<:AbstractFloat
    for pvol in agg.pvolumes
        inout = inside(pvol.volume.shape, pvol.transformation * point)
        (inout == kInside || inout == kSurface) && return inout
    end
    return kOutside
end

It looks like this may be where jet is first failing, at least

Moelf commented 1 year ago

oh, it's assumied Geom4hep is loaded (using)

charleskawczynski commented 1 year ago

Ah, my bad, I didn't realize that the second case was the real offender with allocations

charleskawczynski commented 1 year ago

Type inference is improved with adding ::Int hints to inside:

function inside(agg::Aggregate{T}, point::Point3{T})::Int where T<:AbstractFloat
...

Before/after:

  6.296787 seconds (35.19 M allocations: 1.019 GiB, 4.40% gc time, 41.81% compilation time)
  3.686026 seconds (29.79 M allocations: 757.571 MiB, 4.34% gc time)
(-830.1:16.76969696969697:830.1, -661.1:13.355555555555556:661.1, [0.0 0.0 … 0.0 0.0; 0.0 0.39888076782051646 … 0.39888076782051646 0.0; … ; 0.0 0.39888076782051646 … 0.39888076782051646 0.0; 0.0 0.0 … 0.0 0.0])

  5.916980 seconds (35.44 M allocations: 1.032 GiB, 3.53% gc time, 43.72% compilation time)
  3.267679 seconds (29.79 M allocations: 757.571 MiB, 4.01% gc time)
(-830.1:16.76969696969697:830.1, -661.1:13.355555555555556:661.1, [0.0 0.0 … 0.0 0.0; 0.0 0.39888076782051646 … 0.39888076782051646 0.0; … ; 0.0 0.39888076782051646 … 0.39888076782051646 0.0; 0.0 0.0 … 0.0 0.0])

A combination of things may be needed

Moelf commented 1 year ago

29.79 M allocations

this is the main issue, we couldn't find what's causing it because compiler/inference allocation shows up in timing but not in any profiling, in profiler it shows up as "unknown"

Moelf commented 1 year ago

@peremato I honestly wanna do the whole thing dynamically like you would in C++, meaning use a ENUM instead of types and switch based on values

how is this even implemented in C++, can't be overload?

charleskawczynski commented 1 year ago

The next thing I ran into was failing inference at Vector3{T}(0,0,pcone.zᵢ[index]), dir). I think the issue is that iteration over aggregate volumes is potentially empty, resulting in an unknown eltype, and failure to infer T in distanceToIn. JET is saying:

  │││││┌ @ dev/clones/Geom4hep/src/Volume.jl:234 normal(pvol.volume.shape, lpoint) Geom4hep.:⋅ ldir
  ││││││┌ @ /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/generic.jl:843 LinearAlgebra.iterate(x)
  │││││││ runtime dispatch detected: LinearAlgebra.iterate(x::Nothing)::Union{}
Moelf commented 1 year ago

that looks like a JET bug, why are we calling iterate from LinearAlgebra :laughing:

peremato commented 1 year ago

@charleskawczynski Thanks for looking into it.

3.686026 seconds (29.79 M allocations: 757.571 MiB, 4.34% gc time)

3.267679 seconds (29.79 M allocations: 757.571 MiB, 4.01% gc time)

Yes, I tried to add the missing type declarations in returns and probably helps a little bit. But what I am looking is a factor 2-3 in performance. When I run the simple geometry and compare with an equivalent C++ library I get a very comparable performance (even slightly faster for Julia). However using the complex geometry the Julia implementation is 2.5 times slower. I guess mainly to the skyrocketing number of allocations.

peremato commented 1 year ago

The next thing I ran into was failing inference at Vector3{T}(0,0,pcone.zᵢ[index]), dir). I think the issue is that iteration over aggregate volumes is potentially empty, resulting in an unknown eltype, and failure to infer T in distanceToIn.

I am a bit confused here with the output of JET. AFICS I do not have instances of aggregated volumes in the test case. Does JET.jl a dynamic analysis or a static one? If it is static, then I understand, but it does not affect the runnning performance.

peremato commented 1 year ago

@peremato I honestly wanna do the whole thing dynamically like you would in C++, meaning use a ENUM instead of types and switch based on values

how is this even implemented in C++, can't be overload?

I am not sure I understand. In C++ is implemented with pointers/references to (abstract) base classes, and using virtual methods.

Moelf commented 1 year ago

image

so the world 1 only needs these four types, the number 4 is special because that's the limit of Union splitting.

What I don't understand is why manual union splitting doesn't help us

Moelf commented 1 year ago

this can be confirmed by commenting out additional types:

diff --git a/src/GDML.jl b/src/GDML.jl
index 73afee1..bf89487 100644
--- a/src/GDML.jl
+++ b/src/GDML.jl
@@ -114,18 +114,18 @@ function fillSolids!(dicts::GDMLDicts{T}, element::XMLElement) where T<:Abstract
             e = XMLElement(c)
             elemname = name(e)
             attrs = attributes_dict(e)
-            if elemname == "box"
-                lunit = eval(Meta.parse(attrs["lunit"]))
-                solids[attrs["name"]] = Box{T}(parse(T, attrs["x"]) * lunit / 2, 
-                                               parse(T, attrs["y"]) * lunit / 2, 
-                                               parse(T, attrs["z"]) * lunit / 2)
-            elseif elemname == "trd"
+            if elemname == "trd"
                 lunit = eval(Meta.parse(attrs["lunit"]))
                 solids[attrs["name"]] = Trd{T}(parse(T, attrs["x1"]) * lunit / 2, 
                                                parse(T, attrs["x2"]) * lunit / 2, 
                                                parse(T, attrs["y1"]) * lunit / 2,
                                                parse(T, attrs["y2"]) * lunit / 2,
                                                parse(T, attrs["z"])  * lunit / 2)
+            # elseif elemname == "box"
+            #     lunit = eval(Meta.parse(attrs["lunit"]))
+            #     solids[attrs["name"]] = Box{T}(parse(T, attrs["x"]) * lunit / 2, 
+            #                                    parse(T, attrs["y"]) * lunit / 2, 
+            #                                    parse(T, attrs["z"]) * lunit / 2)
             elseif elemname == "trap"
                 lunit = eval(Meta.parse(attrs["lunit"]))
                 aunit = eval(Meta.parse(attrs["aunit"]))
@@ -184,27 +184,27 @@ function fillSolids!(dicts::GDMLDicts{T}, element::XMLElement) where T<:Abstract
                                                    parse(T, attrs["deltaphi"]) * aunit,
                                                    Vector3{T}(parse(T, attrs["lowX"]), parse(T, attrs["lowY"]), parse(T, attrs["lowZ"])),
                                                    Vector3{T}(parse(T, attrs["highX"]), parse(T, attrs["highY"]), parse(T, attrs["highZ"])))
-            elseif elemname in ("union", "subtraction", "intersection") 
-                first = nothing
-                second = nothing
-                transformation = getTransformation(dicts, e)
-                for cc in child_nodes(e)
-                    if is_elementnode(cc)
-                        aa = attributes_dict(XMLElement(cc))
-                        if name(cc) == "first"
-                            first = solids[aa["ref"]]
-                        elseif name(cc) == "second"
-                            second = solids[aa["ref"]]
-                        end
-                    end
-                end
-                if elemname == "union"
-                    solids[attrs["name"]] = BooleanUnion(first, second, transformation)
-                elseif elemname == "subtraction"
-                    solids[attrs["name"]] = BooleanSubtraction(first, second, transformation)
-                elseif elemname == "intersection"
-                    solids[attrs["name"]] = BooleanIntersection(first, second, transformation)
-                end
+            # elseif elemname in ("union", "subtraction", "intersection") 
+            #     first = nothing
+            #     second = nothing
+            #     transformation = getTransformation(dicts, e)
+            #     for cc in child_nodes(e)
+            #         if is_elementnode(cc)
+            #             aa = attributes_dict(XMLElement(cc))
+            #             if name(cc) == "first"
+            #                 first = solids[aa["ref"]]
+            #             elseif name(cc) == "second"
+            #                 second = solids[aa["ref"]]
+            #             end
+            #         end
+            #     end
+            #     if elemname == "union"
+            #         solids[attrs["name"]] = BooleanUnion(first, second, transformation)
+            #     elseif elemname == "subtraction"
+            #         solids[attrs["name"]] = BooleanSubtraction(first, second, transformation)
+            #     elseif elemname == "intersection"
+            #         solids[attrs["name"]] = BooleanIntersection(first, second, transformation)
+            #     end
             else
                 @printf "Shape %s not yet suported. Using NoShape\n" elemname
                 solids[attrs["name"]] = NoShape{T}()

and we get same performance between world 1 and world 2


julia> mainbench()
  0.012849 seconds (9 allocations: 80.453 KiB)
  0.116273 seconds (9 allocations: 80.453 KiB)
Moelf commented 1 year ago

note to self, the problem is Boolean*, not number of shapes:

https://github.com/peremato/Geom4hep/blob/76c787a8ec2ae51a258cbe8b9b08b9961419cc46/src/Volume.jl#L16-L18

here Boolean* only has a single {T}, but Boolean stuff themselves have 3 parametric types: https://github.com/peremato/Geom4hep/blob/76c787a8ec2ae51a258cbe8b9b08b9961419cc46/src/Boolean.jl#L2-L7

and the use of const Shape{T} = Union{NoShape{T}...} inside definition of: https://github.com/peremato/Geom4hep/blob/76c787a8ec2ae51a258cbe8b9b08b9961419cc46/src/Volume.jl#L21-L26

might be what's stopping manual splitting from working -- because the compiler still would split for this type, regardless of what our methods say

Moelf commented 1 year ago

further more, it's one of {BooleanUnion, BooleanSubtraction}

grasph commented 1 year ago

Hello Pere and Jerry,

You can find a version without the abstract type removed from the Boolean* structs here:

https://github.com/grasph/Geom4hep-fork/tree/remove-abstract-type-from-container

Allocations are reduced to 9 (7.6MiB) for the x-projection. Time performance is improved by 10% only.

You can find in the link below an optimization that removed a costly objecid hash calculation on top of the allocation one:

https://github.com/grasph/Geom4hep-fork/tree/removed-objectid

Overall the time improvement is 30%, far from the desired x6 factor.

Measurements done with @time macro:

cms2018 geometry
version warm-up
100 points
[s]
meas.
1e6 points
[s]
mem alloc gc
time
master 0.34 335 2.91 G (72.2 GiB) 2.06%
abstract-type-removed 0.023 303 9 (7.63 MiB) 0
removed-objectid 0.016 210 9 (7.632 MiB) 0

mem. alloc and gc time correspond to the measurement run after the warm-up run.

trackML geomtry
version warm-up
100 points
[s]
meas.
1e6 points
[s]
mem alloc gc
time
master 0.000093 1.30 9 (7.6 MiB) 0
abstract-type-removed 0.00010 1.34 9 (7.6 MiB) 0
removed-objectid 0.000048 0.62 9 (7.6 MiB ) 0

Philippe.

peremato commented 1 year ago

@grasph It is very good that you eliminated the giga allocations. It is a bit convoluted but it gives the results. Now is probably a problem of dynamic dispatch that we the help of manual split perhaps can give some results. The optimization of not using objectid() is also very good.

Moelf commented 1 year ago

in 20/20 hindsight, it was expected to only speed up slightly if we ONLY eliminate the allocation of dynamic dispatch (and that the GC % was a small number in the original case).

After all, the difference is more or less just the GC, we are still letting Julia's dynamic dispatch jumping through hoops, but I totally agree this is fantastic -- it's much easier to incrementally manually split out dynamic dispatch when the allocation is not there; before, I had a felling that I need to do everything at once in order to see speed up and remove allocation