Open mjram0s opened 3 years ago
Welcome, and great first issue!
This is a really interesting edge case: since MyType has, in a sense, two types (DataType and Type{MyType}), which should we pick? On balance I lean away from greater specificity here, because in many cases where you are, e.g., building a dictionary from types, you actually want to veer sharply in the direction of type-broadening: it's typically far better to use something like an IdDict{Any,String}, which has @nospecialize'd all of its argument types. This is because Type is essentially unbounded in how many specializations you'd require, and it's a huge source of compilation latency if you end up generating them all.
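To make the "two types" point concrete, here is a minimal sketch. The struct MyType below is only a stand-in for the type under discussion, and the variable names are illustrative:

```julia
struct MyType end  # hypothetical stand-in for the MyType discussed above

plain    = typeof(MyType)        # DataType: the concrete type of the type object
specific = Core.Typeof(MyType)   # Type{MyType}: the most specific dispatch type

println(plain)
println(specific)

# The type object MyType is an instance of both.
println(MyType isa plain, " ", MyType isa specific)

# For values that are not types, Core.Typeof falls back to typeof.
println(Core.Typeof(1) === typeof(1))
```

Every distinct type T has its own Type{T}, which is why that choice can multiply specializations, while DataType is shared by all of them.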
As an example:
tim@diva:/tmp$ juliam --startup-file=no -q
julia> d = Dict{Any,String}()
Dict{Any, String}()
julia> @time for T in subtypes(Any)
           d[Core.Typeof(T)] = string(T)
       end
8.550475 seconds (31.43 M allocations: 1.693 GiB, 3.33% gc time, 98.73% compilation time)
vs. with plain typeof:
tim@diva:/tmp$ juliam --startup-file=no -q
julia> d = Dict{Any,String}()
Dict{Any, String}()
julia> @time for T in subtypes(Any)
           d[typeof(T)] = string(T)
       end
0.275490 seconds (457.37 k allocations: 24.886 MiB, 17.62% gc time, 90.59% compilation time)
It's even better with an IdDict, even when you use Core.Typeof:
tim@diva:/tmp$ juliam --startup-file=no -q
julia> d = IdDict{Any,String}()
IdDict{Any, String}()
julia> @time for T in subtypes(Any)
           d[Core.Typeof(T)] = string(T)
       end
0.134169 seconds (209.96 k allocations: 10.859 MiB, 80.31% compilation time)
In all cases, it's the number of method specializations that drives the difference. Since DataType is just one type, the typeof(T) solution gets you most of the benefit of the IdDict, but you'd really want the IdDict if you were using T itself as the key:
tim@diva:/tmp$ juliam --startup-file=no -q
julia> d = Dict{Any,String}()
Dict{Any, String}()
julia> @time for T in subtypes(Any)
           d[T] = string(T)
       end
8.072373 seconds (31.42 M allocations: 1.692 GiB, 3.62% gc time, 99.22% compilation time)
julia>
tim@diva:/tmp$ juliam --startup-file=no -q
julia> d = IdDict{Any,String}()
IdDict{Any, String}()
julia> @time for T in subtypes(Any)
           d[T] = string(T)
       end
0.134071 seconds (209.84 k allocations: 10.852 MiB, 81.47% compilation time)
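A rough way to see where the difference comes from is to count how many distinct dispatch-level key types each loop presents to the dictionary. This is only a sketch: the helper count_dispatch_types is hypothetical, and it counts candidate specializations rather than measuring what the compiler actually generates.

```julia
using InteractiveUtils  # provides subtypes outside the REPL

# Hypothetical helper: number of distinct dispatch-level types among the keys.
# Core.Typeof(k) is Type{k} when k is itself a type, and typeof(k) otherwise.
count_dispatch_types(keys) = length(unique(Any[Core.Typeof(k) for k in keys]))

Ts = subtypes(Any)

n_T      = count_dispatch_types(Ts)                     # keys are T itself
n_typeof = count_dispatch_types(Any[typeof(T) for T in Ts])  # keys are typeof(T)

println("keys = T:         ", n_T, " distinct dispatch types")
println("keys = typeof(T): ", n_typeof, " distinct dispatch types")
```

With T as the key, each entry brings its own Type{T}; with typeof(T) as the key, only a handful of values such as DataType ever appear, which fits the timings above.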
Thank you for the insight. I understand why the current implementation is optimal and that there are workarounds for my issue. This would just be a nice-to-have if possible 😄
xref #29368
It would be nice to support dispatch on Pairs defined in the form a=>b, for example:
This could potentially be solved by using Core.Typeof in the Pair constructor.
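A minimal sketch of the kind of dispatch in question, assuming a stand-in struct MyType and a hypothetical function describe (neither definition is taken from the original post). It also shows one possible workaround: spelling out the Pair type parameters by hand.

```julia
struct MyType end  # hypothetical stand-in type

# Hypothetical methods: one for the specific Pair type, one generic fallback.
describe(::Pair{Type{MyType},Int}) = "specific"
describe(::Pair) = "generic"

# The => syntax records typeof(MyType), i.e. DataType, as the first parameter,
# so the specific method is not selected.
p = MyType => 1
println(typeof(p), " -> ", describe(p))

# Workaround: construct the Pair with explicit parameters via Core.Typeof.
q = Pair{Core.Typeof(MyType),Int}(MyType, 1)
println(typeof(q), " -> ", describe(q))
```

The explicit form opts in to the more specific type only where it is needed, which avoids the specialization cost discussed in the reply above for every other Pair.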