NHDaly opened this issue 2 years ago
I'm wondering: would it be acceptable to put a return type annotation and/or return-site type assertion on all these methods asserting that they return a UInt64?
I am not sure that helps. For example:

```julia
struct S1 end
struct S2 end
struct S3 end
struct S4 end
struct S5 end

for s in [:S1, :S2, :S3, :S4, :S5]
    @eval g(x::$s) = 0
end

f(x) = g(x[])

code_warntype(f, Tuple{Ref{Any}})
```

shows a return value of `Any`, even though all methods are concretely inferred (`code_warntype(g, Tuple{Any})`). So I think the problem is that there are too many methods for `hash`, and type-asserting them does not help.
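For what it's worth, even putting a return-type declaration on every method of `g` doesn't rescue inference here. A hedged sketch (reusing the toy `g`/`f` from above; the cutoff in question is inference's `max_methods` heuristic):

```julia
struct S1 end; struct S2 end; struct S3 end; struct S4 end; struct S5 end

# Every method now declares (and converts to) an Int return type:
for s in [:S1, :S2, :S3, :S4, :S5]
    @eval g(x::$s)::Int = 0
end

f(x) = g(x[])

# Inference still gives up at the dynamic call: once more methods match than
# the max_methods limit, the call site is inferred as Any, regardless of the
# declared return type of each individual method.
Base.return_types(f, Tuple{Ref{Any}})
```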
Gotttttttttcha. Okay thanks, that makes sense! 👍

It's too bad, because we could even still assert a `::Int` at the call site, since we know for suresies that we're gonna get `0` back, but I don't think that would help either.. It still does the allocation and then just does a type assert.

🤔 It would be great if there was like a "dispatch-lite" where, if you know the type at the callsite, you could maybe still do dynamic dispatch, but call the `julia_g_` version of the function instead of the `jlfptr_g_` version that allocates? 🤔 Like `jl_apply_generic_with_asserted_type(f, args, asserted_type)`, and it would check that the method it matches will return the type you expect before calling it? Is that crazy?
Are the subtypes of the abstract element type known upfront? In that case, you could call out to a manually union split function which should be type stable since it doesn't include any dynamic calls. Something like https://github.com/jlapeyre/ManualDispatch.jl.
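A hedged sketch of what such a manual union split could look like (the toy types `S1`/`S2`/`S3` and `g` stand in for the real element types; this is not ManualDispatch.jl's actual code):

```julia
struct S1 end; struct S2 end; struct S3 end
g(x::S1) = 0
g(x::S2) = 0
g(x::S3) = 0

# Manually union-split dispatch: inside each branch x has a concrete type,
# so every call to g is a static dispatch, and inference sees a concrete
# Int return type with no dynamic calls.
function g_split(x)
    if x isa S1
        return g(x)
    elseif x isa S2
        return g(x)
    elseif x isa S3
        return g(x)
    else
        throw(ArgumentError("unsupported element type $(typeof(x))"))
    end
end

f_split(r) = g_split(r[])
```

The obvious limitation is that the set of subtypes has to be known (and spelled out) upfront, which is what the question above is getting at.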
We don't really have an optimization right now for "Do a dynamic dispatch, but all possible specializations are known to return the same type". If we did, you could insert a specialization barrier like:
```diff
diff --git a/base/abstractarray.jl b/base/abstractarray.jl
index 1b21201ffa..0e5ffce624 100644
--- a/base/abstractarray.jl
+++ b/base/abstractarray.jl
@@ -3002,6 +3002,7 @@ pushfirst!(A, a, b, c...) = pushfirst!(pushfirst!(A, c...), a, b)
 ## hashing AbstractArray ##
 
 const hash_abstractarray_seed = UInt === UInt64 ? 0x7e2d6fb6448beb77 : 0xd4514ce5
+hash_known_uint(x, h::UInt) = hash(x, h)::UInt
 function hash(A::AbstractArray, h::UInt)
     h += hash_abstractarray_seed
     # Axes are themselves AbstractArrays, so hashing them directly would stack overflow
@@ -3012,7 +3013,7 @@ function hash(A::AbstractArray, h::UInt)
     # For short arrays, it's not worth doing anything complicated
     if length(A) < 8192
         for x in A
-            h = hash(x, h)
+            h = hash_known_uint(x, h)
         end
         return h
     end
@@ -3048,7 +3049,7 @@ function hash(A::AbstractArray, h::UInt)
         n += 1
         # Hash the element
         elt = A[keyidx]
-        h = hash(keyidx=>elt, h)
+        h = hash_known_uint(keyidx=>elt, h)
         # Skip backwards a Fibonacci number of indices -- this is a linear index operation
         linidx = key_to_linear[keyidx]
```
And get a specialized dynamic dispatch for `hash_known_uint` that wouldn't need to allocate its return value. As it stands though, whenever you have a dynamic dispatch, you need to allocate the return value in order to fit through `jl_apply_generic`, which has a `jl_value_t *` return.

That said, it could definitely be implemented, but it's a bit of work.
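That boxing can be observed directly. A small sketch (the helper names are made up; the point is just that the only allocation at the warmed-up call site is the boxed `UInt` return):

```julia
# A type-unstable call site: r[] is inferred as Any, so the call to hash goes
# through jl_apply_generic, and its UInt return value must be boxed on the heap.
unstable_hash(r) = hash(r[], UInt(0))

function measure_alloc()
    r = Ref{Any}(1)
    unstable_hash(r)                  # warm up (compile everything first)
    return @allocated unstable_hash(r)
end
```

After warmup, `measure_alloc()` should report a small nonzero number of bytes, corresponding to the boxed `UInt` (16 bytes on a 64-bit build, matching the "1 allocation: 16 bytes" seen in the benchmarks later in this thread).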
Would something like #40880 help with this? That is, would it be possible for the compiler to do the following inference:

- `f` returns `Any`
- the result of `f` is passed to `g`, which only accepts `UInt`
- so insert a type check on the result of `f`, throwing if it's not a `UInt`, and if it is, we can store it on the stack since we know the type to be `UInt` for sure.

I may be talking out of my ass here.
Yes, this is a case where reverse dataflow could be helpful. It's a bit hard to use the information though --- we can't literally insert a typeassert, because if `f` doesn't return a `UInt` we need to give a method error for `g` instead of a type error.
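To illustrate the distinction (a hedged toy example, not the compiler's actual transform): when `f` returns the wrong type, calling `g` should raise a `MethodError`, whereas a naively inserted `typeassert` would raise a `TypeError` instead:

```julia
f() = "oops"          # stands in for a call inferred as returning Any
g(h::UInt) = h + 1

# What the user should see: a MethodError, because no method of g matches.
err1 = try
    g(f())
    nothing
catch e
    e
end

# What a blindly inserted typeassert would produce instead: a TypeError.
err2 = try
    f()::UInt
    nothing
catch e
    e
end
```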
Manually adding type declarations in these cases can still be useful though since it will at least speed up subsequent operations. I think it's worth doing for containers likely to be used with abstract element types, e.g. Vector and Dict.
We do all that already (with inserted typeassert and MethodError), we just don't have the complicated version where it also manages to avoid the allocation by shifting that error around in somewhat complicated ways.
An update here: the hash allocs saga continues. (We still continue to see `UInt` allocations from `hash` showing up at RAI a lot.)

I had a clever (I think?) idea yesterday to avoid the allocation on every element: switch the dynamic dispatch to a function that doesn't have any return value, and then allow that function to make a static dispatch to `hash`.
```julia
function my_hash(vec::Vector, h::UInt)
    h += Base.hash_abstractarray_seed
    h = hash(map(first, axes(vec)), h)
    h = hash(map(last, axes(vec)), h)
    # Use a Ref to avoid allocations from the return value of hash():
    ref = Ref{UInt}(h)
    for e in vec
        hash_into!(ref, e)::Nothing
    end
    h = ref[]::UInt
    return h::UInt
end

function hash_into!(ref::Ref{UInt}, v::T) where T # (force specialization)
    ref[] = my_hash(v, ref[])
    return nothing
end

my_hash(a, h::UInt) = hash(a, h)
```
It seems this helps a lot! Even in the case with very small data / a single element:
```julia-repl
julia> v = Any[i for i in 1:1_000];

julia> @btime hash($v, $(UInt(0)))
  16.552 μs (1001 allocations: 15.64 KiB)
0x0e97df552a8d635e

julia> @btime my_hash($v, $(UInt(0)))
  9.950 μs (1 allocation: 16 bytes)
0x0e97df552a8d635e

julia> v = Any[i for i in 1:1_000_000];

julia> @btime hash($v, $(UInt(0)))
  26.136 ms (330659 allocations: 5.68 MiB)
0xb5d91dc36a23d5be

julia> @btime my_hash($v, $(UInt(0)))
  10.870 ms (1 allocation: 16 bytes)
0x3eddf02e950af97c

julia> @btime hash($(Any[1]), $(UInt(0)))
  28.156 ns (2 allocations: 32 bytes)
0xc7253bdde85cb9f5

julia> @btime my_hash($(Any[1]), $(UInt(0)))
  20.572 ns (1 allocation: 16 bytes)
0xc7253bdde85cb9f5
```
(For 1,000,000 elements it's faster even without the optimizations that `Base.hash` has for very large vectors, which is the reason for the different return value in that case.)
So:

1. Would there be interest in changing `hash()` for container types to use this approach? (Maybe only when their eltype is an abstract type.)
2. Is there any way to get rid of the allocation of the `Ref`? I know for a fact that it won't escape, so it could be stack allocated. But it seems that escape analysis isn't able to determine that through the dynamic dispatch? 😢

Thanks! :)
And assuming that we cannot get rid of the `Ref` alloc, there is still a bit more improvement to be made here for nested containers, by adding `hash_into!()` methods for container types that pass through the `Ref` rather than allocating another one at each level:
```julia-repl
julia> v = Any[ Any[i] for i in 1:1_000 ];

julia> @btime hash($v, $(UInt(0)))
  37.920 μs (2001 allocations: 31.27 KiB)
0x5bb22512b3f4458e

julia> @btime my_hash($v, $(UInt(0)))
  30.817 μs (1001 allocations: 15.64 KiB)
0x5bb22512b3f4458e

julia> # Add a custom method to pass-through the ref instead of allocating a new one
       function hash_into!(ref::Ref{UInt}, vec::Vector{T}) where T # (force specialization)
           ref[] += Base.hash_abstractarray_seed
           ref[] = hash(map(first, axes(vec)), ref[])
           ref[] = hash(map(last, axes(vec)), ref[])
           for e in vec
               hash_into!(ref, e)::Nothing
           end
           return nothing
       end
hash_into!

julia> @btime my_hash($v, $(UInt(0)))
  23.888 μs (1 allocation: 16 bytes)
0x5bb22512b3f4458e
```
(And we can then of course define `my_hash()` in terms of `hash_into!()`, so we don't have to repeat ourselves):

```julia
function my_hash(vec::Vector, h::UInt)
    # Use a Ref to avoid allocations from the return value of hash():
    ref = Ref{UInt}(h)
    hash_into!(ref, vec)
    return ref[]::UInt
end
```
This approach would basically mean adding one of these `hash_into!()` implementations for all containers of abstract types. But we'd probably want that anyway, even in the previous post's version, where we are able to eliminate the `Ref` alloc.
See discussion here, from JuliaCon hackathon: https://github.com/JuliaLang/julia/pull/50136#issuecomment-1656811849
We also talked about the general problem in https://github.com/JuliaLang/julia/issues/44244.
- Using something like `hash_into!(::Ref, v)` seems like a decent workaround.
- But it'd be nicer to fix this in the language, via something like the above proposal.
- We'd really also need to fix the return type allocations... For that we have a few possibilities:
  1. function-level return type declarations (`function hash::UInt end`)
  2. preallocating a return value slot on the stack, via call-site return type assertions
  3. ... there was a variant on 2. that I don't quite remember.
Hi!

Here's a very small MRE that reproduces the issue I'll discuss below:

And you can see that all those allocations are UInts:

We've been using the new allocations profiler on our codebase, and we've found that on some benchmarks we are spending roughly 30% of allocations on this method instance: `hash(::Vector{RelationalAITypes.DBType}, ::UInt64)`.

You can see that of these allocations, roughly 50% are allocating the UInt64s of the results of iterating the vector. Here's the reported allocation locations:

You can see that most of the direct allocations come from the call to `hash(elt, h)`.

I know that this is because there is a type instability there, since the Vector is a vector of abstract elements. But my question is: even if you don't know the type of the element, why can't Julia deduce the return type to be a `UInt` in this case? And if it could, would that be enough to avoid allocating the return value?
I see that currently roughly 1/3 of the `hash()` methods in Base cannot infer their return type. Here's a small sample of the ones that it reports with return type `Any`:

I'm wondering: would it be acceptable to put a return type annotation and/or return-site type assertion on all these methods asserting that they return a UInt64?
One plausible explanation might be that we don't require `hash(x, h)` to return a `UInt`, but I don't think that's true, because if a hash function returns anything other than a `UInt`, it will cause a MethodError when the returned hash value is passed along as the seed to the next call to `hash()`! Consider this example:

So it seems to me that there is an (implicit) requirement in the current API that `Base.hash` methods must return a `UInt64`. So my question is: can we then take advantage of that requirement to provide a hint to the compiler that can eliminate the allocations in the presence of this type instability?
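A hedged illustration of that implicit contract (`BadHash` is a made-up type; this relies on Base's array hashing feeding each element's hash result back in as the seed for the next element):

```julia
struct BadHash end

# Deliberately violate the implicit contract: return an Int instead of a UInt.
Base.hash(::BadHash, h::UInt) = 1

# Hashing the value by itself "works" (you just get the Int back), but inside
# a container the Int is then passed as the seed to the next hash(x, h::UInt)
# call, which has no matching method — so you get a MethodError.
err = try
    hash(Any[BadHash(), 2], UInt(0))
    nothing
catch e
    e
end
```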