Open NHDaly opened 2 years ago
For more exposition - this is more or less the code that I expected the original example to generate (except that I would have expected the type tag to be multiplexed into the String's pointer, so the total size would be 16 bytes not 24). As you can see, it does not allocate when constructing the returned struct:
julia> struct TaggedUIntOrString
i::UInt64
s::String
isint::Bool # true if `i`, false if `s`
TaggedUIntOrString(i::Number) = new(i, "", true)
TaggedUIntOrString(s::AbstractString) = new(0, s, false)
end
julia> TaggedUIntOrString(5)
TaggedUIntOrString(0x0000000000000005, "", true)
julia> TaggedUIntOrString("Hello")
TaggedUIntOrString(0x0000000000000000, "Hello", false)
julia> function make_str(x)
# For small strings, stick the data into an inline UInt (the goal is to avoid
# allocations, but it surprisingly still allocates).
if sizeof(x) < 8
v = UInt64(0)
for (i,c) in enumerate(codeunits(x))
v |= (UInt64(c) << (i*sizeof(c)*8))
end
return TaggedUIntOrString(v)
else
return TaggedUIntOrString(string(x))
end
end
make_str (generic function with 1 method)
julia> using BenchmarkTools
julia> @btime make_str("hi") # Hooray, no allocation!
10.307 ns (0 allocations: 0 bytes)
TaggedUIntOrString(0x0000000000696800, "", true)
julia> @btime make_str("this is a really long string, and it gets passed-in directly, so no allocs i guess")
10.959 ns (0 allocations: 0 bytes)
TaggedUIntOrString(0x0000000000000000, "this is a really long string, and it gets passed-in directly, so no allocs i guess", false)
julia> @btime make_str(@view("hi"[1:2]))
28.511 ns (0 allocations: 0 bytes)
TaggedUIntOrString(0x0000000000696800, "", true)
julia> @btime make_str(@view("but like, what is going on my dude"[1:20]))
36.557 ns (1 allocation: 48 bytes)
TaggedUIntOrString(0x0000000000000000, "but like, what is go", false)
Also, weirdly, it seems that the original struct definition has a reported size of 8 bytes
, which I think is consistent with the fact that julia is planning to box the UInt or String:
julia> struct UnionUIntOrString
v::Union{UInt64,String}
end
julia> sizeof(UnionUIntOrString)
8
I expected it to be more similar to this Union{UInt64,Nothing}
, which is 16 bytes because it needs to keep a separate boolean tag bit because every bit of the UInt64 is used so there's no where to multiplex (whereas a Union{String,Nothing}
can be only 8 bytes since it can represent the nothing via a sentinal value in the string pointer):
julia> struct UnionUIntOrNothing
v::Union{UInt64,Nothing}
end
julia> sizeof(UnionUIntOrNothing)
16
julia> struct UnionStringOrNothing
v::Union{String,Nothing}
end
julia> sizeof(UnionStringOrNothing)
8
I've boiled down this surprising behavior into an MRE as follows.
I'm surprised that this function is boxing and allocating the UInt for this small union. A simpler function, which I'll show later in this post, doesn't allocate as expected. Similarly, a non-union struct doesn't allocate. So i'm not sure what it is about the logic in this function that forces the Int to get boxed?
I think this seems like an area for improvement, which is why i'm posting this as an issue. Thanks! :)
The MRE:
@code_llvm
says that it's boxing the UInt for some reason:Full LLVM IR here: https://gist.github.com/NHDaly/84b685d302dd77e67dbbe56cf3f2fe06
...
As you can see, with a simpler function, there is no allocation for the UInt case (although weirdly now there's two allocs for the string case? 🤔):
Finally, when there's no Union, we don't need to box the UInt for the returned struct:
Can anyone help me understand what's happening here, and whether this could be improved?
My understanding was that we essentially need to store 8 bytes either way: for the UInt or for the String's pointer, and so my expectation was that we wouldn't need to allocate the UInt, and the compiler would essentially include a boolean alongside the value in the struct's definition which acts like a tag and indicates how to interpret the value.
Is there a reason that isn't working in this case? Thank you!