Closed lassepe closed 2 years ago
From what I understand, this looks like the Dict
used to store members of a protobuf struct? E.g. this? Would you be able to check/put up a more detailed profiler stack to confirm this?
In that case, maybe something more lightweight than a Dict will help here. The count and names of fields is known during code generation, so one way could be to generate specialized code for each struct instead of using Dict.
Yes, the bottleneck is due to the dict that is used to represent the protobuf stuct. So there is not really a quick-fix for this because it would require a fundamentally different way of generating the code.
I created a MWE that also includes some results in the profile_results subdirectory. I'm not sure what else would be a good way to share more information on this because the flame graph is not really helpful unless you interact with it (i.e. hover etc.). I think you can ignore the large chunk on the bottom right in this chart because it is just the overhead that comes from running things in the REPL. The actual workload is the part on the left.
Hi guys, wondering what the patterns in protobuf.jl is to deserialise multiple sub messages in the one message? (not sure if this i should open a separate issue)
I'm seeing that this is not officially part of the protobuf spec https://developers.google.com/protocol-buffers/docs/techniques and couldn't see it in any of the Readme's.
We have the same issue with OpenStreetMapX: https://github.com/pszufe/OpenStreetMapX.jl/pull/44#discussion_r667459615
The count and names of fields is known during code generation, so one way could be to generate specialized code for each struct instead of using Dict.
Is there any technical issue blocking the code of the struct to be generated without using Dict but fields instead?
Technically it should be possible. The point is just that one would have to re-write the entire code-generation logic.
The point is just that one would have to re-write the entire code-generation logic.
Indeed. It's best to try it out by manually doing it for one package. I tried it by manually changing the generated code of OpenStreetMapX, see here for the proof of concept: https://github.com/pszufe/OpenStreetMapX.jl/pull/52 It was quite simple since the PBF specification of OpenStreetMap does not use oneof but it should be doable for oneof as well.
Hi all, we're seeing the same thing here: allocating a small Dict per message is prohibitively expensive.
We remembered that the last time we used ProtoBuf.jl, a few years ago (for JuliaPerf/PProf.jl@d6d525c), it generated a struct
per proto message, instead of an object containing a Dict.
We found that this change from struct to Dict was made as part of https://github.com/juliaio/protobuf.jl/pull/141, which fixed a thread-safety issue regarding the use of meta
. It seems that that PR also changed to a Dict representation, but the PR description doesn't contain any explanation about why we made that change.
@tanmaykm can you elaborate on how we came to the current design with a Dict in each object? Why did we change away from generating a struct per message? We'd be interested to help with implementing any future work here to improve the performance, and we'd like to help get rid of this allocation. 😊
Thanks so much! :)
CC: @dewilson
For this example proto, here's what it looks like before and after #141 Proto:
syntax = "proto3";
message plant {
string name = 3;
}
message seed {
int32 cold_hours = 1;
int32 germination_hours = 2;
plant species = 3;
}
Here's the seed
proto after #141:
mutable struct seed <: ProtoType
__protobuf_jl_internal_meta::ProtoMeta
__protobuf_jl_internal_values::Dict{Symbol,Any}
function seed(; kwargs...)
obj = new(meta(seed), Dict{Symbol,Any}())
values = obj.__protobuf_jl_internal_values
symdict = obj.__protobuf_jl_internal_meta.symdict
for nv in kwargs
fldname, fldval = nv
fldtype = symdict[fldname].jtyp
(fldname in keys(symdict)) || error(string(typeof(obj), " has no field with name ", fldname))
values[fldname] = isa(fldval, fldtype) ? fldval : convert(fldtype, fldval)
end
obj
end
end # mutable struct seed
const __meta_seed = Ref{ProtoMeta}()
function meta(::Type{seed})
ProtoBuf.metalock() do
if !isassigned(__meta_seed)
__meta_seed[] = target = ProtoMeta(seed)
allflds = Pair{Symbol,Union{Type,String}}[:cold_hours => Int32, :germination_hours => Int32, :species => plant]
meta(target, seed, allflds, ProtoBuf.DEF_REQ, ProtoBuf.DEF_FNUM, ProtoBuf.DEF_VAL, ProtoBuf.DEF_PACK, ProtoBuf.DEF_WTYPES, ProtoBuf.DEF_ONEOFS, ProtoBuf.DEF_ONEOF_NAMES)
end
__meta_seed[]
end
end
function Base.getproperty(obj::seed, name::Symbol)
if name === :cold_hours
return (obj.__protobuf_jl_internal_values[name])::Int32
elseif name === :germination_hours
return (obj.__protobuf_jl_internal_values[name])::Int32
elseif name === :species
return (obj.__protobuf_jl_internal_values[name])::plant
else
getfield(obj, name)
end
end
And before:
mutable struct plant <: ProtoType
name::AbstractString
plant(; kwargs...) = (o=new(); fillunset(o); isempty(kwargs) || ProtoBuf._protobuild(o, kwargs); o)
end #mutable struct plant
const __fnum_plant = Int[3]
meta(t::Type{plant}) = meta(t, ProtoBuf.DEF_REQ, __fnum_plant, ProtoBuf.DEF_VAL, true, ProtoBuf.DEF_PACK, ProtoBuf.DEF_WTYPES, ProtoBuf.DEF_ONEOFS, ProtoBuf.DEF_ONEOF_NAMES, ProtoBuf.DEF_FIELD_TYPES)
mutable struct seed <: ProtoType
cold_hours::Int32
germination_hours::Int32
species::plant
seed(; kwargs...) = (o=new(); fillunset(o); isempty(kwargs) || ProtoBuf._protobuild(o, kwargs); o)
end #mutable struct seed
:) We're wondering the same thing! I think it probably is related to trying to support protobuf reflection in a good way, or may have been related to supporting mutually recursive message definitions. Whatever the reason, we'd be happy to help with some group engineering to get to a solution we all love! 😊
I was hoping that some sort of lightweight dict could be used here, but that has not happened yet unfortunately. @NHDaly has rightly pointed out reasons for switching to a dict based representation.
It will sure be great to have folks contribute to this!
Thanks for the explanation, @tanmaykm! :) Makes sense.
Okay, I think we should be able to achieve those goals without having dictionaries in the structs. The proto implementations in other languages have all found ways to work around these, so I think we should be able to do so as well.
For the recursive struct definitions one: I think the standard way to do mutually recursive struct definitions is to introduce a type parameter for one of them.
So for example, given this proto definition:
message RecurseA {
required string a = 1;
optional RecurseB b = 2;
}
message RecurseB {
required string b = 1;
optional RecurseA a = 2;
}
You could generate julia code like this:
struct _RecurseA{__RecurseB}
a::AbstractString
b::__RecurseB
end
struct RecurseB
b::AbstractString
a::_RecurseA{RecurseB}
end
const RecurseA = _RecurseA{RecurseB}
I think we can work through the remaining issues, and go back to doing code generation while still avoiding the thread safety and recursion issues you had before! :)
Thanks Tanmay.
For the field name issue, we now have this syntax:
struct foo
var"invalid field name!!"::Int32
end
Hey folks, I encourage you to take the new version (1.0) of this package for a spin, should be a bit faster. Please see the docs for some guidance on how to migrate to this new version. Closing for now as this issue is a little all over to place, but please, feel free to reopen (or better, start a new issue) if you feel the performance is still bad.
FWIW, I returned to this recently to re-assess the feasibility of using ProtoBuf for another project. The MWE from above now only takes about 4ms on my machine --- a solid 30x improvement.
Thanks, @Drvi and team!
First of all, thank you for making ProtoBuf available in Julia! I am currently using this package to read scenarios form the waymo-open-dataset. While this has worked reliably, I am running into quite some performance issues due the "topology" of the messages.
Some relevant details:
Each of the scenarios consists of approximately 2MB of binary data. However, each scenario contains map features which in turn consist of some meta-data and a number of MapPoints. Each map point only encodes three floats for the x, y, and z position. Thus they are quite small. However, a single scenario can contain > 100.000 MapPoints (spread over the different features of the map).
As a result, loading a single scenario takes about 150ms on my machine. This may not sound all too bad but in my current usecase I need to iterate over about 400.000 of these scenarios; potentially multiple times.
After doing some profiling, I found that 60% of the time is spent pre-allocating memory for all the small dicts that are needed for each
MapPoint
object. Specifically, most of the time is spent in this line of julia base.Just to provide some comparison: The same kind of data is loaded in about 11ms from a json file, and in about 3ms from the binary format created by
Serialization.serialize
. Of course, there are other issues with these kinds of formats. Therefore, I would really like to use ProtoBuf here. But I cannot get it up to speed.Are there known ways to reduce the amount of overhead that arises in this setting with many small proto sub-messages? Or is this issue inherent to the current design of the library?