georust / geozero

Zero-Copy reading and writing of geospatial data.
Apache License 2.0

Consider using pure rust protobuf codegen #118

Closed nyurik closed 11 months ago

nyurik commented 1 year ago

protoc is one of the common ways to parse .proto files, but it comes with a cost: each user of the crate must have it installed in their environment, which can get tricky for embedded and other setups. protobuf-codegen, on the other hand, can do the same thing in pure Rust as part of the build.rs step, without relying on any native libs.

I have used that lib extensively (e.g. in the osm pbf parsing), and I think it might make much more sense for us to use it for MVT parse/gen.

Are there any concerns with switching to protobuf instead of prost? I have not heard of any performance concerns (but have not verified this). cc: @pka

nyurik commented 1 year ago

I have run some tests using the bench-mvt branch (see diff). It seems the protobuf implementation is ~24% slower, so unless there are some really strong reasons to switch, we may as well keep it as is for now...

decode MVT using prost
                        time:   [18.426 µs 18.621 µs 18.831 µs]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

decode MVT using protobuf
                        time:   [22.870 µs 23.115 µs 23.388 µs]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe
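
The ~24% figure follows from the middle value of each Criterion triple above (the point estimate between the lower and upper confidence bounds). A minimal sketch of that arithmetic, where `slowdown_percent` is a hypothetical helper name and not part of the benchmark code:

```rust
/// Relative slowdown of `candidate` versus `baseline`, as a percentage.
fn slowdown_percent(baseline_us: f64, candidate_us: f64) -> f64 {
    (candidate_us - baseline_us) / baseline_us * 100.0
}

fn main() {
    // Point estimates in microseconds, copied from the output above.
    let prost_us = 18.621;
    let protobuf_us = 23.115;
    let pct = slowdown_percent(prost_us, protobuf_us);
    // prints "protobuf is 24.1% slower than prost"
    println!("protobuf is {:.1}% slower than prost", pct);
}
```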

gibbz00 commented 1 year ago

quick-protobuf, which you mentioned in the linked issue, seems like a good candidate :) I saw it being added as a dependency in the bench-mvt branch. Did you end up getting any benchmarks from it too?

nyurik commented 1 year ago

@gibbz00 I tried it too, and if my memory serves me right, quick-protobuf was also slower (you can try them yourself with the bench-mvt branch). I am not certain why this is -- it might actually be due to some extra fields being created by the other libs to store unexpected values (the current lib simply ignores them IIRC).
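
To illustrate the hypothesis above (untested as an explanation for the benchmark gap), here is a toy wire-format decoder that can either skip or store fields its schema does not know. Everything in it is an assumption made for illustration: the `Decoded` type, the `keep_unknown` flag, and the one-field schema are hypothetical and do not reflect how prost, protobuf, or quick-protobuf are implemented.

```rust
/// Result of decoding with a toy schema that only knows field 1 (a varint).
#[derive(Debug, Default, PartialEq)]
struct Decoded {
    known: Option<u64>,
    /// Raw payloads of fields outside the schema; empty when they are skipped.
    unknown: Vec<(u32, Vec<u8>)>,
}

/// Read a base-128 varint at `*pos`, advancing `*pos` past it.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
        if shift >= 64 {
            return None; // malformed: varint longer than 10 bytes
        }
    }
}

/// Decode `buf`; `keep_unknown` selects between storing and skipping
/// fields that the toy schema does not define.
fn decode(buf: &[u8], keep_unknown: bool) -> Option<Decoded> {
    let mut out = Decoded::default();
    let mut pos = 0usize;
    while pos < buf.len() {
        let key = read_varint(buf, &mut pos)?;
        let (field, wire_type) = ((key >> 3) as u32, key & 7);
        if field == 1 && wire_type == 0 {
            out.known = Some(read_varint(buf, &mut pos)?);
            continue;
        }
        // Unknown field: find where its payload ends from the wire type.
        let start = pos;
        match wire_type {
            0 => {
                read_varint(buf, &mut pos)?;
            }
            1 => pos += 8, // fixed 64-bit
            2 => {
                let len = read_varint(buf, &mut pos)? as usize;
                pos = pos.checked_add(len)?;
            }
            5 => pos += 4, // fixed 32-bit
            _ => return None, // groups and invalid types are not handled here
        }
        if pos > buf.len() {
            return None; // truncated input
        }
        if keep_unknown {
            // The extra work: one allocation and copy per unknown field.
            out.unknown.push((field, buf[start..pos].to_vec()));
        }
    }
    Some(out)
}

fn main() {
    // Field 1 = varint 150, then field 2 = length-delimited "hi".
    let msg = [0x08, 0x96, 0x01, 0x12, 0x02, b'h', b'i'];
    println!("{:?}", decode(&msg, false));
    println!("{:?}", decode(&msg, true));
}
```

With `keep_unknown` set to `false`, field 2 is stepped over without allocating. With `true`, its raw bytes are copied into `unknown`. Whether this kind of bookkeeping accounts for the measured difference would need profiling on the bench-mvt branch.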