djkoloski / rust_serialization_benchmark

Benchmarks for rust serialization frameworks

Help with adding ε-serde #55

Closed vigna closed 2 weeks ago

vigna commented 6 months ago

We would like to put together a PR adding ε-serde to the suite. Is there some documentation on how to do that, or can you give us some basic guidance and we start from there?

djkoloski commented 6 months ago

Here's a brief summary of what you need to do to add a new framework to these benchmarks:

  1. Add your framework to Cargo.toml. It should pin itself to an exact version (e.g. 1.2.3) and set optional = true.
  2. Choose which datasets you can support. The datasets are located in src/datasets.
  3. For each dataset, add ser/de support for the structs in mod.rs. If your framework has a derive, add it with #[cfg_attr(feature = "my_framework", derive(my_framework::Serialize))]. If your framework has a custom schema format, you'll have to modify the build.rs to compile Rust code from your schema format.
  4. Add a bench function to src. This should be in a module named bench_my_framework, and you'll need to also add it as a pub mod in src/lib.rs.
  5. Add your bench function to bench.rs for each dataset you support.
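
As a rough sketch of steps 3 and 4, assuming a hypothetical framework crate called my_framework with a Serialize derive: the Vector3 struct, the stand-in serialize function, and the hand-rolled timing loop below are illustrative only, and the real suite uses its own dataset types and benchmark harness.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Step 3: the derive is applied only when the (hypothetical) `my_framework`
// feature is enabled, so the struct compiles unchanged without it.
#[cfg_attr(feature = "my_framework", derive(my_framework::Serialize))]
#[derive(Clone, Debug, PartialEq)]
pub struct Vector3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

// Stand-in for the framework's serializer (assumption: three little-endian
// f32 values per vector, no header).
pub fn serialize(data: &[Vector3]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() * 12);
    for v in data {
        for field in [v.x, v.y, v.z] {
            out.extend_from_slice(&field.to_le_bytes());
        }
    }
    out
}

// Step 4: a bench function that would live in a module such as
// `bench_my_framework`. It times `iterations` serializations of `data`.
pub fn bench(name: &str, data: &[Vector3], iterations: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the optimizer from discarding the work.
        black_box(serialize(black_box(data)));
    }
    let elapsed = start.elapsed();
    println!("{name}/my_framework/serialize: {elapsed:?} total");
    elapsed
}
```

Step 5 would then amount to calling such a bench function from bench.rs once per supported dataset.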

Abomonation is alphabetically first and supports all of the benchmark suites, so it's a good example of what you need in your bench function. I'd recommend looking at a bench function for a framework that's similar to yours and basing it on that one.

If you have any other questions I'd be happy to answer them!

kitsuniru commented 6 months ago

@vigna I tried to implement epserde in this benchmark.

It can't be done via derive because of:

proc-macro derive panicked
message: not yet implemented: Missing implementation for union, enum and tuple types

derive definition:

#[cfg_attr(feature = "epserde", derive(epserde::Epserde))]

vigna commented 6 months ago

Ok. By any chance, do you know whether the problem is due to a union, enum, or tuple type?

kitsuniru commented 6 months ago

@vigna It's due to an enum type (context: the error is raised for the EntityType and GameType enums).

vigna commented 6 months ago

I see. We'll try to implement enums and get back to you!

vigna commented 5 months ago

@hot-moms: we just implemented enums in the current version on GitHub (which uses the current derive library by picking it up directly with a path). Do you need an official version on crates.io to continue with the implementation? We can publish one, but it might be better if you could try the current GitHub version first, as we may need to adjust it depending on your feedback. Thoughts?

kitsuniru commented 5 months ago

@vigna as @djkoloski said above:

It should pin itself to an exact version (e.g. 1.2.3)

But as a draft, I can try; you'll have to wait a little bit, though.

vigna commented 5 months ago

Ok, you can use epserde 0.3.0 and it should work with enums.

vigna commented 4 months ago

Did you make any progress? Can we help in any way?

vigna commented 3 months ago

Gentle ping :).

kitsuniru commented 3 months ago

@vigna, sorry, I have no time for this. Try following what @djkoloski wrote here: https://github.com/djkoloski/rust_serialization_benchmark/issues/55#issuecomment-1786325870

Those are exactly the steps I followed when implementing some serialization systems, including ε-serde.

vigna commented 3 months ago

Ok, I have completed a first implementation of the mk48 test here.

vigna commented 3 months ago

I'm trying to understand the meaning of "access" and "read". Probably their meaning is linked to some unspecified decomposition of the actions a zero-copy framework performs. For example, I can see that "access" for rkyv is <1ns, so it's doing nothing. I don't even know what access is in my case—you get a Rust object upon deserialization and that's it.

There is a gotcha—I had to add parameters to the data structures, with a default equal to the current type, as ε-serde needs to be able to replace the types. This is not a problem with almost all frameworks, but a couple (I still have to check one by one) do not welcome the idea. I could make the variant of the data structure depending on the feature "epserde", so that you can manually benchmark ε-serde with other frameworks accepting parameters. It wouldn't be part of the default.
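
A minimal sketch of what "parameters with a default equal to the current type" can look like. The Mesh and Triangle definitions and the helper functions here are hypothetical and do not use the actual ε-serde API; they only show that a defaulted type parameter leaves existing uses of the type unchanged while allowing the field type to be replaced.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Triangle {
    pub v0: [f32; 3],
    pub normal: [f32; 3],
}

// The container type is a parameter whose default is the type used today,
// so code that names plain `Mesh` keeps meaning `Mesh<Vec<Triangle>>`.
#[derive(Debug, PartialEq)]
pub struct Mesh<T = Vec<Triangle>> {
    pub triangles: T,
}

// Works for the owned form and for a borrowed view alike.
pub fn triangle_count<T: AsRef<[Triangle]>>(mesh: &Mesh<T>) -> usize {
    mesh.triangles.as_ref().len()
}

// Builds an owned mesh; the return type relies on the default parameter.
pub fn owned_mesh(n: usize) -> Mesh {
    let triangle = Triangle { v0: [0.0; 3], normal: [0.0, 0.0, 1.0] };
    Mesh { triangles: vec![triangle; n] }
}

// Same struct with the field type replaced by a slice reference, which is
// the kind of substitution a zero-copy deserializer needs to make.
pub fn borrow(mesh: &Mesh) -> Mesh<&[Triangle]> {
    Mesh { triangles: mesh.triangles.as_slice() }
}
```

Frameworks whose derives or generated code do not accept generic parameters are the ones that would object to this shape.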

djkoloski commented 3 months ago

The access and read benchmarks are zero-copy specific. "Access" measures how long it takes to provide access to some zero-copy data. Validation overhead may take some time, so that gets measured here. "Read" measures how long it takes to read some zero-copy data. This usually just entails sending the read data to a black_box call so the read doesn't get optimized out. I'll document this in a more accessible way.
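
To make the distinction concrete, here is a toy sketch. The "format" (a headerless buffer of little-endian f32 values) and the names FloatsView, access, and read are assumptions made for illustration, not the suite's actual code.

```rust
use std::hint::black_box;

// Zero-copy view over a byte buffer; nothing is copied or allocated.
pub struct FloatsView<'a> {
    bytes: &'a [u8],
}

// "Access": everything needed before the data can be used in place.
// Here the only validation is a length check.
pub fn access(bytes: &[u8]) -> Option<FloatsView<'_>> {
    if bytes.len() % 4 == 0 {
        Some(FloatsView { bytes })
    } else {
        None
    }
}

impl<'a> FloatsView<'a> {
    // Decodes values lazily, straight from the underlying bytes.
    pub fn iter(&self) -> impl Iterator<Item = f32> + '_ {
        self.bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
    }
}

// "Read": walk the accessed data and hand each value to black_box so the
// loads are not optimized out. Returns the number of values read.
pub fn read(view: &FloatsView<'_>) -> usize {
    let mut count = 0;
    for value in view.iter() {
        black_box(value);
        count += 1;
    }
    count
}
```

In this sketch an "access" benchmark would time only the call to access, and a "read" benchmark would time only the call to read on an already accessed view.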

I looked at your linked repo and didn't see the data structure parameters you mentioned. Are those pushed?

vigna commented 3 months ago

I forgot to tell you—you have to look at the "epserde" branch.

Yes, but once again, "read" measures how long it takes... from where? Raw bytes? Fully deserialized object? Partially deserialized object?

Presently I have a test called "read (from deser)" that reads the data from a deserialized source. Some benchmarks include deserialization time in read time, others don't. I think it would be better not to have deser operations in that benchmark, because they dwarf the scanning time.

Another issue for me is that frameworks using relative pointers or some other dynamic relocation technique will be unaffected by an iteration test. A random-access test would probably be more discriminative.
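
A sketch of what a random-access read could look like next to a sequential one. The function names and the small linear congruential generator used to produce a reproducible index sequence are assumptions made for illustration; nothing like this is in the suite as described in this thread.

```rust
use std::hint::black_box;

// Deterministic pseudo-random indices in 0..len from a 64-bit linear
// congruential generator, so every framework sees the same access pattern.
// `len` must be nonzero.
pub fn random_indices(len: usize, count: usize, seed: u64) -> Vec<usize> {
    assert!(len > 0, "cannot index into an empty collection");
    let mut state = seed;
    (0..count)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            ((state >> 33) as usize) % len
        })
        .collect()
}

// Sequential scan: friendly to prefetching and to pointer enumeration.
pub fn read_sequential(data: &[f32]) -> f32 {
    let mut sum = 0.0;
    for &value in data {
        sum += black_box(value);
    }
    sum
}

// Random access: each lookup has to resolve an element's position anew.
pub fn read_random(data: &[f32], indices: &[usize]) -> f32 {
    let mut sum = 0.0;
    for &i in indices {
        sum += black_box(data[i]);
    }
    sum
}
```

The index vector would be generated outside the timed section so that only the lookups are measured.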

vigna commented 3 months ago

Another major issue that makes me feel like I'm walking on eggshells is that there is no standard for reading and accessing memory. For example, the read test of rkyv for mesh gives

mesh/rkyv/read (unvalidated)
                        time:   [38.796 µs 38.809 µs 38.824 µs]

Wow. That's... faster than the speed of light. Abomonation and ε-serde are at 100 µs, and they're just scanning memory; rkyv even has pointer indirection to handle. How's that possible?

Use the Source, Luke.

        |mesh| {
            for triangle in mesh.triangles.iter() {
                black_box(&triangle.normal);
            }
        },

This is the read test function for rkyv. Note the ampersand. This test is scanning how fast you can enumerate pointers, not how fast you can access data. It is easy to see: replace with

        |mesh| {
            for triangle in mesh.triangles.iter() {
                black_box(triangle.v0.x);
                black_box(triangle.v0.y);
                black_box(triangle.v0.z);
            }
        },

which is what the other tests are doing (accessing an entire vector), and, boom:

mesh/rkyv/read (unvalidated)
                        time:   [118.12 µs 118.20 µs 118.27 µs]

These ampersands are spread a bit here and there—there should be some automated way to check that all benchmarks are measuring the same thing.
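
One possible way to enforce this, sketched under the assumption that each framework's triangle type can implement a small shared trait: the TriangleView trait and the read_mesh function are hypothetical names, not part of the suite. With a single generic read routine, every framework passes the same values to black_box.

```rust
use std::hint::black_box;

// Accessor that returns the vertex by value, so a reference can never be
// what ends up in black_box.
pub trait TriangleView {
    fn v0(&self) -> [f32; 3];
}

// The one read routine shared by all frameworks. Returns how many
// triangles were visited.
pub fn read_mesh<'a, T, I>(triangles: I) -> usize
where
    T: TriangleView + 'a,
    I: IntoIterator<Item = &'a T>,
{
    let mut visited = 0;
    for triangle in triangles {
        let [x, y, z] = triangle.v0();
        black_box(x);
        black_box(y);
        black_box(z);
        visited += 1;
    }
    visited
}

// Plain in-memory triangle standing in for a framework's own type.
pub struct Triangle {
    pub v0: [f32; 3],
}

impl TriangleView for Triangle {
    fn v0(&self) -> [f32; 3] {
        self.v0
    }
}
```

Each framework would then only supply a TriangleView implementation for its archived or deserialized type, and the benchmark body itself would no longer vary.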

djkoloski commented 3 months ago

I forgot to tell you—you have to look at the "epserde" branch.

I was; when I looked, it was only one commit ahead. It's now three commits ahead, so I'll take another look.

Yes, but once again, "read" measures how long it takes... from where? Raw bytes? Fully deserialized object? Partially deserialized object?

It's "read" the fastest way your framework can. There's no use benchmarking reads from a fully deserialized object because all fully deserialized objects should have the same performance (as they're all the same types with the same properties). For most frameworks this means reading through a zero-copy view of the data. The particular method is not specified because nobody really cares how you read out the data, just that it's representative of how your framework actually behaves.

This test is scanning how fast you can enumerate pointers, not how fast you can access data.

One could argue that enumerating the pointers is what we want to measure. Reading a vertex out of a reference costs the same amount regardless of the framework, so why measure it? The purpose of the read benchmark is to highlight the different access strategies for different frameworks. A deserialization framework with a nontrivial access method for data should be penalized compared to a framework with a more trivial one. In terms of standards, this would mean that every framework should pass a reference to some data to a black_box as opposed to reading the value out and passing that.

That's just one perspective on the issue. I think a benchmark that doesn't read data shouldn't really be called read, so perhaps traverse would be better. If you'd like to push on this, I'd be happy to review a PR that standardizes the read benchmarks, and/or one that adds a traverse benchmark. As per the README:
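
The two candidate benchmarks might be separated as in the sketch below. The names traverse and read follow the suggestion above, while the Triangle type and function bodies are assumptions made for illustration.

```rust
use std::hint::black_box;

#[derive(Clone, Copy)]
pub struct Triangle {
    pub normal: [f32; 3],
}

// "Traverse": only a reference to each element reaches black_box, so this
// measures how quickly the framework can locate every element.
pub fn traverse(triangles: &[Triangle]) -> usize {
    let mut visited = 0;
    for triangle in triangles {
        black_box(&triangle.normal);
        visited += 1;
    }
    visited
}

// "Read": the values themselves are loaded and passed to black_box, so
// memory access is part of the measurement. Returns the component sum.
pub fn read(triangles: &[Triangle]) -> f32 {
    let mut sum = 0.0;
    for triangle in triangles {
        for component in triangle.normal {
            sum += black_box(component);
        }
    }
    sum
}
```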

These benchmarks are still being developed and pull requests to improve benchmarks are welcome.

vigna commented 3 months ago

It's "read" the fastest way your framework can. There's no use benchmarking reads from a fully deserialized object because all fully deserialized objects should have the same performance (as they're all the same types with the same properties).

That's not true. Have a look at the Zerovec documentation (that's one of the most popular zero-copy frameworks, but it's not included in your benchmarks).

That's just one perspective on the issue. I think a benchmark that doesn't read data shouldn't really be called read, so perhaps traverse would be better. If you'd like to push on this, I'd be happy to review a PR that standardizes the read benchmarks, and/or one that adds a traverse benchmark. As per the README:

Changing names to match actual behavior is definitely a way to go.

djkoloski commented 3 months ago

That's not true.

Please file a separate issue.

djkoloski commented 2 weeks ago

Closing this since help has been provided.