AnneKitsune closed this pull request 3 years ago
The graphs are unreadable in GitHub dark mode - could you add a background instead of leaving it transparent?
I'm not totally sure if criterion has a config for this, but if I manage to find one, I'll add it!
I created an issue here: https://github.com/bheisler/criterion.rs/issues/469
I don't think that should stop this PR from being merged, as that can always be changed later.
I'm in favor of this change. The graphs help visualize the differences and automation is a huge plus. The biggest downside is the loss of "additional metadata" about "specialized" benchmarks. I think that sort of thing needs to be captured to make the comparisons meaningful and honest. Imo we should sort out a way to encode this information before removing the current system (which is already not good enough).
Eh I should have cloned the repo and looked at all of the graphs first. The "specialized" tests are displayed in the graphs.
I guess that's good enough for now. Imo we should include more variants, and documentation for those variants, for each test, but that's not something to block this PR on :smile:
I also agree that some benchmarks are very misleading. Things like legion's batch entity creation via world.extend allow for massive performance gains compared to other ECS libraries that can't, or choose not to, support batch methods.
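To make the batched/unbatched gap concrete, here is a minimal sketch of the two insertion shapes against legion's public API (struct names and counts are illustrative, not the exact bench code in this repo):

```rust
use legion::World;

#[derive(Clone, Copy, Default)]
struct Position(f32, f32, f32);
#[derive(Clone, Copy, Default)]
struct Velocity(f32, f32, f32);

fn main() {
    // Unbatched: one push per entity, paying per-entity archetype bookkeeping.
    let mut unbatched = World::default();
    for _ in 0..10_000 {
        unbatched.push((Position::default(), Velocity::default()));
    }

    // Batched: a single extend call inserts the whole block of entities at once,
    // which is where the large wins in the insertion benchmarks come from.
    let mut batched = World::default();
    batched.extend(vec![(Position::default(), Velocity::default()); 10_000]);
}
```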
This is a big enough change to the "result reporting" that we should get at least one more approval before merging. @TomGillen / @Ralith, can one or both of you take a look?
I'll try to have a look at this tomorrow; thanks for your work so far!
> I also agree that some benchmarks are very misleading.
For the sake of discussion, I (from an obviously biased perspective) kinda disagree with this. The README gives a clear specification of the benchmarks, and it's one where batched operations are applicable. If an ECS chooses not to optimize for that case, that's fine, but it's still an important, realistic case, at least as far as anyone's taken it so far--nobody should be getting a free pass for sacrificing it. If unbatchable workloads are underrepresented, that's a different problem.
> For the sake of discussion, I (from an obviously biased perspective) kinda disagree with this. The README gives a clear specification of the benchmarks, and it's one where batched operations are applicable. If an ECS chooses not to optimize for that case, that's fine, but it's still an important, realistic case, at least as far as anyone's taken it so far--nobody should be getting a free pass for sacrificing it. If unbatchable workloads are underrepresented, that's a different problem.
Imo the biggest problem is that I could easily implement and use "optimized" storages/algorithms for almost any benchmark, but the code could end up not looking like "idiomatic" bevy/legion/hecs/shipyard/etc code. Ex: the legion "simple insert" benchmark uses SoA insertion, but you won't find that in their readme/examples. SoA requires the user to "decouple" each entity's components across component-specific arrays, which I wouldn't recommend to anyone (and which I would prefer to not implement in Bevy at all). But now I feel "pressured" to implement it, otherwise I can't "compete".
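To illustrate the AoS/SoA point, here is a rough sketch (plain Rust, no ECS calls, types made up for the example) of the two data shapes the caller has to build before handing entities to the ECS:

```rust
#[derive(Clone, Copy, Default)]
struct Position(f32, f32, f32);
#[derive(Clone, Copy, Default)]
struct Velocity(f32, f32, f32);

fn main() {
    // AoS: one Vec of per-entity tuples. This matches how most application
    // code naturally describes "an entity and its components".
    let aos: Vec<(Position, Velocity)> = (0..10_000)
        .map(|_| (Position::default(), Velocity::default()))
        .collect();

    // SoA: the caller decouples each entity's components into separate,
    // index-aligned arrays, one per component type. Faster to bulk-insert,
    // but not how typical game code is written.
    let positions: Vec<Position> = vec![Position::default(); 10_000];
    let velocities: Vec<Velocity> = vec![Velocity::default(); 10_000];

    assert_eq!(aos.len(), positions.len());
    assert_eq!(positions.len(), velocities.len());
}
```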
If Bevy starts using "sparse set" components in its add/remove benchmark (which isn't what I would recommend by default for Bevy users), then the only way legion can compete is to implement sparse set storage (which they might not want to do for design reasons).
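For reference, later Bevy releases did expose per-component storage selection; opting a component into sparse-set storage looks roughly like this (a sketch against the bevy_ecs derive; the attribute syntax may vary between versions):

```rust
use bevy_ecs::component::Component;

// Default storage: dense "table" (archetype) storage, fastest to iterate.
#[derive(Component)]
struct Health(f32);

// Opt-in sparse-set storage, which trades iteration speed
// for cheaper add/remove of the component.
#[derive(Component)]
#[component(storage = "SparseSet")]
struct Stunned;
```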
I consider those two cases to be almost identical. The behavior they encourage for ECS authors and the picture the results paint for users feel "wrong" to me.
I don't think ecs_bench_suite should encourage ECS authors to "play the benchmark game" with hyper-optimized per-benchmark results. I think it should encourage "real world" comparisons of ECS frameworks with "idiomatic" api usage. Anything outside of those bounds (or anything that might cause users to draw the wrong conclusions, such as comparing batched/unbatched benchmarks) should be clearly indicated.
Ex: I think we should have separate batched/unbatched insertion benchmarks. That clearly indicates which frameworks support batched insertion. Ideally we also have AoS and SoA labels. I also think Bevy shouldn't be allowed to silently use sparse set storage for add/removes. Instead we should provide both results (and probably indicate that Table storage is the default / recommended storage type).
Oh, I didn't realize legion was getting columnar data from the user in that case; that's more egregious than batching row-oriented operations, I agree. Hecs has such an API as well, but I don't consider it appropriate for that benchmark.
I absolutely agree that we should focus on benchmarking realistic patterns rather than gamesmanship, and it looks like we've already slipped into that more than I realized.
@Ralith the two missing plots have been added.
Oh, it doesn't generate violin graphs since only legion implemented those benchmarks. Whoops! I guess it doesn't mean much to have benchmarks for only one ECS. What should we do?
I went ahead and merged the hecs support for that benchmark; merge master and it should be good to go.
Completed :+1:
Great, thanks! This is a big improvement.
This pull request adds violin plot images to the README, so that the results table no longer needs to be edited manually. Further, it adds planck_ecs to the benchmarked ECS frameworks.
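For context on how those images are produced: criterion renders one violin plot per benchmark group (typically written to target/criterion/&lt;group&gt;/report/violin.svg when HTML reports are enabled), so each README image corresponds to one group with one bench_function per ECS. A minimal sketch of that shape, with illustrative names rather than the suite's actual bench bodies:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

fn simple_insert(c: &mut Criterion) {
    // One group per benchmark; one function per ECS under test.
    // Criterion plots all functions in a group into a single violin chart.
    let mut group = c.benchmark_group("simple_insert");
    group.bench_function("legion", |b| b.iter(|| { /* insert 10k entities */ }));
    group.bench_function("bevy", |b| b.iter(|| { /* insert 10k entities */ }));
    group.bench_function("planck_ecs", |b| b.iter(|| { /* insert 10k entities */ }));
    group.finish();
}

criterion_group!(benches, simple_insert);
criterion_main!(benches);
```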