TeamAtomECS / AtomECS

Cold atom simulation code
GNU General Public License v3.0

Porting AtomECS v0.8.0 from the `specs` ECS backend to `bevy` #3

Open · ElliotB256 opened this issue 3 years ago

ElliotB256 commented 3 years ago

Performance is slightly better with Legion than specs (Amethyst has now dropped specs and is moving to Legion). Consider moving to Legion. https://github.com/amethyst/legion

ElliotB256 commented 3 years ago

Performance comparison and feature breakdown: https://amethyst.rs/posts/legion-ecs-v0.3

ElliotB256 commented 3 years ago

The inner data parallelism looks much nicer in legion than in specs.
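
As a rough sketch of what that inner parallelism looks like in legion 0.3 (the components and timestep here are made up for illustration, not taken from AtomECS):

```rust
use legion::*;

struct Position(f64);
struct Velocity(f64);

fn integrate(world: &mut World) {
    let mut query = <(&mut Position, &Velocity)>::query();
    // Inner data parallelism: entities matched by a single query are split
    // across worker threads (legion's `parallel` feature).
    query.par_for_each_mut(world, |(position, velocity)| {
        position.0 += velocity.0 * 1e-6;
    });
}
```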

ElliotB256 commented 3 years ago

Bevy also looks pretty good - I'm going to keep an eye on it while the dust settles. We can probably move sometime in a few months.

ElliotB256 commented 3 years ago

@MauriceZeuner - this might also be interesting if you are bored. I'm particularly interested to see how each ECS crate compiles to SIMD instruction sets.

ElliotB256 commented 3 years ago

I benchmarked a few other libraries for a simple example here: https://github.com/ElliotB256/bevy_bench/issues/2

specs currently outperforms both legion and bevy for this example - which is a surprise to be honest!

ElliotB256 commented 3 years ago

Shelved for now. Can reopen in a few months time depending on benchmarks of Legion and bevy.

ElliotB256 commented 2 years ago

https://github.com/bevyengine/bevy/issues/2173

ElliotB256 commented 2 years ago

I've actually started a bevy port now - you can see progress here: https://github.com/TeamAtomECS/AtomECS/tree/bevy

ElliotB256 commented 2 years ago

The port is going well; I've done most of the major modules.

Demos can be seen here (in your browser!) https://teamatomecs.github.io/AtomECSDemos/

ElliotB256 commented 2 years ago

@MauriceZeuner

ElliotB256 commented 2 years ago

Also tagging @YangChemE and @minghuaw , who both may be interested.

I'll finish porting the code in that branch before making performance measurements. Some preliminary runs showed the bevy code running twice as fast as the specs code, which surprised me because my earlier benchmarks showed specs was faster.

The bevy code is much more compact and the API is significantly nicer (a rough side-by-side sketch is below). I'm still keeping this as experimental for now; I expect bevy performance will improve, as it is under very active development.
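
To illustrate the difference, here is a minimal, hypothetical side-by-side (the components and timestep are invented for illustration, not taken from the AtomECS source). A simple integration step written as a specs system:

```rust
use specs::prelude::*;

struct Position(f64);
impl Component for Position {
    type Storage = VecStorage<Self>;
}

struct Velocity(f64);
impl Component for Velocity {
    type Storage = VecStorage<Self>;
}

struct Integrate;
impl<'a> System<'a> for Integrate {
    type SystemData = (WriteStorage<'a, Position>, ReadStorage<'a, Velocity>);

    fn run(&mut self, (mut positions, velocities): Self::SystemData) {
        for (position, velocity) in (&mut positions, &velocities).join() {
            position.0 += velocity.0 * 1e-6;
        }
    }
}
```

and the equivalent in bevy, where a system is just a function taking a `Query`:

```rust
use bevy::prelude::*;

#[derive(Component)]
struct Position(f64);

#[derive(Component)]
struct Velocity(f64);

fn integrate(mut query: Query<(&mut Position, &Velocity)>) {
    for (mut position, velocity) in query.iter_mut() {
        position.0 += velocity.0 * 1e-6;
    }
}
```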

ElliotB256 commented 2 years ago

https://teamatomecs.github.io/AtomECSDemos/aion_source.html

minghuaw commented 2 years ago

This is just a general comment. It seems like bevy has got its own set of shapes. I am wondering whether the port should switch to bevy's shape definitions or stick with the current ones.

ElliotB256 commented 2 years ago

It's a good point! I will check this when back from holiday. We would still want to define our own Shape trait, but possibly implement the trait on these primitives. We probably wouldn't want to define it for all shapes though; bevy's are rendering primitives and so aren't mathematically pure shapes, e.g. the icosphere versus a mathematically defined sphere. Thanks for the suggestion!
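
Roughly what I have in mind - a sketch only, with illustrative names rather than the ones in the codebase:

```rust
use bevy::math::Vec3;

/// Mathematical shape used by the simulation, kept separate from bevy's
/// mesh/rendering primitives.
pub trait Shape {
    /// Returns true if `point` lies inside the shape.
    fn contains(&self, point: Vec3) -> bool;
}

/// An exact sphere, in contrast to bevy's `Icosphere`, which is a mesh
/// approximation built for rendering.
pub struct Sphere {
    pub centre: Vec3,
    pub radius: f32,
}

impl Shape for Sphere {
    fn contains(&self, point: Vec3) -> bool {
        point.distance(self.centre) <= self.radius
    }
}
```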

What do you think of bevy's syntax compared to specs? I still have yet to compare performance.

ElliotB256 commented 2 years ago

I made a comparison of some benchmark simulations.

Bevy currently performs about 37% slower than the specs branch. This is roughly in agreement with the results from https://github.com/bevyengine/bevy/issues/2173 for fine-grained systems, where bevy takes ~130us compared to 108us for specs. I expect the gap will close as bevy development continues.

ElliotB256 commented 2 years ago

Bevy 0.8 came out a few days ago so I've ported the bevy branch to it. I see a slight performance improvement on the benchmark:

bevy 0.7

15361ms
15826ms
15414ms
avg=15533ms

bevy 0.8

13952ms
13660ms
13232ms
avg=13614ms

Time delta: bevy 0.8 runs in about 87% of the bevy 0.7 time, i.e. roughly 14% faster.

ElliotB256 commented 1 year ago

Re-running bevy 0.8 on my home machine:

bevy 0.8 (10k atoms, 5k steps)

Simulation loop completed in 28085 ms.
Simulation loop completed in 28283 ms.
Simulation loop completed in 31367 ms.

29.2s average

bevy 0.9

Simulation loop completed in 30880 ms.
Simulation loop completed in 30450 ms.
Simulation loop completed in 35263 ms.

32.2s average

Bevy 0.9 takes about 10% longer than bevy 0.8.

ElliotB256 commented 1 year ago

Retesting for bevy 0.10 comparisons

bevy 0.8

Simulation loop completed in 28476 ms.
Simulation loop completed in 28009 ms.
Simulation loop completed in 27892 ms.

bevy 0.9

Simulation loop completed in 33319 ms.
Simulation loop completed in 30575 ms.
Simulation loop completed in 30756 ms.

bevy 0.10

Simulation loop completed in 42307 ms.
Simulation loop completed in 40414 ms.
Simulation loop completed in 39600 ms.

Results

| version | time 1 (ms) | time 2 (ms) | time 3 (ms) | avg (ms) | comparison |
| --- | --- | --- | --- | --- | --- |
| 0.8 | 28476 | 28009 | 27892 | 28126 | +0% |
| 0.9 | 33319 | 30575 | 30756 | 31550 | +12% |
| 0.10 | 42307 | 40414 | 39600 | 40773 | +45% |

ElliotB256 commented 1 year ago

Let's compare some rough metrics between bevy 0.8 and 0.10:

bevy 0.8

[metrics screenshots omitted]

bevy 0.10

[metrics screenshots omitted]

So it seems 0.10 has a core sitting idle on my machine!

ElliotB256 commented 1 year ago

```rust
app.add_plugins(DefaultPlugins.set(TaskPoolPlugin {
    task_pool_options: TaskPoolOptions::with_num_threads(5),
}));
```

The above prevents one of the cores being idled by the compute thread, so now 3 of the 4 cores on my machine are doing work. It also reduces the benchmark duration to 29627ms, which is closer to 0.8/0.9 performance.

ElliotB256 commented 1 year ago

After tweaking until I get 4 compute threads, I get:

Simulation loop completed in 23875 ms.

which is faster than bevy 0.8! (I suspect bevy 0.8 never used all cores properly...)

Running the master branch benchmark on my home PC (i.e., using specs):

Simulation loop completed in 19532 ms.

So bevy 0.10 is very close to specs in performance, which is good!

ElliotB256 commented 1 year ago

One of the nice things in bevy 0.10 is the ability to automatically determine batch sizes based on a heuristic. Let's see how it handles it (times in ms):

| test | 1st | 2nd | 3rd | avg | change |
| --- | --- | --- | --- | --- | --- |
| fixed batch, size=1024 | 26611 | 24910 | 26738 | 26086 | +0% |
| `BatchingStrategy::new()` | 24023 | 23671 | 24034 | 23909 | -8.4% |
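
For reference, this is roughly how the two settings plug into a bevy 0.10 parallel query (a minimal sketch; the components and timestep are invented for illustration, not taken from the AtomECS source):

```rust
use bevy::ecs::query::BatchingStrategy;
use bevy::prelude::*;

#[derive(Component)]
struct Position(f64);

#[derive(Component)]
struct Velocity(f64);

fn integrate(mut query: Query<(&mut Position, &Velocity)>) {
    query
        .par_iter_mut()
        // `BatchingStrategy::new()` lets bevy choose batch sizes from its
        // heuristic; `BatchingStrategy::fixed(1024)` reproduces the
        // fixed-size row in the table above.
        .batching_strategy(BatchingStrategy::new())
        .for_each_mut(|(mut position, velocity)| {
            position.0 += velocity.0 * 1e-6;
        });
}
```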
ElliotB256 commented 1 year ago

Need to make sure #79 is fixed on this branch - best way to do this is to first implement the test.

ElliotB256 commented 1 year ago

@minghuaw - I saw you recently exploring the rust gpu ecosystem a bit; would you be interested in a GPU-accelerated entity query system? I'm thinking it would dramatically speed up the single-atom systems. We could start a new issue to explore this - it's the next goal after the bevy port is complete.

minghuaw commented 1 year ago

I am indeed looking at general purpose computing on GPUs at the moment. I haven't had much luck yet, but I'll keep this in mind and see what I can do about it.

ElliotB256 commented 1 year ago

The dream for me would be defining components with a #[GPUStorage] attribute, and then replacing Query with a kind of GPUQuery that provides a for-each which gets converted to a GPU kernel program via rust-gpu. I think it's probably not too much work, but I should use the scant time I have right now for getting the bevy branch finished. Maybe next month :)

ElliotB256 commented 1 year ago

Bevy is now up to 0.11; I'll get round to porting and benchmarking soon. I was hoping for some days of free time this summer to finish the port, but they haven't arrived yet!