afonsolage / projekto

Voxel game made with Bevy Engine
MIT License

World rework #6

Closed afonsolage closed 2 years ago

afonsolage commented 2 years ago

Reworking how the world is generated, loaded, and updated using bevy_task

closes #5 fixes #3

afonsolage commented 2 years ago

The work is mostly done. The only downside of this change is the ergonomics of querying chunks. The async nature of the tasks, mixed with per-frame systems, adds some complexity.

I'll do more tests to measure the performance and check whether it's possible to add some macros for better ergonomics.

My only concern right now is that the current implementation copies ChunkStorage whenever someone does a query. The default buffer size of ChunkStorage is 4096 (16³), since I'm using an AXIS_SIZE of 16, but if for some reason I need to increase it to 32, the ChunkStorage size will increase to 32768 (32³), which is a lot to be copying around.

Maybe it's time to look into Arc and send a Weak instead of copying?
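
As a rough illustration of the Arc/Weak idea, here is a minimal sketch using only the standard library. The `World` struct, its `query` method, and the `voxels` field are hypothetical stand-ins, not projekto's actual types; only the `ChunkStorage` and `AXIS_SIZE` names come from the comment above.

```rust
use std::sync::{Arc, Weak};

// Hypothetical stand-ins; the real types in projekto may differ.
const AXIS_SIZE: usize = 16;
// 4096 entries at AXIS_SIZE = 16, 32768 at AXIS_SIZE = 32.
const BUFFER_SIZE: usize = AXIS_SIZE * AXIS_SIZE * AXIS_SIZE;

struct ChunkStorage {
    voxels: Vec<u8>,
}

impl ChunkStorage {
    fn new() -> Self {
        Self { voxels: vec![0; BUFFER_SIZE] }
    }
}

/// Owner of the chunk data (assumed here to be whatever holds the world state).
struct World {
    chunk: Arc<ChunkStorage>,
}

impl World {
    /// Answers a query with a pointer-sized handle instead of a copy of the buffer.
    fn query(&self) -> Weak<ChunkStorage> {
        Arc::downgrade(&self.chunk)
    }
}

fn main() {
    let world = World { chunk: Arc::new(ChunkStorage::new()) };
    let handle = world.query();

    // The reader upgrades the handle only while it needs the data.
    if let Some(chunk) = handle.upgrade() {
        assert_eq!(chunk.voxels.len(), BUFFER_SIZE);
        // Same allocation as the owner's: nothing was copied.
        assert!(Arc::ptr_eq(&chunk, &world.chunk));
    }

    // Once the owner drops the chunk (e.g. on unload), stale handles fail to upgrade.
    drop(world);
    assert!(handle.upgrade().is_none());
}
```

The trade-off is that every reader has to handle the `None` case when a chunk was unloaded between the query and its use, which adds to the ergonomics concern mentioned above.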

afonsolage commented 2 years ago

Need to improve update_landscape_system

Sep 26 16:30:43.939  INFO projekto::debug::perf: Performance Counter: 
update_landscape_system        avg: 55011μs, samples:    21, min: 22848μs, max: 75650μs, meta:     7μs
load_chunk                     avg:  4033μs, samples:  1600, min:  3338μs, max: 37873μs, meta:     0μs
process_batch                  avg:  2954μs, samples:  2200, min:    15μs, max: 37921μs, meta:   331μs
update_chunk                   avg:   274μs, samples:  2579, min:     1μs, max:   527μs, meta:     0μs
unload_chunk                   avg:     9μs, samples:   600, min:     8μs, max:    36μs, meta:     0μs
spawn_chunks_system            avg:     8μs, samples:  1600, min:     6μs, max:   355μs, meta:     0μs
mesh_generation_system         avg:     7μs, samples:  4000, min:     4μs, max:   357μs, meta:     1μs
update_world_system            avg:     5μs, samples:   810, min:     2μs, max:   200μs, meta:     1μs
update_chunks_system           avg:     1μs, samples:     8, min:     1μs, max:     3μs, meta:   227μs
despawn_chunks_system          avg:     1μs, samples:   427, min:     1μs, max:    13μs, meta:     0μs
Sep 26 16:31:22.061  INFO projekto::debug::perf: Performance Counter: 
update_landscape_system        avg: 54897μs, samples:    48, min: 22848μs, max: 81828μs, meta:     7μs
load_chunk                     avg:  6181μs, samples:  2200, min:  3338μs, max: 37873μs, meta:     0μs
process_batch                  avg:  4022μs, samples:  3400, min:    15μs, max: 37921μs, meta:   382μs
update_chunk                   avg:   266μs, samples:  4734, min:     1μs, max:   561μs, meta:     0μs
unload_chunk                   avg:     9μs, samples:  1200, min:     8μs, max:    49μs, meta:     0μs
spawn_chunks_system            avg:     9μs, samples:  2200, min:     6μs, max:   355μs, meta:     0μs
mesh_generation_system         avg:     8μs, samples:  6400, min:     4μs, max:   357μs, meta:     2μs
update_world_system            avg:     6μs, samples:  1197, min:     2μs, max:   291μs, meta:     1μs
update_chunks_system           avg:     1μs, samples:    20, min:     1μs, max:     3μs, meta:   174μs
despawn_chunks_system          avg:     1μs, samples:   784, min:     1μs, max:    22μs, meta:     1μs

load_chunk also needs to be improved, but this will probably be achieved by switching from the ron format to some binary format.

afonsolage commented 2 years ago

The performance is better now, but there is still room for improvement

load_chunk                     avg:  3738μs, samples:  2200, min:  3301μs, max:  6916μs, meta:     0μs
update_landscape_system        avg:  3445μs, samples:    52, min:  2865μs, max:  6108μs, meta:     5μs
process_batch                  avg:  2439μs, samples:  3400, min:    15μs, max:  6941μs, meta:   387μs
update_chunk                   avg:   261μs, samples:  4893, min:     1μs, max:   818μs, meta:     0μs
unload_chunk                   avg:    10μs, samples:  1200, min:     8μs, max:    47μs, meta:     0μs
spawn_chunks_system            avg:     8μs, samples:  2200, min:     6μs, max:   353μs, meta:     0μs
mesh_generation_system         avg:     7μs, samples:  6800, min:     4μs, max:   594μs, meta:     1μs
update_world_system            avg:     6μs, samples:  1002, min:     2μs, max:   232μs, meta:     1μs
update_chunks_system           avg:     2μs, samples:    20, min:     1μs, max:     7μs, meta:   162μs
despawn_chunks_system          avg:     1μs, samples:   511, min:     1μs, max:    37μs, meta:     2μs

afonsolage commented 2 years ago

Tested with bincode, but no notable difference in performance was found, so I think it's time to move on

afonsolage commented 2 years ago

Ron file serialization (save_cache)

Sep 26 17:39:48.093  INFO projekto::debug::perf: Performance Counter: 
process_batch                  avg: 25777μs, samples:  1000, min: 23092μs, max: 35143μs, meta:   319μs
load_chunk                     avg: 25757μs, samples:  1000, min: 23073μs, max: 35122μs, meta:     0μs
generate_cache                 avg: 25702μs, samples:  1000, min: 23022μs, max: 35065μs, meta:     0μs
save_cache                     avg: 25048μs, samples:  1000, min: 22463μs, max: 34400μs, meta:     0μs
faces_merging                  avg: 16429μs, samples:  1006, min: 14186μs, max: 20468μs, meta:     0μs
faces_occlusion                avg:  3897μs, samples:  1006, min:  1221μs, max:  8380μs, meta:     0μs
update_landscape_system        avg:  3443μs, samples:     9, min:  2834μs, max:  5051μs, meta:     5μs
update_chunk                   avg:   285μs, samples:  1088, min:     1μs, max:   604μs, meta:     0μs
vertices_computation           avg:    16μs, samples:   654, min:     1μs, max:   190μs, meta:     0μs
mesh_generation_system         avg:    10μs, samples:  2000, min:     5μs, max:   355μs, meta:     1μs
spawn_chunks_system            avg:     7μs, samples:  1000, min:     6μs, max:   336μs, meta:     0μs
update_world_system            avg:     4μs, samples:  2923, min:     2μs, max:   254μs, meta:     1μs
update_chunks_system           avg:     3μs, samples:     1, min:     3μs, max:     3μs, meta:   664μs

bincode file serialization (save_cache)

Sep 26 17:41:06.910  INFO projekto::debug::perf: Performance Counter: 
faces_merging                  avg: 16467μs, samples:  1004, min: 14151μs, max: 27723μs, meta:     0μs
process_batch                  avg:  7344μs, samples:  1000, min:  6104μs, max: 10334μs, meta:   320μs
load_chunk                     avg:  7322μs, samples:  1000, min:  6086μs, max: 10309μs, meta:     0μs
generate_cache                 avg:  7263μs, samples:  1000, min:  6041μs, max: 10211μs, meta:     0μs
save_cache                     avg:  6559μs, samples:  1000, min:  5512μs, max:  9199μs, meta:     0μs
faces_occlusion                avg:  4002μs, samples:  1004, min:  1247μs, max: 10803μs, meta:     0μs
update_landscape_system        avg:  3764μs, samples:    14, min:  2837μs, max:  6764μs, meta:     6μs
update_chunk                   avg:   287μs, samples:  1083, min:     1μs, max:   536μs, meta:     0μs
vertices_computation           avg:    16μs, samples:   651, min:     1μs, max:   171μs, meta:     0μs
mesh_generation_system         avg:    10μs, samples:  2000, min:     4μs, max:   418μs, meta:     1μs
spawn_chunks_system            avg:     7μs, samples:  1000, min:     6μs, max:   338μs, meta:     0μs
update_world_system            avg:     6μs, samples:   769, min:     2μs, max:   232μs, meta:     2μs
update_chunks_system           avg:     3μs, samples:     1, min:     3μs, max:     3μs, meta:   634μs

Actually, bincode makes a big difference for file serialization, but for deserialization the difference is quite small:

Ron file deserialization (load_cache):

Sep 26 17:44:46.482  INFO projekto::debug::perf: Performance Counter: 
faces_merging                  avg: 16363μs, samples:  1005, min: 14059μs, max: 24920μs, meta:     0μs
faces_occlusion                avg:  3941μs, samples:  1005, min:  1247μs, max:  9515μs, meta:     0μs
process_batch                  avg:  3812μs, samples:  1000, min:  3399μs, max:  6182μs, meta:   303μs
load_chunk                     avg:  3797μs, samples:  1000, min:  3386μs, max:  6158μs, meta:     0μs
load_cache                     avg:  3751μs, samples:  1000, min:  3351μs, max:  5014μs, meta:     0μs
update_landscape_system        avg:  3095μs, samples:    11, min:  2875μs, max:  3521μs, meta:    19μs
update_chunk                   avg:   293μs, samples:  1010, min:     1μs, max:   513μs, meta:     0μs
vertices_computation           avg:    16μs, samples:   682, min:     1μs, max:   151μs, meta:     0μs
mesh_generation_system         avg:     8μs, samples:  2000, min:     5μs, max:   354μs, meta:     1μs
spawn_chunks_system            avg:     7μs, samples:  1000, min:     6μs, max:   336μs, meta:     0μs
update_world_system            avg:     4μs, samples:   475, min:     2μs, max:   236μs, meta:     1μs
update_chunks_system           avg:     3μs, samples:     1, min:     3μs, max:     3μs, meta:   652μs

bincode file deserialization (load_cache):

Sep 26 17:42:42.367  INFO projekto::debug::perf: Performance Counter: 
faces_merging                  avg: 16805μs, samples:  1005, min: 14458μs, max: 26231μs, meta:     0μs
faces_occlusion                avg:  3966μs, samples:  1005, min:  1247μs, max:  9216μs, meta:     0μs
update_landscape_system        avg:  3663μs, samples:    10, min:  2934μs, max:  5012μs, meta:     5μs
process_batch                  avg:  3304μs, samples:  1000, min:  3071μs, max:  5567μs, meta:   302μs
load_chunk                     avg:  3289μs, samples:  1000, min:  3034μs, max:  5545μs, meta:     0μs
load_cache                     avg:  3246μs, samples:  1000, min:  2982μs, max:  4247μs, meta:     0μs
update_chunk                   avg:   286μs, samples:  1029, min:     1μs, max:   450μs, meta:     0μs
vertices_computation           avg:    15μs, samples:   707, min:     1μs, max:   182μs, meta:     0μs
mesh_generation_system         avg:    10μs, samples:  2000, min:     5μs, max:   353μs, meta:     1μs
spawn_chunks_system            avg:     6μs, samples:  1000, min:     6μs, max:   336μs, meta:     0μs
update_world_system            avg:     4μs, samples:   437, min:     2μs, max:   235μs, meta:     1μs
update_chunks_system           avg:     3μs, samples:     1, min:     3μs, max:     3μs, meta:   639μs

I'll stick with bincode anyway
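
To put numbers on the comparison, the snippet below computes the ratio between the ron and bincode averages for save_cache and load_cache, using the values from the four logs above. The `speedup` helper is only an illustrative name; it is not part of the project.

```rust
/// Ratio of two average timings: how many times faster `new_us` is than `old_us`.
fn speedup(old_us: f64, new_us: f64) -> f64 {
    old_us / new_us
}

fn main() {
    // Averages (in μs) copied from the save_cache and load_cache rows above.
    let save = speedup(25048.0, 6559.0); // ron vs bincode, serialization
    let load = speedup(3751.0, 3246.0); // ron vs bincode, deserialization

    println!("save_cache: {save:.2}x faster with bincode");
    println!("load_cache: {load:.2}x faster with bincode");
}
```

With these averages, serialization comes out roughly 3.8x faster with bincode, while deserialization is only about 1.16x faster.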

afonsolage commented 2 years ago

I think it's pretty good for now:

Sep 26 21:29:31.534  INFO projekto::debug::perf: Performance Counter: 
faces_merging                  avg: 14245μs, samples:   413, min:  3548μs, max: 25204μs, meta:     0μs
faces_occlusion                avg:  3540μs, samples:  1069, min:   492μs, max: 11657μs, meta:     0μs
update_landscape_system        avg:  3498μs, samples:    11, min:  3042μs, max:  5112μs, meta:     9μs
process_batch                  avg:  3339μs, samples:  1000, min:  3140μs, max:  5343μs, meta:   308μs
load_chunk                     avg:  3323μs, samples:  1000, min:  3127μs, max:  5324μs, meta:     0μs
load_cache                     avg:  3276μs, samples:  1000, min:  3095μs, max:  5046μs, meta:     0μs
update_chunk                   avg:   281μs, samples:  1066, min:     1μs, max:   436μs, meta:     0μs
vertices_computation           avg:    26μs, samples:   413, min:     3μs, max:   148μs, meta:     0μs
spawn_chunks_system            avg:    20μs, samples:  1000, min:    14μs, max:   362μs, meta:    41μs
mesh_generation_system         avg:    20μs, samples:  2000, min:    10μs, max:   608μs, meta:   104μs
merge_faces                    avg:    19μs, samples: 14773, min:     1μs, max:  1161μs, meta:   378μs
update_chunks_system           avg:     4μs, samples:     6, min:     1μs, max:    12μs, meta: 29542μs
update_world_system            avg:     4μs, samples:   409, min:     2μs, max:   246μs, meta:     1μs

That is in dev mode. When running in release mode:

Sep 26 21:30:20.052  INFO projekto::debug::perf: Performance Counter: 
process_batch                  avg:  2388μs, samples:  1000, min:  2254μs, max:  4038μs, meta:     8μs
load_chunk                     avg:  2387μs, samples:  1000, min:  2253μs, max:  4036μs, meta:     0μs
load_cache                     avg:  2377μs, samples:  1000, min:  2249μs, max:  3595μs, meta:     0μs
faces_merging                  avg:   620μs, samples:   439, min:   100μs, max:  1303μs, meta:     0μs
update_landscape_system        avg:   110μs, samples:     2, min:   108μs, max:   113μs, meta:     1μs
faces_occlusion                avg:    59μs, samples:  1134, min:     8μs, max:   216μs, meta:     0μs
update_chunk                   avg:     7μs, samples:  1000, min:     4μs, max:    21μs, meta:     0μs
merge_faces                    avg:     7μs, samples:  1378, min:     1μs, max:    48μs, meta:   185μs
mesh_generation_system         avg:     7μs, samples:  2000, min:     1μs, max:    34μs, meta:     0μs
vertices_computation           avg:     4μs, samples:   160, min:     1μs, max:    15μs, meta:     0μs
spawn_chunks_system            avg:     1μs, samples:   114, min:     1μs, max:     9μs, meta:     7μs
update_chunks_system           avg:     1μs, samples:     1, min:     1μs, max:     1μs, meta:    63μs
update_world_system            avg:     1μs, samples:    87, min:     1μs, max:    24μs, meta:     0μs

There is still room for improvement. For example, there are a lot of chunk updates that could be avoided IMO, but that needs further investigation