Closed afonsolage closed 2 years ago
The work is mostly done. The only downside of this change is the ergonomics of querying chunks: the async nature, combined with per-frame systems, adds some complexity.
I'll run more tests to measure the performance and check whether it is possible to add some macros for better ergonomics.
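To illustrate where the extra complexity comes from, here is a minimal sketch of an async query that has to be polled once per frame. It uses a plain `std::thread` and a channel as a stand-in for a task pool, so it is not the actual implementation; `ChunkData`, `PendingQuery`, `QueryState`, `query_chunk`, `poll_query` and `run_until_ready` are all hypothetical names.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for whatever a chunk query returns.
type ChunkData = Vec<u8>;

/// A query that was started but may not have finished yet.
struct PendingQuery {
    rx: Receiver<ChunkData>,
}

/// What a per-frame system sees when it checks on a query.
enum QueryState {
    Pending,
    Ready(ChunkData),
    Failed,
}

/// Start the query on a background thread (stand-in for an async task).
fn query_chunk(size: usize) -> PendingQuery {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Pretend to load the chunk; ignore the error if nobody is listening.
        let _ = tx.send(vec![0u8; size]);
    });
    PendingQuery { rx }
}

/// Non-blocking check, meant to be called once per frame.
fn poll_query(query: &PendingQuery) -> QueryState {
    match query.rx.try_recv() {
        Ok(data) => QueryState::Ready(data),
        Err(TryRecvError::Empty) => QueryState::Pending,
        Err(TryRecvError::Disconnected) => QueryState::Failed,
    }
}

/// Simulated frame loop: returns how many frames were needed and the
/// length of the result, or `None` if the worker went away.
fn run_until_ready(size: usize) -> (usize, Option<usize>) {
    let query = query_chunk(size);
    let mut frames = 0;
    loop {
        frames += 1;
        match poll_query(&query) {
            QueryState::Ready(data) => return (frames, Some(data.len())),
            QueryState::Failed => return (frames, None),
            // Nothing yet: the caller has to keep the query around
            // and try again on the next frame.
            QueryState::Pending => thread::sleep(Duration::from_millis(1)),
        }
    }
}
```

The caller has to store the pending query somewhere between frames and handle the "not ready yet" case every time, which is the part that macros could hide.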
My only concern right now is that the current implementation copies `ChunkStorage` whenever someone does a query. The default buffer size of `ChunkStorage` is 4096, since I'm using an `AXIS_SIZE` of 16, but if for some reason I need to increase it to 32, the `ChunkStorage` size will increase to 32768, which is a lot to be copying around.
Maybe it's time to look into `Arc` and send a `Weak` instead of copying?
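A rough sketch of what handing out `Weak` handles could look like, assuming one byte per voxel and a hypothetical `World` type that owns the chunks (neither is taken from the actual code):

```rust
use std::sync::{Arc, Weak};

/// Edge length of a chunk.
const AXIS_SIZE: usize = 16;
/// Voxels per chunk: 16^3 = 4096 (with an edge of 32 it would be 32768).
const BUFFER_SIZE: usize = AXIS_SIZE * AXIS_SIZE * AXIS_SIZE;

/// Assumed layout: one byte per voxel.
struct ChunkStorage {
    voxels: [u8; BUFFER_SIZE],
}

/// Hypothetical owner of all loaded chunks.
struct World {
    chunks: Vec<Arc<ChunkStorage>>,
}

impl World {
    fn new(count: usize) -> Self {
        let chunks = (0..count)
            .map(|_| Arc::new(ChunkStorage { voxels: [0; BUFFER_SIZE] }))
            .collect();
        Self { chunks }
    }

    /// Answer a query with a weak handle: only a pointer is copied,
    /// not the whole voxel buffer.
    fn query(&self, index: usize) -> Option<Weak<ChunkStorage>> {
        self.chunks.get(index).map(Arc::downgrade)
    }

    /// Dropping the owning `Arc` invalidates outstanding weak handles.
    fn unload(&mut self, index: usize) {
        self.chunks.remove(index);
    }
}
```

A caller would `upgrade()` the handle before reading and treat `None` as "the chunk was unloaded in the meantime", so queries never keep a chunk alive on their own.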
Need to improve `update_landscape_system`:
Sep 26 16:30:43.939 INFO projekto::debug::perf: Performance Counter:
update_landscape_system avg: 55011μs, samples: 21, min: 22848μs, max: 75650μs, meta: 7μs
load_chunk avg: 4033μs, samples: 1600, min: 3338μs, max: 37873μs, meta: 0μs
process_batch avg: 2954μs, samples: 2200, min: 15μs, max: 37921μs, meta: 331μs
update_chunk avg: 274μs, samples: 2579, min: 1μs, max: 527μs, meta: 0μs
unload_chunk avg: 9μs, samples: 600, min: 8μs, max: 36μs, meta: 0μs
spawn_chunks_system avg: 8μs, samples: 1600, min: 6μs, max: 355μs, meta: 0μs
mesh_generation_system avg: 7μs, samples: 4000, min: 4μs, max: 357μs, meta: 1μs
update_world_system avg: 5μs, samples: 810, min: 2μs, max: 200μs, meta: 1μs
update_chunks_system avg: 1μs, samples: 8, min: 1μs, max: 3μs, meta: 227μs
despawn_chunks_system avg: 1μs, samples: 427, min: 1μs, max: 13μs, meta: 0μs
Sep 26 16:31:22.061 INFO projekto::debug::perf: Performance Counter:
update_landscape_system avg: 54897μs, samples: 48, min: 22848μs, max: 81828μs, meta: 7μs
load_chunk avg: 6181μs, samples: 2200, min: 3338μs, max: 37873μs, meta: 0μs
process_batch avg: 4022μs, samples: 3400, min: 15μs, max: 37921μs, meta: 382μs
update_chunk avg: 266μs, samples: 4734, min: 1μs, max: 561μs, meta: 0μs
unload_chunk avg: 9μs, samples: 1200, min: 8μs, max: 49μs, meta: 0μs
spawn_chunks_system avg: 9μs, samples: 2200, min: 6μs, max: 355μs, meta: 0μs
mesh_generation_system avg: 8μs, samples: 6400, min: 4μs, max: 357μs, meta: 2μs
update_world_system avg: 6μs, samples: 1197, min: 2μs, max: 291μs, meta: 1μs
update_chunks_system avg: 1μs, samples: 20, min: 1μs, max: 3μs, meta: 174μs
despawn_chunks_system avg: 1μs, samples: 784, min: 1μs, max: 22μs, meta: 1μs
`load_chunk` needs to be improved too, but this will probably be achieved by switching from the `ron` format to some binary format.
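As a rough illustration of the text-versus-binary trade-off, the sketch below encodes the same voxel buffer both ways using only the standard library. Neither function reproduces the actual `ron` or `bincode` formats; the `u16` voxel type and all four function names are assumptions made for the example.

```rust
/// Text encoding: comma-separated decimal values (stand-in for a text format).
fn encode_text(voxels: &[u16]) -> String {
    voxels
        .iter()
        .map(|v| v.to_string())
        .collect::<Vec<_>>()
        .join(",")
}

/// Parse the text form back; returns `None` on any malformed value.
fn decode_text(text: &str) -> Option<Vec<u16>> {
    if text.is_empty() {
        return Some(Vec::new());
    }
    text.split(',').map(|s| s.parse().ok()).collect()
}

/// Binary encoding: two little-endian bytes per voxel, no separators.
fn encode_binary(voxels: &[u16]) -> Vec<u8> {
    voxels.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Read the binary form back; returns `None` if a trailing byte is left over.
fn decode_binary(bytes: &[u8]) -> Option<Vec<u16>> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
            .collect(),
    )
}
```

The binary path skips number formatting and parsing entirely and has a fixed size per voxel, which is the kind of saving a binary format is expected to bring.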
The performance is better now, but there is still room for improvement:
load_chunk avg: 3738μs, samples: 2200, min: 3301μs, max: 6916μs, meta: 0μs
update_landscape_system avg: 3445μs, samples: 52, min: 2865μs, max: 6108μs, meta: 5μs
process_batch avg: 2439μs, samples: 3400, min: 15μs, max: 6941μs, meta: 387μs
update_chunk avg: 261μs, samples: 4893, min: 1μs, max: 818μs, meta: 0μs
unload_chunk avg: 10μs, samples: 1200, min: 8μs, max: 47μs, meta: 0μs
spawn_chunks_system avg: 8μs, samples: 2200, min: 6μs, max: 353μs, meta: 0μs
mesh_generation_system avg: 7μs, samples: 6800, min: 4μs, max: 594μs, meta: 1μs
update_world_system avg: 6μs, samples: 1002, min: 2μs, max: 232μs, meta: 1μs
update_chunks_system avg: 2μs, samples: 20, min: 1μs, max: 7μs, meta: 162μs
despawn_chunks_system avg: 1μs, samples: 511, min: 1μs, max: 37μs, meta: 2μs
Tested with `bincode`, but no notable difference in performance was found. So I think it's time to move on.
`ron` file serialization (`save_cache`):
Sep 26 17:39:48.093 INFO projekto::debug::perf: Performance Counter:
process_batch avg: 25777μs, samples: 1000, min: 23092μs, max: 35143μs, meta: 319μs
load_chunk avg: 25757μs, samples: 1000, min: 23073μs, max: 35122μs, meta: 0μs
generate_cache avg: 25702μs, samples: 1000, min: 23022μs, max: 35065μs, meta: 0μs
save_cache avg: 25048μs, samples: 1000, min: 22463μs, max: 34400μs, meta: 0μs
faces_merging avg: 16429μs, samples: 1006, min: 14186μs, max: 20468μs, meta: 0μs
faces_occlusion avg: 3897μs, samples: 1006, min: 1221μs, max: 8380μs, meta: 0μs
update_landscape_system avg: 3443μs, samples: 9, min: 2834μs, max: 5051μs, meta: 5μs
update_chunk avg: 285μs, samples: 1088, min: 1μs, max: 604μs, meta: 0μs
vertices_computation avg: 16μs, samples: 654, min: 1μs, max: 190μs, meta: 0μs
mesh_generation_system avg: 10μs, samples: 2000, min: 5μs, max: 355μs, meta: 1μs
spawn_chunks_system avg: 7μs, samples: 1000, min: 6μs, max: 336μs, meta: 0μs
update_world_system avg: 4μs, samples: 2923, min: 2μs, max: 254μs, meta: 1μs
update_chunks_system avg: 3μs, samples: 1, min: 3μs, max: 3μs, meta: 664μs
`bincode` file serialization (`save_cache`):
Sep 26 17:41:06.910 INFO projekto::debug::perf: Performance Counter:
faces_merging avg: 16467μs, samples: 1004, min: 14151μs, max: 27723μs, meta: 0μs
process_batch avg: 7344μs, samples: 1000, min: 6104μs, max: 10334μs, meta: 320μs
load_chunk avg: 7322μs, samples: 1000, min: 6086μs, max: 10309μs, meta: 0μs
generate_cache avg: 7263μs, samples: 1000, min: 6041μs, max: 10211μs, meta: 0μs
save_cache avg: 6559μs, samples: 1000, min: 5512μs, max: 9199μs, meta: 0μs
faces_occlusion avg: 4002μs, samples: 1004, min: 1247μs, max: 10803μs, meta: 0μs
update_landscape_system avg: 3764μs, samples: 14, min: 2837μs, max: 6764μs, meta: 6μs
update_chunk avg: 287μs, samples: 1083, min: 1μs, max: 536μs, meta: 0μs
vertices_computation avg: 16μs, samples: 651, min: 1μs, max: 171μs, meta: 0μs
mesh_generation_system avg: 10μs, samples: 2000, min: 4μs, max: 418μs, meta: 1μs
spawn_chunks_system avg: 7μs, samples: 1000, min: 6μs, max: 338μs, meta: 0μs
update_world_system avg: 6μs, samples: 769, min: 2μs, max: 232μs, meta: 2μs
update_chunks_system avg: 3μs, samples: 1, min: 3μs, max: 3μs, meta: 634μs
Actually, there is a big difference in file serialization when using `bincode`, but for deserialization the difference is quite small:
`ron` file deserialization (`load_cache`):
Sep 26 17:44:46.482 INFO projekto::debug::perf: Performance Counter:
faces_merging avg: 16363μs, samples: 1005, min: 14059μs, max: 24920μs, meta: 0μs
faces_occlusion avg: 3941μs, samples: 1005, min: 1247μs, max: 9515μs, meta: 0μs
process_batch avg: 3812μs, samples: 1000, min: 3399μs, max: 6182μs, meta: 303μs
load_chunk avg: 3797μs, samples: 1000, min: 3386μs, max: 6158μs, meta: 0μs
load_cache avg: 3751μs, samples: 1000, min: 3351μs, max: 5014μs, meta: 0μs
update_landscape_system avg: 3095μs, samples: 11, min: 2875μs, max: 3521μs, meta: 19μs
update_chunk avg: 293μs, samples: 1010, min: 1μs, max: 513μs, meta: 0μs
vertices_computation avg: 16μs, samples: 682, min: 1μs, max: 151μs, meta: 0μs
mesh_generation_system avg: 8μs, samples: 2000, min: 5μs, max: 354μs, meta: 1μs
spawn_chunks_system avg: 7μs, samples: 1000, min: 6μs, max: 336μs, meta: 0μs
update_world_system avg: 4μs, samples: 475, min: 2μs, max: 236μs, meta: 1μs
update_chunks_system avg: 3μs, samples: 1, min: 3μs, max: 3μs, meta: 652μs
`bincode` file deserialization (`load_cache`):
Sep 26 17:42:42.367 INFO projekto::debug::perf: Performance Counter:
faces_merging avg: 16805μs, samples: 1005, min: 14458μs, max: 26231μs, meta: 0μs
faces_occlusion avg: 3966μs, samples: 1005, min: 1247μs, max: 9216μs, meta: 0μs
update_landscape_system avg: 3663μs, samples: 10, min: 2934μs, max: 5012μs, meta: 5μs
process_batch avg: 3304μs, samples: 1000, min: 3071μs, max: 5567μs, meta: 302μs
load_chunk avg: 3289μs, samples: 1000, min: 3034μs, max: 5545μs, meta: 0μs
load_cache avg: 3246μs, samples: 1000, min: 2982μs, max: 4247μs, meta: 0μs
update_chunk avg: 286μs, samples: 1029, min: 1μs, max: 450μs, meta: 0μs
vertices_computation avg: 15μs, samples: 707, min: 1μs, max: 182μs, meta: 0μs
mesh_generation_system avg: 10μs, samples: 2000, min: 5μs, max: 353μs, meta: 1μs
spawn_chunks_system avg: 6μs, samples: 1000, min: 6μs, max: 336μs, meta: 0μs
update_world_system avg: 4μs, samples: 437, min: 2μs, max: 235μs, meta: 1μs
update_chunks_system avg: 3μs, samples: 1, min: 3μs, max: 3μs, meta: 639μs
I'll stick with `bincode` anyway.
I think it's pretty good for now:
Sep 26 21:29:31.534 INFO projekto::debug::perf: Performance Counter:
faces_merging avg: 14245μs, samples: 413, min: 3548μs, max: 25204μs, meta: 0μs
faces_occlusion avg: 3540μs, samples: 1069, min: 492μs, max: 11657μs, meta: 0μs
update_landscape_system avg: 3498μs, samples: 11, min: 3042μs, max: 5112μs, meta: 9μs
process_batch avg: 3339μs, samples: 1000, min: 3140μs, max: 5343μs, meta: 308μs
load_chunk avg: 3323μs, samples: 1000, min: 3127μs, max: 5324μs, meta: 0μs
load_cache avg: 3276μs, samples: 1000, min: 3095μs, max: 5046μs, meta: 0μs
update_chunk avg: 281μs, samples: 1066, min: 1μs, max: 436μs, meta: 0μs
vertices_computation avg: 26μs, samples: 413, min: 3μs, max: 148μs, meta: 0μs
spawn_chunks_system avg: 20μs, samples: 1000, min: 14μs, max: 362μs, meta: 41μs
mesh_generation_system avg: 20μs, samples: 2000, min: 10μs, max: 608μs, meta: 104μs
merge_faces avg: 19μs, samples: 14773, min: 1μs, max: 1161μs, meta: 378μs
update_chunks_system avg: 4μs, samples: 6, min: 1μs, max: 12μs, meta: 29542μs
update_world_system avg: 4μs, samples: 409, min: 2μs, max: 246μs, meta: 1μs
This is in dev mode. When running in release mode:
Sep 26 21:30:20.052 INFO projekto::debug::perf: Performance Counter:
process_batch avg: 2388μs, samples: 1000, min: 2254μs, max: 4038μs, meta: 8μs
load_chunk avg: 2387μs, samples: 1000, min: 2253μs, max: 4036μs, meta: 0μs
load_cache avg: 2377μs, samples: 1000, min: 2249μs, max: 3595μs, meta: 0μs
faces_merging avg: 620μs, samples: 439, min: 100μs, max: 1303μs, meta: 0μs
update_landscape_system avg: 110μs, samples: 2, min: 108μs, max: 113μs, meta: 1μs
faces_occlusion avg: 59μs, samples: 1134, min: 8μs, max: 216μs, meta: 0μs
update_chunk avg: 7μs, samples: 1000, min: 4μs, max: 21μs, meta: 0μs
merge_faces avg: 7μs, samples: 1378, min: 1μs, max: 48μs, meta: 185μs
mesh_generation_system avg: 7μs, samples: 2000, min: 1μs, max: 34μs, meta: 0μs
vertices_computation avg: 4μs, samples: 160, min: 1μs, max: 15μs, meta: 0μs
spawn_chunks_system avg: 1μs, samples: 114, min: 1μs, max: 9μs, meta: 7μs
update_chunks_system avg: 1μs, samples: 1, min: 1μs, max: 1μs, meta: 63μs
update_world_system avg: 1μs, samples: 87, min: 1μs, max: 24μs, meta: 0μs
There is still room for improvement: there are a lot of chunk updates that could be avoided, IMO, but this needs further investigation.
Reworking how the world is generated, loaded and updated using `bevy_task`.
Closes #5, fixes #3