aws / lumberyard

Amazon Lumberyard is a free AAA game engine deeply integrated with AWS and Twitch – with full source.

Statistics collection and thread safety fixes for AZ::IO::DedicatedCache #494

Closed dkondrashkin closed 4 years ago

dkondrashkin commented 4 years ago

This commit addresses 2 problems we've encountered:

  1. Broken RAD telemetry plots (collected statistics) for AZ::IO::Streamer;
  2. Rare crashes of the game client.

For the first issue, we have the following visualization:

image

The Streamer stat names appeared to be corrupted. This happened because the Statistic::CreateXXX(...) functions, which accept an AZStd::string_view as their first argument, were supplied with AZStd::string variables living in method scope, so the views were left pointing at destroyed strings once those methods returned. To fix this, we've introduced a separate cache for stat names.
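To illustrate the lifetime problem and the idea behind the name cache, here is a minimal sketch. It uses std:: types as stand-ins for the AZStd ones, and `Stat`, `StatNameCache`, `Intern` and `MakeReadStat` are hypothetical names invented for this example; they are not the types or functions changed in the actual commit.

```cpp
#include <deque>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a statistic that keeps only a view of its name,
// similar in spirit to what Statistic::CreateXXX(...) receives.
struct Stat
{
    std::string_view name; // non-owning: the referenced string must outlive the Stat
    double value;
};

// Hypothetical name cache: it owns the strings so that views into them stay valid.
class StatNameCache
{
public:
    // Stores the name and returns a view into the stored copy.
    // std::deque does not relocate existing elements on emplace_back,
    // so views returned earlier remain valid as the cache grows.
    std::string_view Intern(std::string name)
    {
        m_names.emplace_back(std::move(name));
        return m_names.back();
    }

private:
    std::deque<std::string> m_names;
};

// Builds a stat whose name is composed at run time inside a method.
inline Stat MakeReadStat(StatNameCache& cache, const std::string& owner, double value)
{
    std::string local = owner + "/Read"; // destroyed when this function returns
    // Returning Stat{local, value} would leave a dangling view. Interning the
    // name keeps its storage alive for as long as the cache exists.
    return Stat{cache.Intern(std::move(local)), value};
}
```

In this sketch the cache has to outlive every `Stat` that refers to it, which is the same ownership constraint the real fix needs to satisfy in whatever form it takes.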

The second issue (the crash) was preceded by the following assertion:

<2020-05-29T18:22:20:529+03> (System) - Trace::Assert ...\Code\Framework\AzCore\AzCore/std/containers/vector.h(585): (14664) 'const class std::unique_ptr<class AZ::IO::BlockCache,struct std::default_delete<class AZ::IO::BlockCache> > &__cdecl AZStd::vector<class std::unique_ptr<class AZ::IO::BlockCache,struct std::default_delete<class AZ::IO::BlockCache> >,class AZStd::allocator>::operator [](unsigned __int64) const'
<2020-05-29T18:22:20:529+03> (System) - AZStd::vector<>::at - position is out of range
<2020-05-29T18:22:20:530+03> (System) - ------------------------------------------------
<2020-05-29T18:23:37:320+03> (System) - 00007FF7B9793A27 (game01Launcher) : AZ::IO::FullFileDecompressor::CollectStatistics
<2020-05-29T18:23:37:321+03> (System) - 00007FFC2188AEE9 (CryRenderD3D11) : AZ::IO::Device::CollectStatistics
<2020-05-29T18:23:37:321+03> (System) - 00007FFC21890D54 (CryRenderD3D11) : AZ::IO::Device::OnTick
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B968B787 (game01Launcher) : AZ::Internal::EBusContainer<AZ::TickEvents,AZ::TickEvents,0,2>::Dispatcher<AZ::EBus<AZ::TickEvents,AZ::TickEvents> >::Broadcast<void (__cdecl AZ::TickEvents::*)(float,AZ::ScriptTimePoint) __ptr64,float & __ptr64,AZ::Sc
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B96E57C7 (game01Launcher) : AZ::ComponentApplication::Tick
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B9178292 (game01Launcher) : LumberyardLauncher::Run
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B9179A33 (game01Launcher) : WinMain
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B9AC2FCE (game01Launcher) : __scrt_common_main_seh
<2020-05-29T18:23:37:321+03> (System) - 00007FFCBE897974 (KERNEL32) : BaseThreadInitThunk
<2020-05-29T18:23:37:321+03> (System) - 00007FFCBFBCA271 (ntdll) : RtlUserThreadStart

This can happen because DedicatedCache::CollectStatistics() and DedicatedCache::DestroyDedicatedCache() can be called from different threads (the main and streamer threads, respectively). Although this happens rarely, access to DedicatedCache's internal structures should be protected with synchronization constructs.
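As a rough sketch of the kind of protection meant here, the example below guards a vector of caches with a mutex so that collecting statistics and destroying a cache cannot interleave. It uses std:: types rather than AZStd, and `DedicatedCacheSketch`, `BlockCacheStub` and all method signatures are assumptions made for illustration; the real DedicatedCache interface and the locking used in the commit may differ.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for AZ::IO::BlockCache; it only carries a counter.
struct BlockCacheStub
{
    std::size_t hits = 0;
};

// Hypothetical sketch: every access to m_caches takes the same mutex.
class DedicatedCacheSketch
{
public:
    void CreateDedicatedCache(std::size_t initialHits)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto cache = std::make_unique<BlockCacheStub>();
        cache->hits = initialHits;
        m_caches.push_back(std::move(cache));
    }

    // In the scenario above this would run on the streamer thread.
    void DestroyDedicatedCache(std::size_t index)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (index < m_caches.size())
        {
            m_caches.erase(m_caches.begin() + static_cast<std::ptrdiff_t>(index));
        }
    }

    // In the scenario above this would run on the main thread. Holding the
    // lock for the whole loop means the vector cannot shrink between the
    // size check and the element access.
    std::size_t CollectStatistics() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::size_t total = 0;
        for (std::size_t i = 0; i < m_caches.size(); ++i)
        {
            total += m_caches[i]->hits;
        }
        return total;
    }

private:
    mutable std::mutex m_mutex; // mutable so const readers can lock it
    std::vector<std::unique_ptr<BlockCacheStub>> m_caches;
};
```

Without the lock, the loop in `CollectStatistics` could read an index that `DestroyDedicatedCache` has just removed, which matches the out-of-range assertion in the log above.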

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

AMZN-nggieber commented 4 years ago

Thank you for the pull request!

AMZN-alexpete commented 4 years ago

@dkondrashkin thank you for submitting this fix! We've integrated the change and it will be available in a future version of Lumberyard.