wolfpld / tracy

Frame profiler
https://tracy.nereid.pl/
Other
8.61k stars 592 forks source link

Fix processing FrameData with no frames #798

Closed FedyuninV closed 1 month ago

FedyuninV commented 1 month ago

I'm trying to solve the issue #789

The #666 partially fixed the problem by allowing us to not use DeferItem at all. But it causes some stability issues: ignoring of unpaired MarkEnd leaves FrameData object with empty frames. But the server processing is built around the fact, that FrameData::frames are guaranteed to be filled with something (e.g. Worker::GetFrameRange goes out-of-bounds if FrameData::frames is empty). We've faced this while doing the following:

  1. Compile project with TRACY_ON_DEMAND
  2. Launch tracy-profiler and leave it waiting for 127.0.0.1
  3. Launch profiled project
  4. tracy-profiler crashes

This PR tries to close all possibly unsafe places (GetFrameBegin/End/Range shouldn't be called when FrameData::frames is empty). The usual stacktrace locally is View::DrawTimelineFrames( *fd ) -> Worker::GetFrameRange( *fd ) -> out-of-bounds on zitbegin--

The small synthetic example which crashes the master but seems stable with my fix. It should be placed in examples/test_delay. CMakeLists.txt

cmake_minimum_required(VERSION 3.17)
set(CMAKE_CXX_STANDARD 17)
project(test_delay)
add_executable(${PROJECT_NAME} main.cpp ../../public/TracyClient.cpp)
target_include_directories(${PROJECT_NAME} PUBLIC ../../public/tracy)
target_compile_definitions(${PROJECT_NAME} PUBLIC TRACY_ENABLE TRACY_ON_DEMAND)

main.cpp

#include <iostream>
#include <thread>
#include <chrono>

#include <../../public/tracy/Tracy.hpp>

constexpr const int N_RUNS=3;

constexpr static const char* runNames[N_RUNS] = {
    "Run0",
    "Run1",
    "Run2"
};

int main(int, char**)
{
    for (int i = 0; i < N_RUNS; ++i)
    {
        std::cout << "Run #" << i << " started" << std::endl;
        FrameMarkStart(runNames[i]);
        std::this_thread::sleep_for(std::chrono::seconds(5));
        FrameMarkEnd(runNames[i]);
        std::cout << "Run #" << i << " finished" << std::endl;
    }

    return 0;

P.S. I thought about a way to not let FrameData like these into processing in the first place (filtering them somehow) but didn't find a proper place in the code.

wolfpld commented 1 month ago

Please see if 216a4b766 and afe5c8dfb fixes things for you. These should mend the invalid behavior introduced by #666. Empty FrameData objects will no longer be created, which removes the need for your changes.

FedyuninV commented 1 month ago

Please see if 216a4b7 and afe5c8d fixes things for you. These should mend the invalid behavior introduced by #666. Empty FrameData objects will no longer be created, which removes the need for your changes.

I'm unable to check the full-scale app right now, but my small sample still fails (not every launch) and now in another place image the failed assertion is assert( td.samples.size() > td.ghostIdx ); I don't fully understand what causes this assertion to break at the moment.

Could the fact that now you don't immediately call Query when receiving frame with previously unknown name could cause some problems? (I haven't completely understood the protocol of your app yet). But I love the idea of not creating problematic FrameData in the first plalce.

P.S. what about the line I commented before (zrange.second - 1)? While testing my fix I've faced an OOB there, and in other places you usually don't rely on frame zrange.second to exist.

wolfpld commented 1 month ago

assert( td.samples.size() > td.ghostIdx );

This can be ignored. (Not really, but a fix is non-trivial and you get the warning in the UI that data is invalid until a save+load cycle.)

P.S. what about the line I commented before (zrange.second - 1)? While testing my fix I've faced an OOB there, and in other places you usually don't rely on frame zrange.second to exist.

Can you provide output of an asan-enabled build that shows this is a problem?

FedyuninV commented 1 month ago

assert( td.samples.size() > td.ghostIdx );

This can be ignored. (Not really, but a fix is non-trivial and you get the warning in the UI that data is invalid until a save+load cycle.)

Agreed, I've reproduced the same problem on my branch too, so it's probably unrelated with the problem in hand.

Can you provide output of an asan-enabled build that shows this is a problem?

Got it on a very same sample

=================================================================
==13108==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x1227e93e798c at pc 0x7ff7436e1176 bp 0x008ee88f9b80 sp 0x008ee88f9b88
READ of size 8 at 0x1227e93e798c thread T0
==13108==WARNING: Failed to use and restart external symbolizer!
'tracy-profiler.exe' (Win32): Loaded 'C:\Windows\System32\dbghelp.dll'. 
    #0 0x7ff7436e1175 in tracy::Worker::GetFrameEnd D:\Coding\tracy\server\TracyWorker.cpp:2124
    #1 0x7ff7435ea7bb in tracy::View::DrawTimelineFrames D:\Coding\tracy\profiler\src\profiler\TracyView_FrameTimeline.cpp:98
    #2 0x7ff74364d339 in tracy::View::DrawTimeline D:\Coding\tracy\profiler\src\profiler\TracyView_Timeline.cpp:315
    #3 0x7ff74357eee4 in tracy::View::DrawImpl D:\Coding\tracy\profiler\src\profiler\TracyView.cpp:1050
    #4 0x7ff74358747c in tracy::View::Draw D:\Coding\tracy\profiler\src\profiler\TracyView.cpp:627
    #5 0x7ff7434b41c6 in DrawContents D:\Coding\tracy\profiler\src\main.cpp:1093
    #6 0x7ff74294f5ef in std::invoke<void (__cdecl*&)(void)> C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\type_traits:1752
    #7 0x7ff742946bc9 in std::_Func_impl_no_alloc<void (__cdecl*)(void),void>::_Do_call C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\functional:839
    #8 0x7ff74295aa94 in std::_Func_class<void>::operator() C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\functional:883
    #9 0x7ff7434e6250 in Backend::Run D:\Coding\tracy\profiler\src\BackendGlfw.cpp:152
    #10 0x7ff7434b8637 in main D:\Coding\tracy\profiler\src\main.cpp:393
    #11 0x7ff7434e543e in WinMain D:\Coding\tracy\profiler\src\winmain.cpp:46
    #12 0x7ff74348f7c1 in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:102
    #13 0x7ff74348f71d in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #14 0x7ff74348f5dd in __scrt_common_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:330
    #15 0x7ff74348f83d in WinMainCRTStartup D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_winmain.cpp:16
    #16 0x7ffca2f17033 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x180017033)
    #17 0x7ffca4082650 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x180052650)

0x1227e93e798c is located 8 bytes to the right of 20-byte region [0x1227e93e7970,0x1227e93e7984)
allocated by thread T13 here:
    #0 0x7ffc12103c18 in _asan_wrap_GlobalSize+0x60544 (C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\HostX64\x64\clang_rt.asan_dynamic-x86_64.dll+0x180063c18)
    #1 0x7ff743037eac in tracy::Vector<tracy::FrameEvent>::AllocMore D:\Coding\tracy\server\TracyVector.hpp:295
    #2 0x7ff7430083ee in tracy::Vector<tracy::FrameEvent>::push_back D:\Coding\tracy\server\TracyVector.hpp:134
    #3 0x7ff742fb41dd in tracy::Worker::ProcessFrameMarkStart D:\Coding\tracy\server\TracyWorker.cpp:5052
    #4 0x7ff742faf4ba in tracy::Worker::Process D:\Coding\tracy\server\TracyWorker.cpp:4428
    #5 0x7ff742faa8ac in tracy::Worker::DispatchProcess D:\Coding\tracy\server\TracyWorker.cpp:3253
    #6 0x7ff7436c0d1b in tracy::Worker::Exec D:\Coding\tracy\server\TracyWorker.cpp:2789
    #7 0x7ff7436d0347 in `tracy::Worker::Worker'::`2'::<lambda_1>::operator() D:\Coding\tracy\server\TracyWorker.cpp:303
    #8 0x7ff7431a40d2 in std::invoke<`tracy::Worker::Worker'::`2'::<lambda_1> > C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\type_traits:1752
    #9 0x7ff743171b9e in std::thread::_Invoke<std::tuple<`tracy::Worker::Worker'::`2'::<lambda_1> >,0> C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\thread:55
    #10 0x7ffc2f94300f in register_onexit_function+0x12f (C:\Windows\SYSTEM32\ucrtbased.dll+0x1800b300f)
    #11 0x7ffc121137ce in _asan_wrap_GlobalSize+0x700fa (C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\HostX64\x64\clang_rt.asan_dynamic-x86_64.dll+0x1800737ce)
    #12 0x7ffca2f17033 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x180017033)
    #13 0x7ffca4082650 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x180052650)

Thread T13 created by T0 here:
    #0 0x7ffc121159a7 in _asan_wrap_GlobalSize+0x722d3 (C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\HostX64\x64\clang_rt.asan_dynamic-x86_64.dll+0x1800759a7)
    #1 0x7ffc2f94387e in beginthreadex+0x14e (C:\Windows\SYSTEM32\ucrtbased.dll+0x1800b387e)
    #2 0x7ff7430e161c in std::thread::_Start<`tracy::Worker::Worker'::`2'::<lambda_1> > C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\thread:73
    #3 0x7ff74307a4ac in std::thread::thread<`tracy::Worker::Worker'::`2'::<lambda_1>,0> C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\thread:88
    #4 0x7ff7436e409c in tracy::Worker::Worker D:\Coding\tracy\server\TracyWorker.cpp:303
    #5 0x7ff743587ebf in tracy::View::View D:\Coding\tracy\profiler\src\profiler\TracyView.cpp:39
    #6 0x7ff74293d6e0 in std::make_unique<tracy::View,void (__cdecl&)(std::function<void __cdecl(void)> const &,bool),char (&)[1024],unsigned short &,ImFont * &,ImFont * &,ImFont * &,void (__cdecl&)(char const *),void (__cdecl&)(float),void (__cdecl&)(void),tracy::Config &,0> C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\memory:3477
    #7 0x7ff7434b2c49 in DrawContents D:\Coding\tracy\profiler\src\main.cpp:868
    #8 0x7ff74294f5ef in std::invoke<void (__cdecl*&)(void)> C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\type_traits:1752
    #9 0x7ff742946bc9 in std::_Func_impl_no_alloc<void (__cdecl*)(void),void>::_Do_call C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\functional:839
    #10 0x7ff74295aa94 in std::_Func_class<void>::operator() C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\functional:883
    #11 0x7ff7434e6250 in Backend::Run D:\Coding\tracy\profiler\src\BackendGlfw.cpp:152
    #12 0x7ff7434b8637 in main D:\Coding\tracy\profiler\src\main.cpp:393
    #13 0x7ff7434e543e in WinMain D:\Coding\tracy\profiler\src\winmain.cpp:46
    #14 0x7ff74348f7c1 in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:102
    #15 0x7ff74348f71d in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #16 0x7ff74348f5dd in __scrt_common_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:330
    #17 0x7ff74348f83d in WinMainCRTStartup D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_winmain.cpp:16
    #18 0x7ffca2f17033 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x180017033)
    #19 0x7ffca4082650 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x180052650)

SUMMARY: AddressSanitizer: heap-buffer-overflow D:\Coding\tracy\server\TracyWorker.cpp:2124 in tracy::Worker::GetFrameEnd
Shadow bytes around the buggy address:
  0x0466e64fcee0: fa fa fa fa fa fa 00 00 00 00 fa fa fa fa fa fa
  0x0466e64fcef0: fa fa 00 00 00 00 fa fa fa fa fa fa fa fa 00 00
  0x0466e64fcf00: 00 04 fa fa fa fa fa fa fa fa 00 00 00 04 fa fa
  0x0466e64fcf10: fa fa fa fa fa fa fd fd fd fd fa fa fa fa fa fa
  0x0466e64fcf20: fa fa 00 00 00 00 fa fa fa fa fa fa fa fa 00 00
=>0x0466e64fcf30: 04[fa]fa fa fa fa fa fa fa fa 00 00 00 fa fa fa
  0x0466e64fcf40: fa fa fa fa fa fa fd fd fd fa fa fa fa fa fa fa
  0x0466e64fcf50: fa fa fd fd fd fa fa fa fa fa fa fa fa fa fd fd
  0x0466e64fcf60: fd fa fa fa fa fa fa fa fa fa fd fd fd fa fa fa
  0x0466e64fcf70: fa fa fa fa fa fa fd fd fd fa fa fa fa fa fa fa
  0x0466e64fcf80: fa fa fd fd fd fa fa fa fa fa fa fa fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
Address Sanitizer Error: Heap buffer overflow

Address Sanitizer Error: Heap buffer overflow

==13108==ABORTING
A breakpoint instruction (__debugbreak() statement or a similar call) was executed in tracy-profiler.exe.

That was the problem: if frame is not continuous, GetFrameEnd immediately tries to get fd.frames[idx].end.

wolfpld commented 1 month ago

Should be fixed by d3da7dce.

FedyuninV commented 1 month ago

Should be fixed by d3da7dc.

Small example seems stable. I won't be able to test it on full app till Monday.

FedyuninV commented 1 month ago

Your fix worked smoothly. Closing both PR and issure.

Thanks a lot!