ngscopeclient / scopehal-apps

ngscopeclient and other client applications for libscopehal.
https://www.ngscopeclient.org/
BSD 3-Clause "New" or "Revised" License

Unexpected closure of LAN socket causes crash #694

Open DanielO opened 4 months ago

DanielO commented 4 months ago

I wrote a [buggy] USB TMC to LAN bridge, and if it closes the connection unexpectedly, scopehal crashes.

App output:

[mvk-info] Created 2 swapchain images with size (1280, 720) and contents scale 1.0 in layer (null) (0x6000012c4e10) on screen VA2719-2K.
Warning: Socket read failed (errno=35, Resource temporarily unavailable)
Warning: Bad IDN response
    Warning: Socket read failed (errno=35, Resource temporarily unavailable)
    Warning: Socket read failed (errno=35, Resource temporarily unavailable)
    ERROR: SCPI resync failed, firmware is probably in a bad state. Try rebooting the scope.
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

Crash dump:

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib              0x7ff80fc987a2 __pthread_kill + 10
1   libsystem_pthread.dylib             0x7ff80fcd0f30 pthread_kill + 262
2   libsystem_c.dylib                   0x7ff80fbefa49 abort + 126
3   libc++abi.dylib                     0x7ff80fc89c72 abort_message + 241
4   libc++abi.dylib                     0x7ff80fc7be00 demangling_terminate_handler() + 240
5   libobjc.A.dylib                     0x7ff80f91a476 _objc_terminate() + 104
6   libc++abi.dylib                     0x7ff80fc890cb std::__terminate(void (*)()) + 6
7   libc++abi.dylib                     0x7ff80fc89086 std::terminate() + 54
8   libscopehal.dylib                      0x1063430d3 PipelineCacheManager::~PipelineCacheManager() + 179 (PipelineCacheManager.cpp:65)
9   libscopehal.dylib                      0x10634193d PipelineCacheManager::~PipelineCacheManager() + 8 (PipelineCacheManager.cpp:64) [inlined]
10  libscopehal.dylib                      0x10634193d std::__1::default_delete<PipelineCacheManager>::operator()[abi:ue170006](PipelineCacheManager*) const + 8 (unique_ptr.h:68) [inlined]
11  libscopehal.dylib                      0x10634193d std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::reset[abi:ue170006](PipelineCacheManager*) + 25 (unique_ptr.h:300) [inlined]
12  libscopehal.dylib                      0x10634193d std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006]() + 25 (unique_ptr.h:266) [inlined]
13  libscopehal.dylib                      0x10634193d std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006]() + 29 (unique_ptr.h:266)
14  libsystem_c.dylib                   0x7ff80fb9a5b1 __cxa_finalize_ranges + 402
15  libsystem_c.dylib                   0x7ff80fb9a3d2 exit + 35
16  libscopehal.dylib                      0x1062e5e8c TektronixOscilloscope::ResynchronizeSCPI() + 1500 (TektronixOscilloscope.cpp:1694)
17  libscopehal.dylib                      0x1062e70ca TektronixOscilloscope::TektronixOscilloscope(SCPITransport*) + 1098 (TektronixOscilloscope.cpp:78)
18  libscopehal.dylib                      0x106111b42 TektronixOscilloscope::CreateInstance(SCPITransport*) + 34 (TektronixOscilloscope.h:384)
19  libscopehal.dylib                      0x1061b73c4 Oscilloscope::CreateOscilloscope(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, SCPITransport*) + 84 (Oscilloscope.cpp:98)
20  ngscopeclient                          0x10491d5fe AddScopeDialog::DoConnect() + 590 (AddScopeDialog.cpp:83)
21  ngscopeclient                          0x10491b205 AddInstrumentDialog::DoRender() + 2341 (AddInstrumentDialog.cpp:123)
22  ngscopeclient                          0x10492720d Dialog::Render() + 317 (Dialog.cpp:80)
23  ngscopeclient                          0x10495b137 MainWindow::RenderUI() + 2007 (MainWindow.cpp:589)
24  ngscopeclient                          0x104a63bd3 VulkanWindow::Render() + 339 (VulkanWindow.cpp:458)
25  ngscopeclient                          0x104aa5c32 main + 850 (main.cpp:127)
26  dyld                                0x7ff80f946386 start + 1942

azonenberg commented 4 months ago

Interesting. Can you do "catch throw" in gdb and get a backtrace at the point that exception is thrown?

DanielO commented 4 months ago

I'm using lldb so I used 'break set -E C++':

Process 20709 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00007ff80fc8b9ea libc++abi.dylib`__cxa_throw
libc++abi.dylib`:
->  0x7ff80fc8b9ea <+0>: pushq  %rbp
    0x7ff80fc8b9eb <+1>: movq   %rsp, %rbp
    0x7ff80fc8b9ee <+4>: pushq  %r15
    0x7ff80fc8b9f0 <+6>: pushq  %r14
Target 0: (ngscopeclient) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00007ff80fc8b9ea libc++abi.dylib`__cxa_throw
    frame #1: 0x00007ff80fc15b83 libc++.1.dylib`std::__1::__throw_system_error(int, char const*) + 77
    frame #2: 0x00007ff80fc0b87d libc++.1.dylib`std::__1::mutex::lock() + 29
    frame #3: 0x0000000101dd7fc0 libscopehal.dylib`LogDebugTrace(char const*, char const*, ...) [inlined] std::__1::lock_guard<std::__1::mutex>::lock_guard[abi:ue170006](this=<unavailable>, __m=<unavailable>) at lock_guard.h:35:10 [opt]
    frame #4: 0x0000000101dd7fb4 libscopehal.dylib`LogDebugTrace(char const*, char const*, ...) [inlined] std::__1::lock_guard<std::__1::mutex>::lock_guard[abi:ue170006](this=<unavailable>, __m=<unavailable>) at lock_guard.h:34:19 [opt]
    frame #5: 0x0000000101dd7fb4 libscopehal.dylib`LogDebugTrace(function="void PipelineCacheManager::SaveToDisk()", format="Saving cache\n") at log.cpp:303:20 [opt]
    frame #6: 0x0000000101d42fff libscopehal.dylib`PipelineCacheManager::SaveToDisk(this=0x00006000009e0500) at PipelineCacheManager.cpp:297:2 [opt]
    frame #7: 0x0000000101d42f13 libscopehal.dylib`PipelineCacheManager::~PipelineCacheManager(this=<unavailable>) at PipelineCacheManager.cpp:65:2 [opt]
    frame #8: 0x0000000101d4181d libscopehal.dylib`std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006]() [inlined] PipelineCacheManager::~PipelineCacheManager(this=0x00006000009e0500) at PipelineCacheManager.cpp:64:1 [opt]
    frame #9: 0x0000000101d41815 libscopehal.dylib`std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006]() [inlined] std::__1::default_delete<PipelineCacheManager>::operator()[abi:ue170006](this=<unavailable>, __ptr=0x00006000009e0500) const at unique_ptr.h:68:5 [opt]
    frame #10: 0x0000000101d41815 libscopehal.dylib`std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006]() [inlined] std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::reset[abi:ue170006](this=<unavailable>, __p=0x0000000000000000) at unique_ptr.h:300:7 [opt]
    frame #11: 0x0000000101d41804 libscopehal.dylib`std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006]() [inlined] std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006](this=<unavailable>) at unique_ptr.h:266:75 [opt]
    frame #12: 0x0000000101d41804 libscopehal.dylib`std::__1::unique_ptr<PipelineCacheManager, std::__1::default_delete<PipelineCacheManager>>::~unique_ptr[abi:ue170006](this=<unavailable>) at unique_ptr.h:266:73 [opt]
    frame #13: 0x00007ff80fb9a5b1 libsystem_c.dylib`__cxa_finalize_ranges + 402
    frame #14: 0x00007ff80fb9a3d2 libsystem_c.dylib`exit + 35
    frame #15: 0x0000000101ce521c libscopehal.dylib`TektronixOscilloscope::ResynchronizeSCPI(this=0x00007fbabc06ca00) at TektronixOscilloscope.cpp:1700:3 [opt]
    frame #16: 0x0000000101ce6472 libscopehal.dylib`TektronixOscilloscope::TektronixOscilloscope(this=0x00007fbabc06ca00, transport=<unavailable>) at TektronixOscilloscope.cpp:79:2 [opt]
    frame #17: 0x0000000101b10a82 libscopehal.dylib`TektronixOscilloscope::CreateInstance(transport=0x00007fbabad2c510) at TektronixOscilloscope.h:385:2 [opt]
    frame #18: 0x0000000101bb6304 libscopehal.dylib`Oscilloscope::CreateOscilloscope(driver="tektronix", transport=0x00007fbabad2c510) at Oscilloscope.cpp:98:10 [opt]
    frame #19: 0x0000000100431cf9 ngscopeclient`Session::PreLoadOscilloscope(this=0x00007fbabb01b250, version=2, node=0x00007ff7bfefe848, online=<unavailable>) at Session.cpp:1038:13 [opt]
    frame #20: 0x00000001003fd704 ngscopeclient`Session::PreLoadInstruments(this=0x00007fbabb01b250, version=2, node=0x00007ff7bfefe918, online=true) at Session.cpp:880:8 [opt]
    frame #21: 0x00000001003fcc0b ngscopeclient`Session::PreLoadFromYaml(this=0x00007fbabb01b250, node=0x0000600002d8bcc0, (null)=<unavailable>, online=true) at Session.cpp:305:6 [opt]
    frame #22: 0x0000000100377f17 ngscopeclient`MainWindow::PreLoadSessionFromYaml(this=0x00007fbabb01ae00, node=0x0000600002d8bcc0, dataDir="@\xb5\xa0G\x90q\0\0\xfd\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", online=true) at MainWindow.cpp:2106:16 [opt]
    frame #23: 0x00000001003767ec ngscopeclient`MainWindow::DoOpenFile(this=0x00007fbabb01ae00, sessionPath="/Users/oconnd1/projects/scopehal-apps/build/test2.scopesession", online=true) at MainWindow.cpp:2032:7 [opt]
    frame #24: 0x000000010036ab66 ngscopeclient`MainWindow::RenderFileBrowser(this=0x00007fbabb01ae00) at MainWindow.cpp:1986:6 [opt]
    frame #25: 0x00000001003582b0 ngscopeclient`MainWindow::RenderUI(this=0x00007fbabb01ae00) at MainWindow.cpp:614:3 [opt]
    frame #26: 0x0000000100460bd3 ngscopeclient`VulkanWindow::Render(this=0x00007fbabb01ae00) at VulkanWindow.cpp:458:2 [opt]
    frame #27: 0x00000001004a2c32 ngscopeclient`main(argc=<unavailable>, argv=<unavailable>) at main.cpp:127:18 [opt]
    frame #28: 0x00007ff80f946386 dyld`start + 1942

bvernoux commented 4 months ago

It is not a "crash". Regarding the first log:

...
15  libsystem_c.dylib                   0x7ff80fb9a3d2 exit + 35
16  libscopehal.dylib                      0x1062e5e8c TektronixOscilloscope::ResynchronizeSCPI() + 1500 (TektronixOscilloscope.cpp:1694)
...

See https://github.com/ngscopeclient/scopehal/blob/20bab5bd204b80e4f142180648e4f2c3177e97d3/scopehal/TektronixOscilloscope.cpp#L1693C1-L1694C11: something has gone wrong and left things in a bad state, and the driver then calls exit(1) ...

        LogError("SCPI resync failed, firmware is probably in a bad state. Try rebooting the scope.\n");
        exit(1);

Then, in the second log:

...
    frame #14: 0x00007ff80fb9a3d2 libsystem_c.dylib`exit + 35
    frame #15: 0x0000000101ce521c libscopehal.dylib`TektronixOscilloscope::ResynchronizeSCPI(this=0x00007fbabc06ca00) at TektronixOscilloscope.cpp:1700:3 [opt]
...

I suspect it is the same error; the strange thing is that the second trace points at line 1700 in TektronixOscilloscope.cpp ...

Unfortunately I cannot help, as I do not have a Tektronix oscilloscope to test with. Please add more details: the exact Tektronix oscilloscope model and its firmware version (which could be buggy), plus the full error log output.

azonenberg commented 4 months ago

It is a crash, but a crash during exit (we're getting some weirdness with an unhandled exception in the PipelineCacheManager destructor).
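
For illustration: C++ destructors are implicitly noexcept, so an exception that escapes one ends in std::terminate(), which matches the "terminating due to uncaught exception" line in the first log. Below is a minimal sketch of a destructor that contains such an exception. CacheOwner and LogThatMayThrow are hypothetical stand-ins, not the actual PipelineCacheManager or logging code, and the sketch does not address why the mutex lock fails in the first place.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in for a logging call that can throw
// (for example, if the mutex it locks is no longer valid).
static void LogThatMayThrow(bool fail)
{
    if(fail)
        throw std::runtime_error("mutex lock failed");
    std::puts("Saving cache");
}

// Hypothetical stand-in for an object that saves state when destroyed.
class CacheOwner
{
public:
    explicit CacheOwner(bool failOnSave)
        : m_failOnSave(failOnSave)
    {}

    ~CacheOwner()
    {
        // Keep any exception inside the destructor. If it escaped,
        // std::terminate() would be called.
        try
        {
            SaveToDisk();
        }
        catch(const std::exception& e)
        {
            std::fprintf(stderr, "Cache save skipped: %s\n", e.what());
        }
    }

    void SaveToDisk()
    {
        LogThatMayThrow(m_failOnSave);
    }

private:
    bool m_failOnSave;
};

// Returns true if control reaches the end, i.e. destroying the
// object did not terminate the process.
bool DestroyWithFailingSave()
{
    {
        CacheOwner owner(true);
    }
    return true;
}
```

With the try/catch removed from the destructor, calling DestroyWithFailingSave() would terminate the process instead of returning.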

In either case, the root cause of the exit is the Tek driver panicking when it loses contact with the scope. We should probably make this more graceful in the near term (ultimately falling back to a true offline mode as per #372).
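
A minimal sketch of what "more graceful" could look like, assuming the driver constructor reports a failed resync by throwing instead of calling exit(1), and the caller turns that into an error it can show to the user. FakeTransport, FakeScopeDriver, and TryConnect are hypothetical names for illustration only; they are not the libscopehal API, and this is not the offline-mode design from #372.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical transport: 'connected' models whether the peer is still there.
struct FakeTransport
{
    bool connected;

    // Returns false when the peer has gone away, instead of aborting.
    bool ReadReply(std::string& out) const
    {
        if(!connected)
            return false;
        out = "FAKE,SCOPE,0,0";
        return true;
    }
};

// Hypothetical driver: a failed resync throws rather than exiting the process.
class FakeScopeDriver
{
public:
    explicit FakeScopeDriver(const FakeTransport& transport)
        : m_transport(transport)
    {
        if(!Resynchronize())
            throw std::runtime_error("SCPI resync failed");
    }

    const std::string& GetIdentity() const
    {
        return m_identity;
    }

private:
    // Try a few times to get a reply, then give up and report failure.
    bool Resynchronize()
    {
        for(int attempt = 0; attempt < 3; attempt++)
        {
            if(m_transport.ReadReply(m_identity))
                return true;
        }
        return false;
    }

    const FakeTransport& m_transport;
    std::string m_identity;
};

// Caller side: returns nullptr and fills 'error' on failure,
// so the application keeps running and can report the problem.
std::unique_ptr<FakeScopeDriver> TryConnect(const FakeTransport& transport, std::string& error)
{
    try
    {
        return std::make_unique<FakeScopeDriver>(transport);
    }
    catch(const std::runtime_error& e)
    {
        error = e.what();
        return nullptr;
    }
}
```

The point of the design is that the decision to quit moves out of the driver: a dialog or session loader calling something like TryConnect() can show the error and carry on, and no destructors run from inside exit() while the rest of the application is still live.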