luigifcruz / CyberEther

Multi-platform GPU-accelerated interface for compute-intensive pipelines. Radio, the final frontier.
MIT License

Crashes on node delete #61

Closed electron271 closed 8 months ago

electron271 commented 8 months ago

using the cyberether-git aur package

JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: FFT | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Multiply | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Multiply | CPU | F32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Invert | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: MultiplyConstant | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: MultiplyConstant | CPU | F32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Lineplot | CPU | F32
JETSTREAM [DEBUG] | [COMPOSITOR] Refreshing interface state.
JETSTREAM [DEBUG] | [INSTANCE] Started.
JETSTREAM [DEBUG] | [INSTANCE] Building default viewport and render.
JETSTREAM [DEBUG] | Initializing Vulkan backend.
JETSTREAM [WARN]  | [VULKAN] Couldn't find validation layers. Disabling Vulkan debug.
JETSTREAM [DEBUG] | [VULKAN] Supported required instance extensions: ["VK_KHR_surface", "VK_KHR_xcb_surface"]
JETSTREAM [DEBUG] | [VULKAN] Supported required device extensions: ["VK_KHR_swapchain"]
JETSTREAM [DEBUG] | [VULKAN] Supported optional device extensions: {"VK_EXT_external_memory_host", "VK_KHR_external_memory_fd"}
JETSTREAM [INFO]  | -----------------------------------------------------
JETSTREAM [INFO]  | Jetstream Heterogeneous Backend [VULKAN]
JETSTREAM [INFO]  | -----------------------------------------------------
JETSTREAM [INFO]  | Device Name:     AMD Radeon RX 7600 (RADV NAVI33)
JETSTREAM [INFO]  | Device Type:     DISCRETE
JETSTREAM [INFO]  | API Version:     1.3.267
JETSTREAM [INFO]  | Unified Memory:  NO
JETSTREAM [INFO]  | Processor Count: 20
JETSTREAM [INFO]  | Device Memory:   8.00 GB
JETSTREAM [INFO]  | Staging Buffer:  64.00 MB
JETSTREAM [INFO]  | -----------------------------------------------------
JETSTREAM [DEBUG] | [VULKAN] Creating GLFW viewport.
JETSTREAM [DEBUG] | [VULKAN] Swap mailbox presentation mode is available.
JETSTREAM [DEBUG] | [VULKAN] Creating window.
JETSTREAM [DEBUG] | [VULKAN] Creating ImGui.
JETSTREAM [DEBUG] | [COMPOSITOR] Loading assets.
JETSTREAM [DEBUG] | [VULKAN] Creating texture.
JETSTREAM [DEBUG] | [VULKAN] Creating texture.
JETSTREAM [DEBUG] | [VULKAN] Swap mailbox presentation mode is available.
JETSTREAM [INFO]  | [FLOWGRAPH] Creating flowgraph in-memory.
JETSTREAM [DEBUG] | [COMPOSITOR] Running graph auto-route.
JETSTREAM [DEBUG] | [INSTANCE] Adding new block 'cas0'.
JETSTREAM [DEBUG] | [INSTANCE] Adding new module 'cas0-cast'.
JETSTREAM [DEBUG] | Initializing Cast module.
JETSTREAM [ERROR] | Input is empty during initialization.
JETSTREAM [DEBUG] | [INSTANCE] Module 'cas0-cast' is incomplete.
JETSTREAM [DEBUG] | [COMPOSITOR] Adding block 'cas0'.
JETSTREAM [DEBUG] | [COMPOSITOR] Refreshing interface state.
/usr/include/c++/13.2.1/optional:477: constexpr _Tp& std::_Optional_base_impl<_Tp, _Dp>::_M_get() [with _Tp = Jetstream::Locale; _Dp = std::_Optional_base<Jetstream::Locale, false, false>]: Assertion 'this->_M_is_engaged()' failed.
Aborted (core dumped)

output of cyberether -v

JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: FFT | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Multiply | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Multiply | CPU | F32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Invert | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: MultiplyConstant | CPU | CF32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: MultiplyConstant | CPU | F32
JETSTREAM [DEBUG] | [BENCHMARK] Adding benchmark: Lineplot | CPU | F32
CyberEther v1.0.0-release

luigifcruz commented 8 months ago

That's new for me! I'll take a look.

luigifcruz commented 8 months ago

Do you mind sharing a backtrace of this issue? Just to make sure I properly patched the bug.

$ gdb ./cyberether
> r
# wait for it to crash
> bt

electron271 commented 8 months ago

yeah 1 second

electron271 commented 8 months ago

had to recompile it since the aur package uses a release build; i was unable to reproduce the crash when compiling manually

https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=cyberether-git

also adding @armv8-a as they made the PKGBUILD

git clone https://github.com/luigifcruz/CyberEther.git
cd CyberEther/
paru glslang
paru glfw
meson setup -Dbuildtype=debugoptimized build && cd build
ninja -C build # didn't work
ninja build # didn't work
ninja # was the command i used
ls
gdb ./cyberether

electron271 commented 8 months ago

running gdb on aur package built with debug mode

[electron271@saturn cyberether-git]$ gdb cyberether
GNU gdb (GDB) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from cyberether...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) n
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
(No debugging symbols found in cyberether)
(gdb) r
Starting program: /usr/bin/cyberether 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffe92006c0 (LWP 1555747)]
JETSTREAM [INFO]  | -----------------------------------------------------
JETSTREAM [INFO]  | Jetstream Heterogeneous Backend [VULKAN]
JETSTREAM [INFO]  | -----------------------------------------------------
JETSTREAM [INFO]  | Device Name:     AMD Radeon RX 7600 (RADV NAVI33)
JETSTREAM [INFO]  | Device Type:     DISCRETE
JETSTREAM [INFO]  | API Version:     1.3.267
JETSTREAM [INFO]  | Unified Memory:  NO
JETSTREAM [INFO]  | Processor Count: 20
JETSTREAM [INFO]  | Device Memory:   8.00 GB
JETSTREAM [INFO]  | Staging Buffer:  64.00 MB
JETSTREAM [INFO]  | -----------------------------------------------------
[New Thread 0x7fffdfe006c0 (LWP 1555748)]
[New Thread 0x7fffda0006c0 (LWP 1555749)]
[New Thread 0x7fffd96006c0 (LWP 1555750)]
[Thread 0x7fffdfe006c0 (LWP 1555748) exited]
[New Thread 0x7fffdfe006c0 (LWP 1555751)]
[New Thread 0x7fffd8c006c0 (LWP 1555752)]
JETSTREAM [INFO]  | [FLOWGRAPH] Creating flowgraph in-memory.
[Thread 0x7fffd8c006c0 (LWP 1555752) exited]
[New Thread 0x7fffd8c006c0 (LWP 1555753)]
JETSTREAM [ERROR] | Input is empty during initialization.
[Thread 0x7fffd8c006c0 (LWP 1555753) exited]
[New Thread 0x7fffd8c006c0 (LWP 1555754)]
/usr/include/c++/13.2.1/optional:477: constexpr _Tp& std::_Optional_base_impl<_Tp, _Dp>::_M_get() [with _Tp = Jetstream::Locale; _Dp = std::_Optional_base<Jetstream::Locale, false, false>]: Assertion 'this->_M_is_engaged()' failed.

Thread 9 "cyberether" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffd8c006c0 (LWP 1555754)]
0x00007ffff70ac83c in ?? () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff70ac83c in  () at /usr/lib/libc.so.6
#1  0x00007ffff705c668 in raise () at /usr/lib/libc.so.6
#2  0x00007ffff70444b8 in abort () at /usr/lib/libc.so.6
#3  0x00007ffff72dd3b2 in std::__glibcxx_assert_fail(char const*, int, char const*, char const*) (file=<optimized out>, line=<optimized out>, function=<optimized out>, condition=<optimized out>)
    at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:61
#4  0x00007ffff7768d30 in  () at /usr/lib/libjetstream.so
#5  0x00007ffff72e1943 in std::execute_native_thread_routine(void*) (__p=0x7fffd4019330) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104
#6  0x00007ffff70aa9eb in  () at /usr/lib/libc.so.6
#7  0x00007ffff712e7cc in  () at /usr/lib/libc.so.6
(gdb)

luigifcruz commented 8 months ago

That's interesting. GDB didn't help much because the binary has no debugging symbols. This is probably because the AUR PKGBUILD configured it that way. This is also likely why it's not crashing when you compile manually.

Does this happen with every block or just some in particular?

electron271 commented 8 months ago

every block, it seems

something to note is that the installed binary is reported as stripped:

/usr/bin/cyberether: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1aa7383a2c5ad0f212a1ebc72a81669b1feaa120, for GNU/Linux 4.4.0, stripped

going to rebuild with debug instead of debugoptimized

armv8-a commented 8 months ago

I am unable to reproduce this with the AUR PKGBUILD, but it's likely something to do with my building/packaging since they can't reproduce it with the manual build

luigifcruz commented 8 months ago

There is a tiny chance of this being caused by some peculiarity of the Radeon driver and CyberEther's Vulkan renderer. Unfortunately, I don't have similar hardware to verify. But I'm writing a patch with stronger assertions to at least identify the root cause of this problem.
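
For readers wondering what a "stronger assertion" around an optional might look like, here is a minimal sketch. It is not the patch from the repository: `expectEngaged`, the label string, and the demo function are all hypothetical names made up for illustration. The idea is only that a labeled check names the call site, whereas the libstdc++ assertion in the log above only names the type.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from CyberEther): returns the contained value,
// or throws an error that names the call site when the optional is empty.
template <typename T>
const T& expectEngaged(const std::optional<T>& value, const std::string& label) {
    if (!value.has_value()) {
        throw std::logic_error("empty optional accessed: " + label);
    }
    return *value;
}

// Small demo: accessing a disengaged optional yields a labeled error
// instead of an unlabeled abort.
bool demoReportsEmptyAccess() {
    std::optional<int> nodeId;  // disengaged on purpose
    try {
        (void)expectEngaged(nodeId, "node delete");
    } catch (const std::logic_error& e) {
        return std::string(e.what()) == "empty optional accessed: node delete";
    }
    return false;
}
```

Note that an exception thrown inside a worker thread still has to be caught there, otherwise the program terminates anyway; the sketch only improves the diagnostic.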

luigifcruz commented 8 months ago

I just searched the GitHub global issue tracker, and there are a lot of recent bug reports with the same crash signature. I'm leaning towards something being wrong with libstdc++.

luigifcruz commented 8 months ago

Nope, I screwed up while doing a refactor some months ago. I was using a std::optional value in an asynchronous call, and the main thread subsequently cleared it. Anyway, it should be fixed in the v1.0.0-alpha2 branch.
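
The pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual CyberEther code: `Locale` here is a stand-in struct, and `Compositor`, `pending`, `removeBlockAsync`, and `demoSnapshotSurvivesReset` are hypothetical names. The racy variant is left as a comment; the code that runs shows one common way around it, which is to copy the value out of the optional before handing work to another thread.

```cpp
#include <cassert>
#include <future>
#include <optional>
#include <string>

// Stand-in for Jetstream::Locale; the real type is not shown in this thread.
struct Locale {
    std::string blockId;
};

// Hypothetical owner of an optional that the main thread may clear.
struct Compositor {
    std::optional<Locale> pending;

    // Racy pattern (intentionally not executed): the task reads `pending`
    // later, possibly after the main thread has already called reset().
    //   std::async(std::launch::async, [this] { return pending.value().blockId; });

    // Safer pattern: snapshot the value while it is known to be engaged,
    // then give the task its own copy.
    std::future<std::string> removeBlockAsync() {
        if (!pending.has_value()) {
            // Nothing to do; return an already-decided empty result.
            return std::async(std::launch::deferred, [] { return std::string{}; });
        }
        Locale snapshot = *pending;
        return std::async(std::launch::async,
                          [snapshot] { return snapshot.blockId; });
    }
};

// Demo: the task still sees its copy after the optional is cleared.
bool demoSnapshotSurvivesReset() {
    Compositor c;
    c.pending = Locale{"cas0"};
    auto result = c.removeBlockAsync();
    c.pending.reset();  // main thread clears the optional
    return result.get() == "cas0";
}
```

Copying is the simplest option when the value is small; guarding the optional with a mutex or passing ownership to the task are alternatives if the copy would be expensive.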

electron271 commented 8 months ago

testing a new aur package using the new branch

electron271 commented 8 months ago

confirmed to work, some slight issues but i am unsure if they are related to the new branch

will investigate further tomorrow