Open dbonel opened 1 year ago
Hi! Thanks, I’ll look into this. I don’t have an M1, to test with, so might take a little bit.
Does the crash happen every single time?
On Apr 23, 2023, at 9:35 PM, dbonel @.***> wrote:
Hello! I get a crash when the Taps module is clicked in the library to add it to a patch. I'm on an M1 Mac on 12.3, running the 2.3.0 x64 version of VCV Rack. According to the Mac Console app, we have the following information about this crash: Crashed Thread: 12 com.apple.audio.IOThread.client
Exception Type: EXC_BAD_ACCESS (SIGSEGV) Exception Codes: KERN_INVALID_ADDRESS at 0x00007f772b8d7efc Exception Codes: 0x0000000000000001, 0x00007f772b8d7efc Exception Note: EXC_CORPSE_NOTIFY
VM Region Info: 0x7f772b8d7efc is not in any region. Bytes after previous region: 34596155391741 Bytes before following region: 3722608900 REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL MALLOC_NANO (reserved) 600018000000-600020000000 [128.0M] rw-/rwx SM=NUL ...(unallocated) ---> GAP OF 0x1f77e9700000 BYTES MALLOC_TINY 7f7809700000-7f7809800000 [ 1024K] rw-/rwx SM=PRV
Thread 12: Thread 12 Crashed:: com.apple.audio.IOThread.client 0
0x104db4248 ??? 1 plugin.dylib 0x10db7b6cc Taps::check_params(float) + 44 2 libsystem_platform.dylib 0x7ff81502ae13 _sigtramp + 51 3 libRack.dylib 0x10de059df rack::engine::Module::doProcess(rack::engine::Module::ProcessArgs const&) + 143 4 libRack.dylib 0x10ddfe593 rack::engine::Engine::stepBlock(int) + 1699 5 libRack.dylib 0x10dd26be5 rack::audio::Device::processBuffer(float const, int, float, int, int) + 325 6 libRack.dylib 0x10dd5ed5f rack::RtAudioDevice::rtAudioCallback(void, void, unsigned int, double, unsigned int, void) + 143 7 libRack.dylib 0x10e25193f RtApiCore::callbackEvent(unsigned int, AudioBufferList const, AudioBufferList const) + 335 8 libRack.dylib 0x10e250935 callbackHandler(unsigned int, AudioTimeStamp const, AudioBufferList const, AudioTimeStamp const, AudioBufferList, AudioTimeStamp const, void) + 21 9 CoreAudio 0x7ff816a0ec2f HALC_ProxyIOContext::IOWorkLoop() + 7441 10 CoreAudio 0x7ff816a0c959 invocation function for block in HALC_ProxyIOContext::HALC_ProxyIOContext(unsigned int, unsigned int) + 63 11 CoreAudio 0x7ff816bd87a8 HALB_IOThread::Entry(void) + 72 12 libsystem_pthread.dylib 0x7ff8150154e1 _pthread_start + 125 13 libsystem_pthread.dylib 0x7ff815010f6b thread_start + 15 Here's what objdump -Cd -M intel on the dylib says about Taps::check_params. I believe +44 corresponds to offset 26cc. 00000000000026a0 <Taps::check_params(float)>: 26a0: 55 push rbp 26a1: 48 89 e5 mov rbp, rsp 26a4: 53 push rbx 26a5: 50 push rax 26a6: f3 0f 11 45 f4 movss dword ptr [rbp - 12], xmm0 26ab: 48 89 fb mov rbx, rdi 26ae: 48 8b 47 20 mov rax, qword ptr [rdi + 32] 26b2: 0f 10 00 movups xmm0, xmmword ptr [rax] 26b5: 0f 11 87 ec e8 4c 00 movups xmmword ptr [rdi + 5040364], xmm0 26bc: 0f 10 40 10 movups xmm0, xmmword ptr [rax + 16] 26c0: 0f 11 87 50 e9 4c 00 movups xmmword ptr [rdi + 5040464], xmm0 26c7: e8 d4 02 00 00 call 0x29a0 <Taps::handle_clock_input()> 26cc: 48 8b 43 38 mov rax, qword ptr [rbx + 56] 26d0: f3 0f 10 80 f0 00 00 00 movss xmm0, dword ptr [rax + 240] ## xmm0 = mem[0],zero,zero,zero 26d8: f3 0f 5a c0 cvtss2sd xmm0, xmm0 26dc: f2 0f 59 05 3c 46 00 00 mulsd xmm0, qword ptr [rip + 17980] ## 0x6d20 <dyld_stub_binder+0x6d20> 26e4: f3 0f 10 8b 54 e9 4c 00 movss xmm1, dword ptr [rbx + 5040468] ## xmm1 = mem[0],zero,zero,zero 26ec: f3 0f 5a c9 cvtss2sd xmm1, xmm1 26f0: f2 0f 59 c8 mulsd xmm1, xmm0 26f4: 0f 57 c0 xorps xmm0, xmm0 26f7: f2 0f 5a c1 cvtsd2ss xmm0, xmm1 26fb: f3 0f 58 83 f0 e8 4c 00 addss xmm0, dword ptr [rbx + 5040368] 2703: f3 0f 11 83 f0 e8 4c 00 movss dword ptr [rbx + 5040368], xmm0 270b: f3 0f 10 0d e5 45 00 00 movss xmm1, dword ptr [rip + 17893] ## xmm1 = mem[0],zero,zero,zero
0x6cf8 <dyld_stub_binder+0x6cf8>
2713: 0f 2e c1 ucomiss xmm0, xmm1 2716: 77 08 ja 0x2720 <Taps::check_params(float)+0x80> (...)
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you are subscribed to this thread.Message ID: @.***>
Yes, it crashes every time.
Great! That bodes well for tracking this down :)
On Apr 23, 2023, at 10:01 PM, dbonel @.***> wrote:
Yes, it crashes every time. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>
Oh, one more question, have you tried adding it to a completely empty patch?
Oh interesting, adding it to an empty patch does work. Let me see if I can narrow it down to a set of modules that it crashes when added to.
That happened to me, I was chasing down an issue with module custom default settings that just didn't seem to work on my system and it was because I was using MB as my library manager. I only realized when I tested it on a blank patch and discovered different behavior.
If it's a really big patch, I recommend a binary search. ie, delete half the patch and see if that fixes it -- if so, you've narrowed down the search space by half, and then repeat until the culprit is found. :)
That was the strategy I was planning to try, but I'm now only sometimes getting crashes when adding an instance to the existing patch, which makes this a lot more complicated to try to debug. If there's a pattern to when it crashes, it's not apparent to me so far.
I've been able to get it to crash just once in a patch that started out empty, and then had a Taps instance added and then duplicated with command-D a few times, but I can't seem to replicate that either now.
I think I now have a process that eventually leads to a crash, which is to load a big existing patch, do command-N to start a new patch from my default template, delete all of those modules, and then add a Taps and duplicate it repeatedly until eventually it crashes.
I was able to get this to happen with Rack running inside of lldb, and that implicates something inside the Taps::process()
function (which isn't narrowing it down much).
Process 24307 stopped
* thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x20f988580)
frame #0: 0x00000001208a3409 plugin.dylib`Taps::process(rack::engine::Module::ProcessArgs const&) + 441
plugin.dylib`Taps::process:
-> 0x1208a3409 <+441>: movss %xmm1, 0x148(%rbx,%r10,4)
0x1208a3413 <+451>: movslq 0x4ce8d4(%rbx), %r9
0x1208a341a <+458>: movss 0x148(%rbx,%r9,4), %xmm0 ; xmm0 = mem[0],zero,zero,zero
0x1208a3424 <+468>: movl 0x4ce904(%rbx), %r14d
Target 0: (Rack) stopped.
(lldb) bt
* thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x20f988580)
* frame #0: 0x00000001208a3409 plugin.dylib`Taps::process(rack::engine::Module::ProcessArgs const&) + 441
frame #1: 0x000000010917e9df libRack.dylib`rack::engine::Module::doProcess(rack::engine::Module::ProcessArgs const&) + 143
frame #2: 0x0000000109177593 libRack.dylib`rack::engine::Engine::stepBlock(int) + 1699
frame #3: 0x000000010917c4a4 libRack.dylib`rack::engine::Engine_fallbackRun(rack::engine::Engine*) + 340
frame #4: 0x000000010917d09c libRack.dylib`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(rack::engine::Engine*), rack::engine::Engine*> >(void*) + 44
frame #5: 0x00007ff8150154e1 libsystem_pthread.dylib`_pthread_start + 125
frame #6: 0x00007ff815010f6b libsystem_pthread.dylib`thread_start + 15
(lldb) register read rbx
rbx = 0x00000001153b8000
(lldb) register read r10
r10 = 0x000000003e97410e
I think if we can get a version of the plugin.dylib file with debug symbols still included, it will be a lot easier to match this up to a line in the file.
However, here's the instruction we're crashing on in the disassembled function: https://gist.github.com/dbonel/6ffa16ef44c63349af1c9b0b0addffa2#file-gistfile1-txt-L122
Can inptr
be uninitialized here, especially on the first iteration? https://github.com/demcanulty/Yokai_Holiday/blob/main/src/Taps.cpp#L551
Thanks for continuing to look into this, I wish I could compile on an M1 and test. My assumption was that inptr will be initialized, since it's a global variable within the struct, but woahh, yeah, it looks like I might have been wrong, and there could be a good chance that it's not!
Are you able to compile the module on your system? Looks like all of those variables should be initialized.
Good catch!
If you can't compile, I think there's a way to do cross compilation here on github.
I'll have to look at getting a C++ toolchain set up, which hopefully isn't too much trouble.
I will note I don't think the M1 is the unique part of this (since my VCV Rack install is all x86 code running through the Rosetta translation layer), however the macOS allocator might be arranging the mappings such that a stray memory access is more likely to land in an unmapped region.
Ah, that's a great point, if you give me your duplication method, or even the original patch that pointed you to the issue, I'll see if I can duplicate here. I had some inexplicable crashes early on, so I coded a bunch of protection code. That whole soft start at line 529 is exactly that, me trying to avoid those crashes. So maybe the problem is here too, I just need a way to reliably trigger it.
But also if you want any help getting your VCV toolchain up and running, it seems like you'd have a good time with it. I don't have the debugging skills you've got to capture these things, I'm a firmware guy most of the time, and trying to get these disassemblies that you've got on an x86 system is still a bit of a mysterious world to me.
But, that said, I do have debugging working on a mac! Which seemed like a bit of a magic trick, I'm not sure I can duplicate it, but I'd be game to try.
Okay, I already had a C++ toolchain so the only tricky part was the cross-compile. It looks like I guessed right here, because with debug symbols I get this:
Process 25626 stopped
* thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x50139364)
frame #0: 0x000000010e7130ea plugin.dylib`Taps::process(this=0x0000000162688000, args=<unavailable>) at Taps.cpp:551:29 [opt]
548 int trail_look_ahead = (inptr + fade_buffer_lookahead) % target_buffer_length;
549
550 trail_buffer[trailptr] = delay_buffer[trail_look_ahead];
-> 551 delay_buffer[inptr] = dry;
552 forw_wet = delay_buffer[forw_outptr];
553
554
And then, looking at the value of inptr
:
(lldb) frame variable this->inptr
(int) this->inptr = -1150630777
(lldb) frame variable &this->delay_buffer
(float (*)[960000]) &delay_buffer = 0x0000000162688148
So that definitely lines up, 0x0000000162688148 + (-1150630777 * sizeof(float)) gives us the 0x50139364 address that generated the segfault.
It might vary by OS platform exactly what the allocator does as far as zeroing memory automatically or not, but apparently it's not safe here to assume anything is going to be initialized for you.
Oooh, yeah, that's great work, nicely done!! In C I'm really careful about which variables are initialized and not, but my instincts were wrong for C++. Thanks for finding this!
That's a gnarly un-initialized variable, super embarrassing.
You're welcome, I'm glad I could actually help. I'm not really familiar with C++ semantics either, having spent most of my recent time with Rust where we're not allowed to do any of these naughty things. :)
For good measure I initialized all variables at the top-level scope in the struct Taps declaration and submitted a new version to VCV tonight. I'll check in when it gets released to make sure it passes your test. Thanks again!
I compiled and lightly tested the v2.0.0.8 version you pushed here, with no crashes so far.
Oh, that's fantastic news! Thanks! Yeah, I really had wondered why sometimes I saw it crash when I loaded the module. My own fault for not reading up properly on C++ initialization guarantees. It's a satisfying solution to that mystery. I'm really surprised the compiler didn't notify me about that, because it did notify me about other potentially uninitialized variables.
Hello! I get a crash when the Taps module is clicked in the library to add it to a patch. I'm on an M1 Mac on 12.3, running the 2.3.0 x64 version of VCV Rack.
According to the Mac Console app, we have the following information about this crash:
Thread 12:
Here's what
objdump -Cd -M intel
on the dylib says aboutTaps::check_params
. I believe +44 corresponds to offset26cc
.