wyhaines / opentelemetry-api.cr

The core of open telemetry instrumentation is the OpenTelemetry API/SDK. The initial aim of this shard is to implement the OpenTelemetry specification for metrics, traces, and logs.
Apache License 2.0

Invalid memory access in v0.3.0 #12

Closed robcole closed 2 years ago

robcole commented 2 years ago
Invalid memory access (signal 11) at address 0x8
[0x10d72a9db] *Exception::CallStack::print_backtrace:Nil +107 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x10d6dc02d] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil)@/usr/local/Cellar/crystal/1.4.1/src/signal.cr:127 +285 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x7ff81595bdfd] _sigtramp +29 in /usr/lib/system/libsystem_platform.dylib
[0x10dce3be9] *Hash(String, BinData::BitField)@Hash(K, V)#upsert<String, BinData::BitField>:(Hash::Entry(String, BinData::BitField) | Nil) +57 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x10dce3b9e] *Hash(String, BinData::BitField)@Hash(K, V)#[]=<String, BinData::BitField>:BinData::BitField +30 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x10d6bf291] __crystal_main +5457 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x10dd940d9] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +9 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x10dd94048] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +40 in /Users/robcole/dev/hirobot.app/bin/start_server
[0x10d6ce159] main +9 in /Users/robcole/dev/hirobot.app/bin/start_server

I haven't tracked down which commit caused the issue yet, but here's a repo that demonstrates the bug and shows that commenting out OpenTelemetry currently fixes it:

https://github.com/the-business-factory/hirobot.app/commit/a2c04e25e5a6b46027470956d4b82d003a091bf7

Replication: check out that commit (or any commit before it on that branch) and try to run lucky watch or lucky dev. The error is also triggered in release builds.

robcole commented 2 years ago

https://github.com/wyhaines/opentelemetry-api.cr/commit/10629ce9cce2aaef4a6f9e66f306cf2d52dd3fc5 appears to be the commit where this broke. If I manually check out the previous commit, Hirobot.app runs.

wyhaines commented 2 years ago

I am going to close this because we have had no luck in making a reproducible test case that exhibits the problem. If it rears its head again, though, please reopen.

jwoertink commented 2 years ago

This looks similar to what I'm seeing:

❯ ./bin/start_server 
Invalid memory access (signal 11) at address 0x8
[0x5610fb0db986] *Exception::CallStack::print_backtrace:Nil +118 in ./bin/start_server
[0x5610faeb518e] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil) +398 in ./bin/start_server
[0x7f9533b2d520] ?? +140278794212640 in /lib/x86_64-linux-gnu/libc.so.6
[0x5610fb6a320a] *Hash(String, BinData::BitField) +90 in ./bin/start_server
[0x5610fb6a31a0] *Hash(String, BinData::BitField) +80 in ./bin/start_server
[0x5610fae8c73c] __crystal_main +4588 in ./bin/start_server
[0x5610fcdb7bfd] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +45 in ./bin/start_server
[0x5610fcdb7b2d] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +77 in ./bin/start_server
[0x5610faea235d] main +45 in ./bin/start_server
[0x7f9533b14fd0] ?? +140278794112976 in /lib/x86_64-linux-gnu/libc.so.6
[0x7f9533b1507d] __libc_start_main +125 in /lib/x86_64-linux-gnu/libc.so.6
[0x5610fae8b485] _start +37 in ./bin/start_server
[0x0] ???

Running it through gdb with the debug flag on, I got this:

Reading symbols from ./bin/start_server...
(gdb) r
Starting program: /home/jeremy/Sites/joysticktv/bin/start_server 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x0000555557c2a2fe in GC_find_limit_with_bound ()
(gdb) bt
#0  0x0000555557c2a2fe in GC_find_limit_with_bound ()
#1  0x0000555557c2a156 in GC_init_linux_data_start ()
#2  0x0000555557c27b95 in GC_init ()
#3  0x0000555556031e23 in init () at /usr/share/crystal/src/gc/boehm.cr:146
#4  0x0000555557c1cb1f in main (argc=1, argv=0x7fffffffdf28)
    at /usr/share/crystal/src/crystal/main.cr:35
#5  0x0000555555d0735d in main (argc=1, argv=0x7fffffffdf28)
    at /usr/share/crystal/src/crystal/main.cr:127
(gdb) bt
#0  upsert (self=0x0, key=0x555557cabbb0, value=0x7ffff54c7340) at /usr/share/crystal/src/hash.cr:334
#1  0x00005555565081a0 in []= (self=0x0, key=0x555557cabbb0, value=0x7ffff54c7340)
    at /usr/share/crystal/src/hash.cr:1000
#2  0x0000555555cf173c in __crystal_main ()
    at /home/jeremy/Sites/joysticktv/lib/bindata/src/bindata/asn1/identifier.cr:12
#3  0x0000555557c1cbfd in main_user_code (argc=1, argv=0x7fffffffdf28)
    at /usr/share/crystal/src/crystal/main.cr:115
#4  0x0000555557c1cb2d in main (argc=1, argv=0x7fffffffdf28)
    at /usr/share/crystal/src/crystal/main.cr:101
#5  0x0000555555d0735d in main (argc=1, argv=0x7fffffffdf28)
    at /usr/share/crystal/src/crystal/main.cr:127
(gdb) f 2
#2  0x0000555555cf173c in __crystal_main ()
    at /home/jeremy/Sites/joysticktv/lib/bindata/src/bindata/asn1/identifier.cr:12
12        bit_field do

Maybe related to this: https://github.com/spider-gazelle/bindata/blob/7098dbd15447755d8dda44d7e0033caf249207a5/src/bindata.cr#L436 ?

As for recreating the error outside of my app, that will be a bit harder, but if I find more info, I'll post here.