nikita-volkov / postgresql-binary

Encoders and decoders for the PostgreSQL's binary format
http://hackage.haskell.org/package/postgresql-binary
MIT License

unit test segfault (jsonb roundtrip on macOS) #18

Closed robx closed 2 years ago

robx commented 2 years ago

The tasty test-suite fails locally for me on current master with segmentation faults:

$ cabal run tasty -- -p jsonb
Up to date

  Binary format
    jsonb roundtrip: Segmentation fault: 11

Sometimes, I also see

$ cabal run tasty -- -p jsonb
Up to date
Segmentation fault: 11

(I'm guessing this still runs the test, and that the output is just different due to buffering, but not entirely sure.)
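
One way to rule buffering in or out is to flush stdout after every trace line, so nothing is left in a buffer when the process dies. The following is a minimal sketch, not code from this repository: `traceLabel` and `traceValue` are hypothetical helper names, and only standard `System.IO` functions from base are used.

```haskell
import System.IO (BufferMode (NoBuffering), hFlush, hSetBuffering, stdout)

-- | Build the trace line for a value (hypothetical helper, pure so it is easy to check).
traceLabel :: Show a => a -> String
traceLabel value = "tripping: " <> show value

-- | Print the trace line and flush immediately, so the line is not
-- lost in a buffer if the process crashes right afterwards.
traceValue :: Show a => a -> IO ()
traceValue value = do
  putStrLn (traceLabel value)
  hFlush stdout

main :: IO ()
main = do
  -- Alternatively, turn off stdout buffering for the whole run.
  hSetBuffering stdout NoBuffering
  traceValue True
  traceValue (0.0 :: Double)
```

If the two kinds of output above still differ with flushing in place, buffering is probably not the explanation.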

With some tracing statements and sleeps along the following lines:

 roundtrip :: (Show a, Eq a) => 
   LibPQ.Oid -> (Bool -> (a -> B.Encoding)) -> (Bool -> A.Value a) -> a -> Property
 roundtrip oid encoder decoder value =
-  Right value === unsafePerformIO (IO.roundtrip oid encoder decoder value)
+  Right value === unsafePerformIO (do
+    putStrLn $ "tripping: " <> show value
+    threadDelay 100000
+    x <- IO.roundtrip oid encoder decoder value
+    threadDelay 100000
+    putStrLn "done"
+    return x)

the number of successful roundtrips before crashing varies wildly, and there's no obvious pattern to the last successful one. E.g.:

$ cabal run tasty -- -p jsonb
Up to date

  Binary format
    jsonb roundtrip: tripping: Array []
done
tripping: Number 0.0
done
tripping: Bool True
done
Segmentation fault: 11

Running the test binary under lldb, I see:

* thread #1, name = 'ghc_ticker', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x000000010017d00b tasty`LcnzW_info + 171
tasty`LcnzW_info:
->  0x10017d00b <+171>: jmpq   *(%rbx)
    0x10017d00d <+173>: movq   %r14, %rbx
    0x10017d010 <+176>: andq   $-0x8, %rbx
    0x10017d014 <+180>: addq   $0x10, %rbp
Target 0: (tasty) stopped.
(lldb) bt
* thread #1, name = 'ghc_ticker', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x000000010017d00b tasty`LcnzW_info + 171
    frame #1: 0x000000420011e9b2
(lldb) 

I haven't figured out how to build with debug symbols for a more useful backtrace.

I've tried this with both ghc 8.10.7 and ghc 9.2.2.

nikita-volkov commented 2 years ago

Thanks a lot for the detailed report! I'll address it in the coming days.

nikita-volkov commented 2 years ago

It was a test-only issue: the tests relied on some assumptions about aeson that no longer hold as of aeson-2. It's fixed in master now.