composewell / streamly

High performance, concurrent functional programming abstractions
https://streamly.composewell.com
Other
861 stars 66 forks source link

Unreproducible test failure #2671

Closed harendra-kumar closed 2 months ago

harendra-kumar commented 9 months ago

First seen at commit 350b6acae :

Failures:

  Streamly/Test/Unicode/Stream.hs:177:9: 
  1) Unicode.Stream, UTF8 - Encoding / Decoding, Arrays Streamly.Data.String.lines == Prelude.lines
       Assertion failed (after 584 tests):
         [111,74,18,114,114,246,44,86,35,209,255,112,6,60,174,59,132,154,226,252,199,212,251,28,42,66,99,32,168,104,253,69,83,11,204,31,196,230,108,29,70,236,204,117,66,71,126,154,46,228,238,157,11,39,251,207,151,177,9,109,64,112,173,11,125,162,19,108,201,110]

  To rerun use: --match "/Unicode.Stream/UTF8 - Encoding / Decoding/Arrays Streamly.Data.String.lines == Prelude.lines/" --seed 646896010

Randomized with seed 646896010
agentm commented 8 months ago

I'm not certain that we are encountering the same issue with streamly 0.10.0 and curryer-rpc, but after upgrading from streamly 0.9.0, we are seeing reproducible corruption on TCP communications with streamly-bytestring 0.2.1 (so that we can marshal using winery). We are reverting back to streamly 0.9.0 in the meantime.

Our automated tests failed to detect the corruption. Unfortunately, after some attempts, we still can't get the tests to fail, but project-m36 indicates corruption while connecting to localhost TCP.

$ cabal run tutd -- -n db
Project:M36 TutorialD Interpreter 0.9.8
Type ":help" for more information.
A full tutorial is available at:
https://github.com/agentm/project-m36/blob/master/docs/tutd_tutorial.markdown
TutorialD (master/main): :showrelvars
┌─────────────────────────────────────────────────┬──────────┐
│attributes::relation {attribute::Text,type::Text}│name::Text│
├─────────────────────────────────────────────────┼──────────┤
│┌───────────────┬──────────┐                     │"!�"      │
││attribute::Text│type::Text│                     │          │
│└───────────────┴──────────┘                     │          │
│┌──────────┬─────────┐                           │"    "       │
││�e�   ::Text│*��::Text│                           │          │
│└──────────┴─────────┘                           │          │
└─────────────────────────────────────────────────┴──────────┘

Actually, I don't understand how only the Texts are corrupted, but perhaps it's just coincidence.

The only diffs we needed to upgrade from streamly 0.9.0 to streamly 0.10.10 involved module imports. Let me know what additional details I can provide.

harendra-kumar commented 2 months ago

@agentm we missed this update. Are you still able to reproduce this? This could be due to the unpinned arrays change. Are you using the Array internals/internal APIs for some reason?

harendra-kumar commented 2 months ago

@adithyaov I see this code in streamly-bytestring:

-- | Convert an array of 'Word8' to a 'ByteString'. This function unwraps the
-- 'Array' and wraps it with 'ByteString' constructors and hence the operation
-- is performed in constant time.
{-# INLINE fromArray #-}
fromArray :: Array Word8 -> ByteString
fromArray (Array {..})
    | aLen == 0 = mempty
    | otherwise = CONSTRUCTOR((makeForeignPtr arrContents arrStart), 0, aLen)

    where

    aLen = arrEnd - arrStart

Why are we not checking whether the array is pinned here?

When we supported streamly-core-0.2.0 we were supposed make a corresponding change to streamly-bytestring to ensure that the arrays are pinned before we convert to bytestring, where is that change?

Am I missing something here?

harendra-kumar commented 2 months ago

@agentm You can fix the problem by changing Arr.writeN to Arr.pinnedWriteN or Arr.pinnedCreateOf:

envelopeP :: BParser Envelope
envelopeP = do
  let lenPrefixedByteStringP = do
        c <- fromIntegral <$> word32be
        --streamly can't handle takeEQ 0, so add special handling
--        traceShowM ("envelopeP payload byteCount"::String, c)
        if c == 0 then
          pure mempty
          else do
          ps <- P.takeEQ c (Arr.writeN c)
--          traceShowM ("envelopeP read bytes", c)
          let !bs = StreamlyBS.fromArray ps
--          traceShowM ("unoptimized bs")
          pure bs 

StreamlyBS.fromArray was supposed to catch the problem but apparently it is not.

harendra-kumar commented 2 months ago

@adithyaov we should disallow streamly-core-0.2 or later in all versions of streamly-bytestring and upload a new version of streamly-bytestring to address the problem.

agentm commented 2 months ago

After updating to streamly-bytestring 0.2.2, the problem is resolved, so this pinned/unpinned storage was likely the issue. Thanks!

harendra-kumar commented 2 months ago

Duplicate issue. Same as #2699 .