agrafix / superrecord

Haskell: Supercharged anonymous records
BSD 3-Clause "New" or "Revised" License
83 stars 16 forks source link

!!! `project` causes memory inconsistencies #38

Open Profpatsch opened 2 years ago

Profpatsch commented 2 years ago

project (and potentially also inject, we haven’t verified`, will cause undefined behavior, which we noticed & verified in a production system.

The bug only appears with optimizations enabled (only tested -O2), not in ghci

I don’t have a minimal repro (yet), but I can describe what we did:

There was one function which loaded some data in IO, constructed a ist of rec, then before returning used project to reduce unnecessary fields in every list element:

loadData :: IO (Record '[ "foo" := Text ])
loadData = do
  listOfRes <- <fetch data>
  pure 
    $ map 
         (project @'["foo" := Text, "bar" := Int] @'["foo" := Text])
         listOfRes

And another function that pulled a subset of fields from the record, like this:

requestBody = do
  dat <- loadData

  let mapDat d = (Json.object [
         ("foo", get #foo d)
        ])

   pure $ Json.Array (map mapDat dat)

Let’s say the final data was

[ 
  { foo = "xx" },
  { foo = "yy" },
  { foo = "zz" },
]

Then what would be actually returned was (!!):

[ 
  { foo = "xx" },
  { foo = "xx" },
  { foo = "xx" },
]

now, if you rewrote the function to deeply evaluate the projected record:

loadData :: IO (Record '[ "foo" := Text ])
loadData = do
  listOfRes <- <fetch data>

  print listOfRes   -- <- deeply evaluate the record by printing it

  pure 
    $ map 
         (project @'["foo" := Text, "bar" := Int] @'["foo" := Text])
         listOfRes

It would again return

[ 
  { foo = "xx" },
  { foo = "yy" },
  { foo = "zz" },
]

When we rewrote the original function to construct a new record instead of projecting:

loadData :: IO (Record '[ "foo" := Text ])
loadData = do
  listOfRes <- <fetch data>
  pure 
    $ map 
         (\res -> rcons #foo (get #foo res) rnil)
         listOfRes

The problem went away.

Profpatsch commented 2 years ago

cc @sheaf who wrote project a while ago

sheaf commented 2 years ago

I don't recall the exact details, but I think I saw an issue like this start happening with GHC 9.0 or 9.2. The issue went away with -fno-full-laziness. What version of GHC are you using, and does the problem still happen with -fno-full-laziness?

Profpatsch commented 1 year ago

Yes, that fixes it … but why, that’s horrifying

Profpatsch commented 1 year ago

How can every entry in a list of records pulled out of IO be changed by laziness floating.

sheaf commented 1 year ago

The relevant GHC issue is #19413 (although that is for runRW# not runST). See in particular my comment. This started happening with GHC 9.0.

Profpatsch commented 1 year ago

I’m trying to reproduce the problem in the test suite, but I don’t know how to trigger the mis-optimization

https://github.com/agrafix/superrecord/pull/39/commits/085540be48ee8ef768878f9ce5f99307cfa087ef

Profpatsch commented 1 year ago

Also cabal test flat-out refuses to use the -O2 from the cabal file, you have to run cabal test -O2

Profpatsch commented 1 year ago

@sheaf Do you by any chance have any time next week to create a minimal reproduction example for the GHC bug tracker?

Profpatsch commented 1 year ago

@sheaf I meant doing a video call to figure one out based on the unsafePerformIO example @ndmitchell gives in https://gitlab.haskell.org/ghc/ghc/-/issues/19413

sheaf commented 1 year ago

@Profpatsch Could you try replacing, purely in this library,

runST' :: (forall s. ST s a) -> a
runST' !s = runST s

with

import GHC.ST (ST(..))
import GHC.Exts (runRW#, lazy)

runST' :: (forall s. ST s a) -> a
runST' !(ST st) = case runRW# st of (# _, a #) -> lazy a

If that works, I think that would be good evidence we need the same lazy trick for runST.