mwotton / judy

Other
18 stars 4 forks source link

segmentation fault #5

Closed jrraymond closed 6 years ago

jrraymond commented 7 years ago

The program repeatedly creates two Judy arrays of a given size and performs the union of them. The program will seg-fault.

Unioning two Judy arrays of size 10 never segfaults, but as the size goes up past 100, it almost always seg-faults.

import Control.Monad
import qualified Data.Judy as J
import System.Random.Mersenne

union :: J.JudyL Int -> J.JudyL Int -> IO (J.JudyL Int)
union a b = do
  n <- J.new
  ia <- J.freeze a
  ib <- J.freeze b
  as <- J.toList ia
  bs <- J.toList ib
  mapM_ (\(k, v) -> J.insert k v n) as
  mapM_ (\(k, v) -> J.insert k v n) bs
  return n

doUnions :: Int -> IO ()
doUnions sz = do
  g <- getStdGen
  rs <- randoms g
  a <- J.new :: IO (J.JudyL Int)
  forM_ (take sz rs) $ \n -> J.insert n 1 a
  b <- J.new :: IO (J.JudyL Int)
  forM_ (take sz (drop sz rs)) $ \n -> J.insert n 1 b
  c <- union a b
  cSz <- J.size c
  print cSz

main :: IO ()
main = do
  sz <- (read :: String -> Int) <$> getLine
  i <- (read :: String -> Int) <$> getLine
  replicateM_ i (doUnions sz)

I ran valgrind and there are many read after free'd errors. I think maybe is not a problem with Judy itself but maybe the GHC runtime is cleaning up the judy arrays earlier than it should.

Invalid read of size 1
==25115==    at 0x4E55759: JudyLNext (in /usr/lib/x86_64-linux-gnu/libJudy.so.1.0.3)
==25115==    by 0x446107: ??? (...)
==25115==  Address 0x80555cf is 15 bytes inside a block of size 4,096 free'd
==25115==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25115==    by 0x4E50A0D: j__udyLFreeJBU (in /usr/lib/x86_64-linux-gnu/libJudy.so.1.0.3)
==25115==    by 0x4E48D9E: j__udyLFreeSM (in /usr/lib/x86_64-linux-gnu/libJudy.so.1.0.3)
==25115==    by 0x4E49051: JudyLFreeArray (in /usr/lib/x86_64-linux-gnu/libJudy.so.1.0.3)
==25115==    by 0x4E68F5: runCFinalizers (Weak.c:30)
==25115==    by 0x4E69B7: scheduleFinalizers (Weak.c:104)
==25115==    by 0x4E858F: GarbageCollect (GC.c:684)
==25115==    by 0x4E0AB8: scheduleDoGC.isra.21 (Schedule.c:1801)
==25115==    by 0x4E16CC: schedule (Schedule.c:548)
==25115==    by 0x4E16CC: scheduleWaitThread (Schedule.c:2496)
==25115==    by 0x4DF3E6: hs_main (RtsMain.c:64)
==25115==    by 0x40A29B: main (in .stack-work/dist/x86_64-linux/Cabal-1.24.2.0/build/judy-exe/judy-exe)
chrisdone commented 6 years ago

I'm suspicious of toList. It "lazily" returns elements. Look at this:

http://hackage.haskell.org/package/judy-0.3.0/docs/src/Data-Judy.html#toList'

That's asking for a segfault.

chrisdone commented 6 years ago

(freeze appears to be the culprit here, which is implemented in terms of toList'.)

chrisdone commented 6 years ago

As suspected, toList' is the reason for the segfault. PR to remove the laziness: https://github.com/mwotton/judy/pull/6/files