kim / leveldb-haskell

Haskell bindings to LevelDB (https://github.com/google/leveldb)
BSD 3-Clause "New" or "Revised" License

Enormous heap when iterating over 10 million records #35

Closed by charlescrain 7 years ago

charlescrain commented 7 years ago

I'm getting a very large heap when doing a simple count of all the records in the DB. I have a little over 10 million records in the DB. Below is the code I'm using:

ldbCount :: DB.DB -> ResourceT IO ()
ldbCount db =
  DB.withIterator db
                  def
                  (\ i -> do
                      let stream = DB.keySlice i DB.AllKeys DB.Desc :: DB.Stream (ResourceT IO) DB.Key
                      totalCount <- DB.foldl' (\c _ -> c + 1) (0::Int) stream
                      liftIO . putStrLn $ show totalCount
                  )

Here is the output when running stack exec -- count db_dir +RTS -s:

  6,385,123,872 bytes allocated in the heap
  1,770,246,472 bytes copied during GC
     459,767,960 bytes maximum residency (12 sample(s))
         3,930,216 bytes maximum slop
             845 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     11952 colls,     0 par    1.856s   1.752s     0.0001s    0.0063s
  Gen  1        12 colls,     0 par    1.408s   2.037s     0.1697s    0.9557s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   15.436s  ( 16.567s elapsed)
  GC      time    3.264s  (  3.789s elapsed)
  EXIT    time    0.012s  (  0.080s elapsed)
  Total   time   18.748s  ( 20.435s elapsed)

  %GC     time      17.4%  (18.5% elapsed)

  Alloc rate    413,651,455 bytes per MUT second

  Productivity  82.6% of total user, 81.5% of total elapsed
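
As a sanity check on the report above, the derived figures can be recomputed from the raw ones. This is only a sketch: the formulas are the apparent ones (GC time over total time, bytes allocated over MUT time), and `approxRecords` is an assumed round number, since the report only says "a little over 10 million".

```haskell
module Main where

-- Raw figures copied from the +RTS -s report above.
bytesAllocated :: Double
bytesAllocated = 6385123872

mutSeconds, gcSeconds, totalSeconds :: Double
mutSeconds   = 15.436
gcSeconds    = 3.264
totalSeconds = 18.748

-- Assumption: roughly 10 million records (the exact count is not given).
approxRecords :: Double
approxRecords = 1.0e7

-- Share of CPU time spent in the garbage collector, in percent.
gcPercent :: Double
gcPercent = 100 * gcSeconds / totalSeconds

-- Bytes allocated per second of mutator (non-GC) time.
allocRate :: Double
allocRate = bytesAllocated / mutSeconds

-- Rough allocation cost of visiting one record.
bytesPerRecord :: Double
bytesPerRecord = bytesAllocated / approxRecords

main :: IO ()
main = do
  putStrLn ("GC share of CPU time (%): " ++ show gcPercent)
  putStrLn ("Allocation rate (bytes per MUT second): " ++ show allocRate)
  putStrLn ("Allocated bytes per record (approx.): " ++ show bytesPerRecord)
```

Under that assumption the run allocates on the order of a few hundred bytes per record. Allocation of that size is normally short-lived, so the figure to watch is the maximum residency rather than the total bytes allocated.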

The database itself is only 2.5 GB. Could there be a memory leak of some kind that allows the heap to grow so large, or have I made a naive assumption in the implementation?
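
For comparison, here is a minimal pure sketch of the same counting fold with LevelDB taken out of the picture. It assumes nothing about how leveldb-haskell implements its streams: `countAll` is a hypothetical helper, and a lazily generated list stands in for the key stream. The step function has the same shape as the one passed to DB.foldl' in ldbCount.

```haskell
module Main where

import Data.List (foldl')

-- Count elements with a strict left fold. foldl' forces the accumulator at
-- every step, so no chain of unevaluated (+ 1) thunks builds up.
countAll :: [a] -> Int
countAll = foldl' (\c _ -> c + 1) 0

main :: IO ()
main =
  -- The list is produced lazily and is not shared, so consumed cells
  -- can be garbage collected as the fold advances.
  print (countAll [1 .. 10000000 :: Int])
```

Running this with +RTS -s should show a small maximum residency. If the LevelDB version behaves very differently, the fold itself is unlikely to be the cause, and the build or the way the binary is run is worth checking first.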

charlescrain commented 7 years ago

This turned out to be my mistake: my stack exec command did not rebuild the program with profiling, so I was testing an old version of it.