kim / leveldb-haskell

Haskell bindings to LevelDB (https://github.com/google/leveldb)
BSD 3-Clause "New" or "Revised" License
66 stars 50 forks source link

Segfault when using an already closed database #15

Closed skogsbaer closed 9 years ago

skogsbaer commented 10 years ago

The following program terminates with a segfault:

import qualified Database.LevelDB.Base as LevDb import qualified Data.ByteString.Char8 as BSC

main =
  do db <- LevDb.open "test.db" (LevDb.defaultOptions {
                                            LevDb.createIfMissing = True
                                          })
     LevDb.close db
     let k = BSC.pack "KEY"
         v = BSC.pack "VALUE"
     LevDb.put db LevDb.defaultWriteOptions k v
     putStrLn "done"

I would expect an exception when doing a put on an already closed database.

I could go ahead and fix the problem. My plan is:

The accessor function is a bit tricky because it has to deal with concurrency. It's not sufficient to allow only one client at a time accessing the connection because leveldb allows concurrent operations on the same connection. But the close function needs exclusive access to the connection and all further operations invoked on the connection should fail afterwards.

I think I can come up with a solution to this problem. But I want to hear your opinions first.

kim commented 10 years ago

I would expect an exception when doing a put on an already closed database.

Yes I agree that a segfault is probably not the best way of letting users know they're doing something wrong ;)

  • Internal.hs no longer exports the constructors of the DB type
  • The DB type contains an additional MVar tracking the state of the connection (open/closed)
  • Functions that now use the LevelDbPtr directly instead use an accessor function for getting the DB handle. This accessor function is implemented in Internal.hs

Hm yes, something like an internal withDB combinator might work.

The accessor function is a bit tricky because it has to deal with concurrency. It's not sufficient to allow only one client at a time accessing the connection because leveldb allows concurrent operations on the same connection. But the close function needs exclusive access to the connection and all further operations invoked on the connection should fail afterwards.

There is one other problem: suppose an operation using the DB handle takes a long time, where meanwhile another thread calls close. Since LevelDB internally just deletes the handle, I would expect a segfault again. I think the proper solution would be to hide the close function entirely and call it in a finalizer when DB is garbage collected.

As a side note, this was actually one of the reasons I initially only exposed the ResourceT-based API, which does just this.