vhbit / lmdb-rs

Rust bindings for LMDB
MIT License
114 stars 47 forks source link

API lifetimes and mutability are dangerous #48

Open cmbrandenburg opened 7 years ago

cmbrandenburg commented 7 years ago

Transaction methods are immutable—including state-changing ones—which leads to surprising and dangerous results.

Description of the problem

The problem is that Database methods that change values in the database are not mutable—i.e., they have a &self parameter, not a &mut self parameter.

Take the following program as an example, which obtains an immutable reference to an LMDB value that changes while that reference is held. I would expect this program to not compile.

extern crate lmdb_rs;
extern crate tempdir;

fn main() {

    let tdir = tempdir::TempDir::new("lmdb-test").unwrap();

    let env = lmdb_rs::Environment::new().open(&tdir, 0o666).unwrap();
    let db_handle = env.get_default_db(lmdb_rs::core::DbFlags::empty()).unwrap();

    let tx = env.new_transaction().unwrap();
    let db = tx.bind(&db_handle);

    db.set(&"alpha", &"bravo").unwrap();

    let v: &str = db.get(&"alpha").unwrap();
    assert_eq!(v, "bravo");

    db.set(&"alpha", &"charlie").unwrap(); // <-- should be a compiler borrow-check error
    println!("Got: {:?}", v);
}

Instead, the program compilers and, when run, produces this output:

Got: "arlie"

Suppose the database value had been overwritten with invalid UTF8, in which case we would have had a &str referencing invalid UTF8. This is classic memory corruption, and it occurs without the application using an unsafe block.

Ideas for a fix

Broadly speaking, there are two possible fixes here. The first is to simply mark the database-modifying methods as unsafe so that applications must explicitly mark their use of these methods as being memory-unsafe.

The second fix—and the one that I prefer—is to change the database-modifying methods to be mutable—i.e., to have a &mut self parameter instead of &self. Such a change would force the application in the example above to scope the v reference such that it goes out of scope before the call to Database::set—otherwise the application would trigger a borrow-check error in the compiler.

If, for some reason, an application needs to hold a reference to a key or value while modifying the content of the database, then that application could resort to using raw pointers and unsafe blocks. This is extra work for the application programmer, sure, but it reflects how the underlying operation is fundamentally memory-unsafe. This is idiomatic Rust: memory-unsafe code is wrapped within an unsafe block.

Another idea is to combine these two fixes and for lmdb-rs to provide two related methods: one that's memory-safe and another that's not.

impl Database {
    fn set(&mut self, key: &ToMdbValue, value: &ToMdbValue) -> MdbResult<()> {}
    unsafe fn set_unchecked(&self, key: &ToMdbValue, value: &ToMdbValue) -> MdbResult<()> {}
}

List of affected methods

The following methods are memory-unsafe:

  1. Database::set
  2. Database::append
  3. Database::append_duplicate
  4. Database::insert
  5. Database::del
  6. Database::del_item
  7. Database::clear

Additionally, the following two methods are memory-unsafe—but in a way dissimilar from the problem described in this ticket:

  1. Database::set_compare
  2. Database::set_dupsort

The lmdb-rs documentation explicitly describes these methods as dangerous, but that danger is not propagated through the Rust type system. I'm not sure what should be done about these two methods and am leaving them out of this discussion other than to suggest they should be marked as unsafe.