Drup / ocaml-lmdb

Ocaml bindings for lmdb.
https://drup.github.io/ocaml-lmdb/dev/
MIT License
48 stars 3 forks source link

Bunch of misc changes #18

Closed Drup closed 4 years ago

Drup commented 5 years ago

Contains #17 .

I started to make some misc improvements, both on the API and the non-FFI internals. Moving to records seems to give a minor speedup, and it certain reduces the boilerplate. I'm planning to do more improvements, notably wrt permissions.

madroach commented 5 years ago

Converters

Implementation

While evaluating the different options for the interface I implemented and benchmarked different approaches:

  1. A map functor
  2. A polymorphic Map module which takes plain records as converters
  3. A polymorphic Map module which takes first-class modules as converters
  4. A polymorphic Map module which takes a pair of serialiser and deserialiser as converter and does some black magic to figure out the appropriate flags. Approaches 1-3 all had the appropriate flags in the converter, which is the saner approach.

1 is very slightly faster and would match what the Stdlib does with Map and Hashtbl functors. But I don't like the functor, because it forces the user to create and name a separate module for each map while there is not much to gain from it. There is no real difference between 2 and 3. Neither performance-wise nor implementation-wise. Implementation- and performance-wise we should probably flatten the serialisation and deserialisation functions into the map record anyway. Keeping the flags would be redundant.

Interface

So the real question is about the interface. What we need in the interface is a type to represent a converter which is composed of a type, a flags value and serialising / deserialising functions. I see three options

  1. Opaque type with constructor and accessor functions
  2. Expose converters as polymorphic records
  3. Expose first-class modules

Constructor and accessor functions can be defined for any type, be it opaque or exposed. So what do we have to gain from an opaque type? The usual answer is that it will give flexibility in future changes of the implementation. But since interface and implementation are already mostly decoupled I see no benefit in making converters opaque. That's why I prefer an exposed type. I chose first-class modules because they will at least partly match Stdlib's approach to Maps and Hashtbls and are the natural type to represent a type with associated values and functions. In the end, records are the same, but with first-class functions.

Benchmark

This is the output of bench.exe 200:

type MISC (records) MASTER (first-class modules)
string 90.1+-0.9/s 90.0+-0.8/s
int32_le 95.6+-0.8/s 95.3+-0.8/s
int32_be 96.0+-0.9/s 95.8+-0.8/s
int64_be 99.0+-0.9/s 98.5+-0.9/s
int64_le 99.1+-1.1/s 98.6+-1.0/s
Drup commented 5 years ago

Well, using first class modules in the internals incurs some serious type-y boilerplate, which I'm not fond of.

I've seen various APIs that started as you did, with an exposed typed because "it's decoupled, it's fine, we can still change the internals later" and paint themselves in a corner when the time came to actually change the API. So I prefer the opaque version. It also avoids the duplications between packed and unpacked modules.

madroach commented 5 years ago

Well, using first class modules in the internals incurs some serious type-y boilerplate, which I'm not fond of.

Me too. But the question I'm asking is about the interface, not the internals.

it's fine, we can still change the internals later" and paint themselves in a corner when the time came to actually change the API.

I think I get your point. You propose something like this I suppose:

type 'a t
val create :
        ~deserialise:(bigstring -> t)
        ~serialise:((int -> bigstring) -> t -> bigstring)
        ?flags:Flags.t
val serialise : t -> (int -> bigstring) -> t -> bigstring
val deserialise : t -> (int -> bigstring) -> t -> bigstring
val flags : t -> Flags.t

This looks rather nice and more lightweight than I thought. Something like this I would like and even prefer to first-class modules :smile: And your point was that this is extensible while maintaining backward-compatibility where first-class modules would break backward compatibility when additional fields are added to the underlying type?

Drup commented 5 years ago

Yes, exactly!

Drup commented 4 years ago

@madroach I think we ended up integrating all these changes, one way or another ? We should close.