Optimize FFI: Use unsafe FFI calling convention for column data accessors

nurpax commented 11 years ago

FFI calling convention for C functions defaults to 'safe' which incurs a heavy per-call overhead. Use of 'unsafe' should be fine for the column data accessors as they do not callback to Haskell from C. Furthermore, no I/O should be happening in these accessors, all these functions do is move data over from C land to Haskell.

Performance delta in a synthetic benchmark such as https://github.com/nurpax/db-bench is big -- almost 3x. The simplest benchmark case speeds up from 2.4M rows/sec to 7.2M rows/sec. The nice thing about this is that direct-sqlite now also beats Python sqlite3 module by a factor of 2.8x whereas before they were about the same speed.

nurpax commented 11 years ago

This is a result of various work that was done with @Emm on sqlite-simple. @joeyadams might want to review this change too.

I think this is a safe change - I wanted to put this up for review while I dig up more information about the "unsafe" calling convention.

You will see I didn't make all functions into the C bindings unsafe. I think most of them could be unsafe, but it might be better to use it sparingly so that future changes have less chance to break things.

nurpax commented 11 years ago

I found this article about unsafe: http://blog.ezyang.com/2010/07/safety-first-ffi-and-threading/

The kneejerk reaction is to panic about "unsafe" but reading Simon Marlow's first comment suggests that unsafe definitely has its uses.

joeyadams commented 11 years ago

Thanks for backing up this change with a benchmark.

However, sqlite3_step should definitely not be marked unsafe. It does the heavy lifting for the query. If the query takes a long time to return a row (e.g. SELECT sum(x) FROM foo), it could block the whole RTS during the operation.

Strictly speaking, some of the column accessors shouldn't be marked unsafe either, since they take a mutex lock. If one thread holds the sqlite3 mutex for a long time, then another thread does an unsafe sqlite3_column_type call, that could block the whole RTS until the first thread releases the lock. Fortunately, this is only a problem if the caller accesses the Database object from multiple threads, and does long-running queries with the same object. That wouldn't be a good idea anyway.

Functions that should definitely be marked safe, in my opinion:

sqlite3_step: blocks during query operation
sqlite3_exec: executes a sequence of queries, and may call back into Haskell
sqlite3_open, sqlite3_prepare_v2, sqlite3_close: not called enough to be worth the risk.

I think we can make the rest unsafe.

nurpax commented 11 years ago

Oh, I actually didn't mean to make sqlite3_step unsafe - it accidentally slipped in this change. I'll re-run all my benchmarking without it and report back here.

Using the same Database object from multiple threads sounds like a bit of a bad idea anyway. This may open up opportunities for nasty bugs like trying to use a single prepared statement from multiple threads - this is not safe.

Agreed on your list of "definitely safe" functions.

I'll update this PR.

nurpax commented 11 years ago

There's definitely a difference between safe & unsafe step.

safe:

benchmarking sqlite-simple: SELECT Ints mean: 6.278979 ms, lb 6.167044 ms, ub 6.410193 ms, ci 0.950 std dev: 622.3236 us, lb 521.5497 us, ub 758.5229 us, ci 0.950 found 5 outliers among 100 samples (5.0%) 5 (5.0%) high mild variance introduced by outliers: 78.982% variance is severely inflated by outliers

unsafe:

benchmarking sqlite-simple: SELECT Ints mean: 4.222646 ms, lb 4.125593 ms, ub 4.390202 ms, ci 0.950 std dev: 637.1770 us, lb 438.5706 us, ub 1.152351 ms, ci 0.950 found 1 outliers among 100 samples (1.0%) 1 (1.0%) high severe variance introduced by outliers: 90.432% variance is severely inflated by outliers

It's still around 2x faster than what we started with, so a good win anyway.

nurpax commented 11 years ago

Force push updated the change (above).

From the commit message, updated perf results:

Performance delta in a synthetic benchmark such as
https://github.com/nurpax/db-bench is significant.  Benchmarking the
direct-sqlite 'columns' call, we can see an improvement from 1.2M
rows/sec to 2.9M rows/sec.  The nice thing about this is that
direct-sqlite now also beats Python sqlite3 module by about 50%.

Ran both direct-sqlite and sqlite-simple unit test suites on this change too, both passed.

P.S. As further incentive to merge this PR, I'm happy to inform reviewers that in my benchmark, direct-sqlite is now 34x faster than hdbc-sqlite. :)

IreneKnapp commented 11 years ago

Whoa, a lot of conversation. There's a point I'd like to raise before I accept this change, although I'm inclined to do so anyway once it's been discussed.

Okay, so at present, we don't expose the portion of the sqlite API that lets us define custom SQL functions. But in principle, there could be arbitrary callbacks happening, which would indeed be C calling into Haskell. Now, it is my suspicion that these are all taken care of by sqlite3_step, which I see from the above discussion is kept safe, with the actual column accessors only reading out already-computed values. It would be great to have some confirmation of that though.

No, you know what, I'm going to merge this now and do a release as well. Everyone's done good work on the benchmarking and stuff, and I don't think the above is a serious concern. Still I'm leaving it in because I'd like there to be some record of it.

IreneKnapp / direct-sqlite

Optimize FFI: Use unsafe FFI calling convention for column data accessors #20