Breaking API improvements for better error handling and type safety

joeyadams commented 12 years ago

This extends pull request #12. Once #12 is merged, the number of commits here should go down.

This pull request makes the following breaking changes:

Newtypes ParamIndex and ColumnIndex are used (instead of Int) to improve signatures and guard against misuse.
When SQLite3 reports an error, a SQLError is thrown instead of an IOError. This way, users can handle specific errors.

This does not change the error message displayed when a query fails, except that it no longer says "user error (...)".
Fields in SQLData are strict now.
reset and finalize no longer throw exceptions. See commit 85f36c7.

It also adds a few new functions: exec, clearBindings, and bindSQLData. bindSQLData binds a single value; I decided not to rename bind after all (see commit 697c36e).

One change that I did not make yet, which we might want to consider, is using Text instead of String for query strings and such. This might be a little less convenient to use (you have to use {-# LANGUAGE OverloadedStrings #-}), but it means less conversions for sqlite-simple, persistent-sqlite, and direct-sqlite itself.

nurpax commented 12 years ago

I like all of the changes except for the ParamIndex/ColumnIndex thing which seems unnecessary to me. I'd just keep using Int for these indices. I don't see the old use of Ints for these being particularly dangerous.

How strongly do you feel about the need for that?

My only other comment is that since this is a big change, how did you test this? The existing test suite that I wrote is quite minimal and was written to mostly test stuff I've added recently (like bindParameterCount etc.). It might not be a bad idea to add a few sanity tests for the public API to see that your refactoring didn't break things.

nurpax commented 12 years ago

On testing.. My sqlite-simple test suite might be useful for some additional test coverage. But writing directed tests on the Database.SQLite3 API shouldn't be too much work either..

joeyadams commented 12 years ago

Thanks for the review, @nurpax.

My only other comment is that since this is a big change, how did you test this?

Most of my testing was through GHCi. This helped me understand the behavior of parameter indices, which I documented.

The existing test suite that I wrote is quite minimal and was written to mostly test stuff I've added recently (like bindParameterCount etc.). It might not be a bad idea to add a few sanity tests for the public API to see that your refactoring didn't break things.

Agreed. I'll try to add some soon.

I like all of the changes except for the ParamIndex/ColumnIndex thing which seems unnecessary to me. I'd just keep using Int for these indices. I don't see the old use of Ints for these being particularly dangerous.

How strongly do you feel about the need for that?

These provide two benefits:

Clearer type signatures

It's easy to see that the bind family always takes a parameter index (one-based) as the second argument, and that the column family always takes a column index (zero-based) as the second argument.

This benefit could also be had if we used type aliases instead of newtypes. In any case, I'm not keen on going back to this:

bindInt :: Statement -> Int -> Int -> IO ()

More compile-time safety

This makes it less likely for users to accidentally mix up parameters. However, the real payoff is within direct-sqlite itself. ParamIndex and ColumnIndex are used all the way down to the level of foreign imports:

foreign import ccall "sqlite3_bind_parameter_count"
    c_sqlite3_bind_parameter_count :: Ptr CStatement -> IO ParamIndex
...
foreign import ccall "sqlite3_bind_int64"
    c_sqlite3_bind_int64    :: Ptr CStatement -> ParamIndex -> Int64 -> IO CError
...
foreign import ccall "sqlite3_column_int64"
    c_sqlite3_column_int64  :: Ptr CStatement -> ColumnIndex -> IO Int64

This way, the compiler ensures we do not mix up indices between the FFI bindings and our API. All we have to prove is that:

The foreign imports are correct (by comparing the FFI signatures to the SQLite3 C API documentation).
The four uses of fromIntegral in direct-sqlite3 are justified.

Let's look at one of those fromIntegral uses, one which might be more representative of what users would encounter:

bind :: Statement -> [SQLData] -> IO ()
bind statement sqlData = do
    nParams <- fromIntegral <$> bindParameterCount statement
    when (nParams /= length sqlData) $
        fail ("mismatched parameter count for bind.  Prepared statement "++
              "needs "++ show nParams ++ ", " ++ show (length sqlData) ++" given")
    zipWithM_ (bindSQLData statement) [1..] sqlData

This looks rather inconvenient. But it's also a red flag, to call into question whether converting between a parameter index and a list length makes sense.

The reason this is correct is not as trivial as it looks. bindParameterCount doesn't actually return a count:

This routine actually returns the index of the largest (rightmost) parameter. For all forms except ?NNN, this will correspond to the number of unique parameters. If parameters of the ?NNN form are used, there may be gaps in the list.

On the other hand, it does return a count. It returns the highest parameter number, or 0 if the query has no parameters. If you try to bind a parameter N that isn't used, it silently succeeds, assuming 1 <= N <= bindParameterCount. But if N is < 1 or > bindParameterCount, you get ErrorRange.

joeyadams commented 12 years ago

By the way, I built sqlite-simple against the new direct-sqlite. I only had to add one fromIntegral, with the same exact rationale as bind.

  stmtParamCount <- Base.bindParameterCount stmt
  when (length qp /= fromIntegral stmtParamCount) (throwColumnMismatch qp stmtParamCount)

After this minor modification, sqlite-simple's test suite passes.

nurpax commented 12 years ago

Fair points, I'm ok with your proposal.

joeyadams commented 12 years ago

Thanks for encouraging me to write tests. I've already spotted two flaws, and I haven't even written any tests yet!

open leaks a sqlite3 object on failure. From the documentation of sqlite3_open:

A database connection handle is usually returned in ppDb, even if an error occurs. The only exception is that if SQLite is unable to allocate memory to hold the sqlite3 object, a NULL will be written into ppDb instead of a pointer to the sqlite3 object.
prepare, when given an empty query string, "succeeds", returning Statement nullPtr. Think we should make it throw an exception instead?

joeyadams commented 12 years ago

These commits fix the issues with open and prepare described in my last post, and add a bunch of tests. See the commit messages for a more concise listing of properties tested.

Some of these tests may be testing undefined behavior (e.g. calling columnCount on a newly-prepared statement). I would consider this a good thing; if we upgrade sqlite3.c in the future and get failing tests, then we can warn users about what to avoid.

nurpax commented 12 years ago

Good stuff! I didn't review all the test code but it looked good in general.

(Btw, this is a lot of commits to review separately. I won't complain if in the future you decide use interactive rebase to squash some of these together. ;))

Would be great to get some of your changes in so that you can continue in separate pull requests.

IreneKnapp commented 12 years ago

Same rationale as last one - if it does what it says, it's fine. I had to chew on the newtypes but I'm okay with them.

IreneKnapp commented 12 years ago

Will have to do it tonight though, can't do it on the web for whatever reason, probably needs a rebase.

joeyadams commented 12 years ago

I pushed a merge commit with the new master. Accepting the pull request will probably work now.

The main reason I did a merge instead of a rebase is that a rebase would change all the commit hashes, breaking my "see commit 697c36e" links.

IreneKnapp / direct-sqlite

Breaking API improvements for better error handling and type safety #13

Clearer type signatures

More compile-time safety