ankane / dbx

A fast, easy-to-use database library for R
Other
187 stars 15 forks

named parameters? #22

Closed r2evans closed 3 years ago

r2evans commented 3 years ago

Parameterization is awesome, and downright mandatory in my eyes. However, at least historically, I've been frustrated by the inconsistency between DBMSes. I think many are converging on ?, but I find it very useful to name the parameters. Naming happens outside the DBMS, so it requires the query to be tweaked before sending it off.

I've found that using named bind-parameters can be better in many ways than un-named parameters, in increasing order of importance (to me):

  1. Repeat parameters can be specified once, and not repeated;
  2. Maintainability: it is much easier to read if the named params are not just ?; and
  3. Order of parameters is not important.

For me, the first is a convenience, the second is more than convenience, and the third is a safeguard.

It might enable

dbxSelect2(db, "SELECT * FROM forecasts WHERE period = ?per AND temperature > ?temp",
           params=list(per = "hour", temp = 27))
dbxSelect2(db, "SELECT * FROM forecasts WHERE id IN (?ids)",
           params=list(ids = 1:3))

This approach does require that the ? is immediately followed by an unambiguous label (no spaces).

This does require changing the query slightly: find the named params, match each with the named argument, then reassign all named params from ?name to ?. It's not that hard pattern-wise, and if it's provided as a separate function (or the same function and an additional argument), then the overhead and risk of query-modification is reduced.

So effectively this becomes a wrapper that takes named-qmarks and a named-list of values, and converts it into unnamed-qmarks and a possibly-reordered, possibly-repeated, possibly-reduced list of values. (I say possibly-reduced because this might allow unused arguments to be ignored. Or one might prefer to enforce perfect matches; that argument can be made as well.)
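A minimal sketch of the wrapper described above, in base R. The name `bind_named` and its `strict` argument are hypothetical and not part of dbx or DBI; the sketch also assumes labels are made of letters, digits, and underscores, and it does not try to skip `?` inside quoted string literals.

```r
# Hypothetical helper (not part of dbx or DBI).
# Rewrites "?name" placeholders to positional "?" and returns the values
# in the order the placeholders appear in the query.
bind_named <- function(sql, params, strict = FALSE) {
  pattern <- "\\?[A-Za-z_][A-Za-z0-9_]*"

  # Placeholder names in order of appearance, without the leading "?"
  found <- substring(regmatches(sql, gregexpr(pattern, sql))[[1]], 2)

  missing <- setdiff(found, names(params))
  if (length(missing) > 0) {
    stop("no value supplied for: ", paste(missing, collapse = ", "))
  }

  # Optionally enforce a perfect match between names and placeholders
  unused <- setdiff(names(params), found)
  if (strict && length(unused) > 0) {
    stop("unused parameters: ", paste(unused, collapse = ", "))
  }

  list(
    sql = gsub(pattern, "?", sql),
    # Indexing by name reorders, repeats, and drops values as needed
    params = unname(params[found])
  )
}

q <- bind_named(
  "SELECT * FROM forecasts WHERE period = ?per AND temperature > ?temp AND period <> ?per",
  list(temp = 27, per = "hour", extra = 1)
)
print(q$sql)
str(q$params)
```

The resulting `q$sql` and `q$params` could then be passed on as an ordinary positional query, for example to the `params` argument of `dbxSelect()`.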

ankane commented 3 years ago

Hey @r2evans, thanks for another suggestion! It makes sense, but for simplicity, don't think I want to support another way to pass parameters right now (dbx currently uses ? for all adapters).

krlmlr commented 3 years ago

What happens if the query contains a literal '?' ?

You can use DBI::sqlInterpolate() and get escaping and named parameters for free. I'll release a fix with r-dbi/DBI#329 when CRAN opens.
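For illustration, a short sketch of `DBI::sqlInterpolate()` with the query from the original post. It assumes the DBI package is installed and uses `ANSI()` as a stand-in connection; with a real connection, the backend's own quoting rules apply. Note that this interpolates safely quoted values into the SQL text on the client side rather than binding them on the server.

```r
library(DBI)

# ANSI() is a dummy connection object that applies standard SQL quoting.
sql <- sqlInterpolate(
  ANSI(),
  "SELECT * FROM forecasts WHERE period = ?per AND temperature > ?temp",
  per = "hour",
  temp = 27
)

# The result is a SQL object with the values quoted and substituted in place
print(sql)
```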

Have you considered using dbBind() (or passing the parameters as the params argument to dbSendQuery())? If you're rewriting queries anyway, you might as well rewrite the placeholders to the backend-specific syntax, which gives you the advantages of parametrized queries. This is the essence of https://github.com/r-dbi/DBI/issues/52, which has been open for too long and which I now think is out of scope for DBI.
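A rough sketch of what rewriting to backend-specific placeholders could look like, in base R. The function `to_backend` and its `style` values are hypothetical; the sketch assumes a backend that takes numbered placeholders (`$1`, `$2`, as PostgreSQL does) or named ones (`:name`, as SQLite does), and it ignores quoted literals for brevity.

```r
# Hypothetical helper (not part of dbx or DBI).
# Rewrites "?name" into numbered ($1) or named (:name) placeholders.
to_backend <- function(sql, params, style = c("dollar", "colon")) {
  style <- match.arg(style)
  pattern <- "\\?[A-Za-z_][A-Za-z0-9_]*"
  m <- gregexpr(pattern, sql)
  found <- substring(regmatches(sql, m)[[1]], 2)
  stopifnot(all(found %in% names(params)))

  # Nothing to rewrite
  if (length(found) == 0) {
    return(list(sql = sql, params = list()))
  }

  used <- unique(found)  # each name is bound once, in first-appearance order

  if (style == "dollar") {
    # A repeated name reuses the same number, so its value is sent once
    regmatches(sql, m) <- list(paste0("$", match(found, used)))
    list(sql = sql, params = unname(params[used]))
  } else {
    # Named placeholders keep their names; the values stay a named list
    regmatches(sql, m) <- list(paste0(":", found))
    list(sql = sql, params = params[used])
  }
}

query <- "SELECT * FROM t WHERE a = ?x AND b = ?y AND c = ?x"
pg <- to_backend(query, list(y = 2, x = 1), style = "dollar")
lite <- to_backend(query, list(y = 2, x = 1), style = "colon")
print(pg$sql)
print(lite$sql)
```

The rewritten query and values could then go to `dbSendQuery()` with `params`, or to `dbBind()`, so that the backend performs the binding.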

r2evans commented 3 years ago

A literal question mark string is not a problem as long as the parsing of the string tokenizes it, seeing '?' instead of ?.
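One way to sketch that tokenizing step in base R is to split the query into quoted literals and the text between them, and rewrite only the latter. The function name `rewrite_outside_literals` is hypothetical, and the sketch handles single-quoted literals only (a doubled `''` is simply seen as two adjacent literals); double-quoted identifiers and SQL comments are not considered.

```r
# Hypothetical helper (not part of dbx or DBI).
# Rewrites "?name" to "?" only in text outside single-quoted literals.
rewrite_outside_literals <- function(sql) {
  lit <- gregexpr("'[^']*'", sql)
  inside <- regmatches(sql, lit)[[1]]                  # the quoted literals
  outside <- regmatches(sql, lit, invert = TRUE)[[1]]  # the text between them

  pattern <- "\\?[A-Za-z_][A-Za-z0-9_]*"
  found <- unlist(lapply(outside, function(s) {
    substring(regmatches(s, gregexpr(pattern, s))[[1]], 2)
  }))
  outside <- gsub(pattern, "?", outside)

  # Interleave the pieces again: outside, literal, outside, literal, ...
  pieces <- character(length(inside) + length(outside))
  pieces[seq(1, by = 2, length.out = length(outside))] <- outside
  pieces[seq(2, by = 2, length.out = length(inside))] <- inside

  list(sql = paste(pieces, collapse = ""), names = found)
}

res <- rewrite_outside_literals(
  "SELECT '?per is literal' AS note, '?' AS q FROM forecasts WHERE period = ?per"
)
print(res$sql)
print(res$names)
```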

I've been doing this in a side package of mine (private, not in GitHub), and have been using it consistently with sql server, postgres, and sqlite alike.

krlmlr commented 3 years ago

Is dbx parsing the strings, or looking for ? ?

r2evans commented 3 years ago

@ankane, one thought on why I think named parameters are a good thing to support and a better thing for dev and production: for readability and clarity, do you prefer sprintf or the concept of glue? Python has similar concepts with its "".format(), dictionary string expansion "" % (,,), and most recently its f-strings, and I think there is great value in shifting the mindset away from positional bindings to named bindings.

My suggestion is not necessarily to replace ? altogether, though supporting both requires some extra safeguards. If for example

dbSelect <- function(..., named = FALSE) { ... }

then nothing is changed. I'd think strictly "all-or-nothing" would be best.

This isn't supported in DBI, as that is as low-level as it needs to be, and changes to the query (which this suggestion does require) should probably be avoided at that level. The next place for this to be an option is in an extension package. I argue that dbx is the place for this, and is in a good position to encourage good/safe practices.

At the risk of being a little dramatic, I suggest that named parameters are one of these "good practices" that should be encouraged for new users (well, and experienced users, too). Whenever I onboard somebody into SQL, the first thing I do is discuss the tripping-points of positional-only parameters. I think dbx is well positioned to enable that encouragement (not require it).

Just my $0.02, thanks.

ankane commented 3 years ago

Hey @r2evans, I understand that it can be useful in some situations, but don't want to support a second way of passing parameters right now. I appreciate the suggestion.