r2dbc / r2dbc-spi

Service Provider Interface for R2DBC Implementations
Apache License 2.0
419 stars 56 forks source link

Positional Argument Binding Index #33

Closed robertroeser closed 5 years ago

robertroeser commented 5 years ago

Hi,

One thing I find a bit awkward is arguments in the sql statement and the bind arguments. The arguments in a sql statement are indexed from 1 and the arguments that bound are indexed from 0 ie:

.createStatement("insert into foo values ($1, $2, $3, $4)")
                                .bind(0, v1)
                                .bind(1, v2)
                                .bind(2, v3)
                                .bind(3, v4)
                                .execute()

I think it might nice if the bindings started from same place - so they both start are 0 or 1.

mp911de commented 5 years ago

R2DBC uses native parameter bind markers and drivers have to adhere to rules imposed by the driver. For me, this ticket has three facets:

  1. Does Postgres require bind markers to start with $1 or can it also start with $0. That's something to elaborate in https://github.com/r2dbc/r2dbc-postgresql.
  2. Indexed arguments start with 0 or 1: We've seen API that started with index 1 rather than index 0 and this caused a lot of confusion. R2DBC starts with index zero which feels the natural choice. Drivers that represent their first bind marker with a different number than 0, should still allow indexed binding starting at index zero.
  3. The question that arises from here is the order of indexed parameters. While ordering for Postgres (numbered parameters) seems somewhat clear the actual question is: Which one of the following should apply:
.createStatement("insert into foo values ($2, $1)")
                                .bind(0, v1)
                                .bind(1, v2)
                                .execute()

Does bind marker substitution by index 0 apply to $1 or $2? We will see a similar question arise around named bind markers (e.g. createStatement("insert into foo values (@foo, @bar)") for SQL Server).

odrotbohm commented 5 years ago

What if we started the driver implementation bind markers from 0, too? That would differ from JDBC but at least make a consistent API. We have a chance to start from a clean slate and can learn from design choices that caused confusion with JDBC and improve on those.

nebhale commented 5 years ago

This was one of the reasons that the original API didn't expose binding by index; it was eventually added to allow clients to do positional binding.

I (and the PostgreSQL wire protocol) consider the $# style to be a variant of named bind markers. NOTE that this is a PostgreSQL syntax, not a general substitution syntax. They are passed through in the SQL statement and then substituted in from a zero-indexed (with +1 added to match the names) array on the server. I always recommend to people that if you plan on using $1, $2 etc, you should do the same in the binding:

.createStatement("insert into foo values ($1, $2, $3, $4)")
                                .bind("$1", v1)
                                .bind("$2", v2)
                                .bind("$3", v3)
                                .bind("$4", v4)
                                .execute()

On the other hand, that you've gone straight to positional binding is a good datapoint; it might be that named bindings (including $#) aren't actually the preferred system for users. Given the mayhem that JDBC's one-indexed substitutions resulted in, and the fact that mismatch only shows in PostgreSQL's substitution grammar so far, I'm of the opinion that we want to stick with zero-indexing.

@gregturn I'd expect the H2 driver's binding to be zero-based and then +1 to get to the H2 internal style. Is this accurate?

gregturn commented 5 years ago

If $1, $2, etc. translates into pure position syntax on the database itself, I'd vote we aim for 0-based.

It that is NOT the case, then we should capture this difference in each driver and client's documentation with clear messaging.

mp911de commented 5 years ago

Drivers should generally state how they prefer parameter binding. Drivers should prefer either positional or named binding. For Postgres, we can use both because an integer can translate well to the $# syntax. For SQL Server, we require named binding because SQL server has no notion of parameter indexes.

MySQL supports only positional binding because they are using anonymous bind markers that are identified by their positional occurrence.

See #117 for refinement on the specification.