elixir-explorer / adbc

Apache Arrow ADBC bindings for Elixir
https://arrow.apache.org/adbc/
Apache License 2.0

feat: support named parameters and bulk inserts #66

Closed (cocoa-xu closed 3 months ago)

cocoa-xu commented 4 months ago

Databases like Google BigQuery also support named parameters in SQL queries:

To specify a named parameter, use the @ character followed by an identifier, such as @param_name. Alternatively, use the placeholder value ? to specify a positional parameter. Note that a query can use positional or named parameters but not both.

To support named parameters, we need to change elixir_to_arrow_type_struct in adbc_nif.cpp and allow users to pass a map/keyword list:
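As a rough illustration of the Elixir side of that change, here is a minimal sketch that classifies the parameters a caller passes in. `ParamShapeSketch` and `normalize/1` are hypothetical names and not part of the adbc package; the real work would still happen in the NIF. It treats maps and keyword lists as named parameters and any other list as positional.

```elixir
defmodule ParamShapeSketch do
  # Hypothetical helper for illustration only; not part of the adbc package.

  # A plain map (not a struct) is treated as named parameters.
  # Keys are converted to strings and sorted so the output order is stable.
  def normalize(params) when is_map(params) and not is_struct(params) do
    named =
      params
      |> Enum.map(fn {key, value} -> {to_string(key), value} end)
      |> Enum.sort()

    {:named, named}
  end

  def normalize(params) when is_list(params) do
    cond do
      # An empty list is also a valid keyword list, so handle it first.
      params == [] ->
        {:positional, []}

      # Keyword lists keep the order the caller wrote them in.
      Keyword.keyword?(params) ->
        {:named, Enum.map(params, fn {key, value} -> {Atom.to_string(key), value} end)}

      true ->
        {:positional, params}
    end
  end
end
```

One caveat of this shape-based approach: a positional list whose elements all happen to be `{atom, value}` tuples would be read as a keyword list, so a real API might want an explicit option instead.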

josevalim commented 4 months ago

I wonder if we should introduce a proper buffer API instead. IIRC, for bulk inserts, each column is an arrow buffer. So maybe we should have: ADBC.Buffer.u8([0, 2, 3]) and so forth.

So the supported arguments would be:

[
  123,                    # int64,  inferred
  "string",               # string, inferred
  4.56,                   # double, inferred
  true,                   # bool,   inferred
  false,                  # bool,   inferred
  nil,                    # na,     inferred
  %ADBC.Buffer{}
]

Then, for named arguments, we support either maps or keyword lists. We can also provide query APIs that return buffers, so that we can easily pass the result of a query to another query. WDYT?
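To make the inference idea concrete, here is a small sketch of how such arguments might be mapped to type tags. Everything here is an assumption for illustration: `BufferSketch` stands in for the proposed ADBC.Buffer, its `type` and `data` fields are made up, and `InferSketch.infer/1` is a hypothetical helper rather than anything in the library.

```elixir
defmodule BufferSketch do
  # Hypothetical stand-in for the proposed ADBC.Buffer; field names are assumptions.
  defstruct [:type, :data]

  # Builds a typed column of unsigned 8-bit integers, rejecting out-of-range values.
  def u8(values) when is_list(values) do
    if Enum.all?(values, &(is_integer(&1) and &1 in 0..255)) do
      %__MODULE__{type: :u8, data: values}
    else
      raise ArgumentError, "u8 buffers only accept integers in 0..255"
    end
  end
end

defmodule InferSketch do
  # Hypothetical helper: maps a plain Elixir value to the type tag from the list above.
  # Booleans are matched before other clauses because true and false are atoms.
  def infer(value) when is_boolean(value), do: :bool
  def infer(nil), do: :na
  def infer(value) when is_integer(value), do: :int64
  def infer(value) when is_float(value), do: :double
  def infer(value) when is_binary(value), do: :string

  # A buffer carries its type explicitly, so nothing is inferred.
  def infer(%BufferSketch{type: type}), do: type
end
```

With these definitions, `Enum.map([123, "string", 4.56, true, nil, BufferSketch.u8([0, 2, 3])], &InferSketch.infer/1)` would tag each argument, with the buffer reporting the type it was built with.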

cocoa-xu commented 4 months ago

Ahh, ADBC.Buffer sounds definitely better! I'll try to implement it and send a PR :)