elixir-explorer / adbc

Apache Arrow ADBC bindings for Elixir
https://arrow.apache.org/adbc/
Apache License 2.0

feat: parse dates, times, and timestamps #30

Closed superhawk610 closed 1 year ago

superhawk610 commented 1 year ago

This PR adds support for some Arrow temporal formats: dates, times, and timestamps. I'm primarily using ADBC with Snowflake, which doesn't support the full set of Arrow temporal formats (no timestamps with timezones, durations, or intervals).

These three formats can be tested using the following query:

db_opts = [driver: :snowflake] # remaining Snowflake connection options elided
{:ok, db} = Adbc.Database.start_link(db_opts)
{:ok, conn} = Adbc.Connection.start_link(database: db)

query = """
select
  '2023-03-01T10:23:45.123456'::timestamp "datetime",
  '2023-03-01'::date "date",
  '10:23:45.123456'::time "time"
"""

Adbc.Connection.query!(conn, query)
# %Adbc.Result{
#   num_rows: 1,
#   data: %{
#     "date" => [~D[2023-03-01]],
#     "datetime" => [~N[2023-03-01 10:23:45.123456]],
#     "time" => [~T[10:23:45.123456]]
#   }
# }
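
For readers unfamiliar with how Arrow encodes these values: dates, times, and timestamps travel as plain integers (days since the Unix epoch for date32, a count of micro- or nanoseconds since midnight for time64, and a count since the epoch for timestamps). The sketch below shows one way such integers could be mapped onto Elixir's `Date`, `Time`, and `NaiveDateTime`. The module `ArrowTemporalSketch` and its function names are hypothetical and are not the code in this PR; it only assumes the standard library functions `Date.add/2`, `Time.add/3`, and `NaiveDateTime.add/3`.

```elixir
defmodule ArrowTemporalSketch do
  # Hypothetical helper for illustration only; not part of Adbc.
  # Elixir's Time and NaiveDateTime hold at most microsecond precision,
  # so nanosecond inputs are reduced to microseconds first.

  @epoch_date ~D[1970-01-01]
  @epoch_naive ~N[1970-01-01 00:00:00.000000]
  @midnight ~T[00:00:00.000000]

  # Arrow date32: number of days since 1970-01-01.
  def date32(days) when is_integer(days), do: Date.add(@epoch_date, days)

  # Arrow time64: micro- or nanoseconds since midnight.
  def time64(value, :microsecond) when is_integer(value) do
    Time.add(@midnight, value, :microsecond)
  end

  def time64(value, :nanosecond) when is_integer(value) do
    time64(div(value, 1000), :microsecond)
  end

  # Arrow timestamp without a timezone: micro- or nanoseconds since the epoch.
  def timestamp(value, :microsecond) when is_integer(value) do
    NaiveDateTime.add(@epoch_naive, value, :microsecond)
  end

  def timestamp(value, :nanosecond) when is_integer(value) do
    # floor_div keeps pre-epoch (negative) values on the correct microsecond.
    timestamp(Integer.floor_div(value, 1000), :microsecond)
  end
end

ArrowTemporalSketch.date32(19_417)
# ~D[2023-03-01]
ArrowTemporalSketch.time64(37_425_123_456, :microsecond)
# ~T[10:23:45.123456]
ArrowTemporalSketch.timestamp(1_677_666_225_123_456, :microsecond)
# ~N[2023-03-01 10:23:45.123456]
```

Starting from epoch values that already carry six fractional digits keeps the results at microsecond precision, which matches the values shown in the query output above.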

superhawk610 commented 1 year ago

While working on parsing times, I ran into this Go panic:

panic: arrow/array: column "time" type mismatch: got=float64, want=time64[ns]

goroutine 31 [running]:
github.com/apache/arrow/go/v13/arrow/array.NewRecord(0xc000577b00, {0xc0004efa80, 0x4, 0xc0004efa00?}, 0x1)
  /Users/runner/go/pkg/mod/github.com/apache/arrow/go/v13@v13.0.0-20230620164925-94af6c3c9646/arrow/array/record.go:151 +0x173
github.com/apache/arrow-adbc/go/adbc/driver/snowflake.getRecTransformer.func1({0x113ffcf58, 0xc0008563c0}, {0x11401ab10, 0xc000524900})
  /Users/runner/work/arrow-adbc/arrow-adbc/adbc/go/adbc/driver/snowflake/record_reader.go:65 +0x1ad
github.com/apache/arrow-adbc/go/adbc/driver/snowflake.newRecordReader.func2()
  /Users/runner/work/arrow-adbc/arrow-adbc/adbc/go/adbc/driver/snowflake/record_reader.go:283 +0x1c3
golang.org/x/sync/errgroup.(*Group).Go.func1()
  /Users/runner/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
  /Users/runner/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:72 +0xa5

josevalim commented 1 year ago

Hi @superhawk610! I believe the bug you are seeing is this one: https://github.com/apache/arrow-adbc/pull/1021

You can try using v0.7.0-rc and see if it addresses it: https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-0.7.0

I will add a guide on updating ADBC, but it should be:

  1. Copy root files and c/ directory from ADBC into 3rdparty
  2. Update the driver version: https://github.com/elixir-explorer/adbc/blob/main/lib/adbc_driver.ex#L9
  3. ...
  4. Profit

Thanks for the PR and let me know if there is anything else we can help with. :)

superhawk610 commented 1 year ago

That fixed it! Should I check in the driver version upgrade with this PR?

josevalim commented 1 year ago

@superhawk610 I will do it in another PR, because I need to review all of the code in an external contribution. :) I will ping you on that PR so you can double-check it.

josevalim commented 1 year ago

Btw, can you add tests using sqlite or PG? Or should I push them after merging?

superhawk610 commented 1 year ago

I can add tests early next week!

josevalim commented 1 year ago

:green_heart: :blue_heart: :purple_heart: :yellow_heart: :heart: