apache / arrow-adbc

Database connectivity API standard and libraries for Apache Arrow
https://arrow.apache.org/adbc/
Apache License 2.0

[R] Schema stability for adbcsqlite? #1591

Open krlmlr opened 4 months ago

krlmlr commented 4 months ago

What feature or improvement would you like to see?

Modern SQLite offers an option to enforce schema stability. Should this be the default for adbcsqlite?
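
For readers unfamiliar with the option, here is a minimal sketch, assuming the comment refers to SQLite's STRICT tables (available from SQLite 3.37.0). That reading is an assumption, not something the thread confirms. The sketch uses Python's standard `sqlite3` module rather than the ADBC driver, and the helper name `strict_rejects_mismatch` is made up for illustration.

```python
import sqlite3


def strict_rejects_mismatch():
    """Return True if a STRICT table rejects a TEXT value in an INTEGER column.

    Returns None when the linked SQLite library predates STRICT tables (3.37.0).
    """
    if sqlite3.sqlite_version_info < (3, 37, 0):
        return None
    con = sqlite3.connect(":memory:")
    try:
        # STRICT makes SQLite enforce the declared column type on write.
        con.execute("CREATE TABLE test (a INTEGER) STRICT")
        try:
            con.execute("INSERT INTO test VALUES ('not a number')")
        except sqlite3.DatabaseError:
            return True  # the mismatched value was refused
        return False
    finally:
        con.close()


result = strict_rejects_mismatch()
print(result)
```

Note that this only constrains what can be written. It does not by itself tell a reader what type to report for a table with no rows.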

We're running into DBItest problems: creating an empty table doesn't result in the correct schema when queried.

library(adbi)

con <- dbConnect(adbi("adbcsqlite"))

data <- data.frame(a = "data")
ptype <- data[0, , drop = FALSE]

dbCreateTable(con, "test", ptype)
waldo::compare(dbReadTable(con, "test"), ptype)
#> `old$a` is an S3 object of class <integer64>, a double vector
#> `new$a` is a character vector ()

Created on 2024-03-05 with reprex v2.1.0

lidavidm commented 4 months ago

Yeah, the fundamental problem is the same for all of these: we determine the type by trying to read data, and we don't yet look at the declared type.
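
A rough illustration of the distinction, using Python's standard `sqlite3` module instead of the ADBC driver (so this is not the driver's code): an empty table yields no values to infer a type from, while the declared type is still available from the schema, here via `PRAGMA table_info`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (a TEXT)")

# Reading the empty table gives column names but no values to inspect.
cur = con.execute("SELECT a FROM test")
rows = cur.fetchall()
column_names = [d[0] for d in cur.description]

# The declared type lives in the schema; table_info rows are
# (cid, name, type, notnull, dflt_value, pk).
declared = {row[1]: row[2] for row in con.execute("PRAGMA table_info('test')")}
con.close()

print(rows)          # no data, so nothing to infer from
print(column_names)
print(declared)
```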

paleolimbot commented 4 months ago

Another fundamental problem is that the SQLite driver in its current state is difficult to modify. I think there is a general consensus that we should refactor it to make it easier to implement things like this; however, we haven't gotten there yet.

lidavidm commented 4 months ago

Yes, now that @paleolimbot has put in the base driver framework, I'm going to spend some time pushing it forward with the PostgreSQL/SQLite drivers and hopefully bring things on par.

That said, for SQLite I think this would only require changing the reader code, which is more self-contained.
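
One way such a reader-side change could look, sketched in Python rather than in the driver's own code: infer the type from data when there is any, and otherwise fall back to the declared type. `declared_affinity` follows SQLite's documented column-affinity rules. `resolve_column_type`, the Arrow type names, and the choice of `int64` for NUMERIC are hypothetical and only illustrate the idea.

```python
def declared_affinity(decl_type):
    """Map a declared column type to its SQLite affinity (rules applied in order)."""
    t = (decl_type or "").strip().upper()
    if "INT" in t:
        return "INTEGER"
    if any(k in t for k in ("CHAR", "CLOB", "TEXT")):
        return "TEXT"
    if "BLOB" in t or not t:
        return "BLOB"
    if any(k in t for k in ("REAL", "FLOA", "DOUB")):
        return "REAL"
    return "NUMERIC"


# Hypothetical mapping to Arrow type names; NUMERIC -> int64 is an arbitrary choice.
AFFINITY_TO_ARROW = {
    "INTEGER": "int64",
    "TEXT": "string",
    "BLOB": "binary",
    "REAL": "double",
    "NUMERIC": "int64",
}

# Hypothetical mapping from observed Python values to Arrow type names.
VALUE_TO_ARROW = {int: "int64", float: "double", str: "string", bytes: "binary"}


def resolve_column_type(values, decl_type):
    """Infer from the first non-null value; fall back to the declared type."""
    for v in values:
        if v is not None:
            return VALUE_TO_ARROW[type(v)]
    return AFFINITY_TO_ARROW[declared_affinity(decl_type)]


print(resolve_column_type([], "TEXT"))         # empty table: declared type decides
print(resolve_column_type([None, 1.5], "TEXT"))  # data present: data decides
```

With a fallback of this kind, the empty `test` table from the reprex would report a string column instead of defaulting to a 64-bit integer.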