r-spatial / sfdbi

DBI interface to sf
https://r-spatial.github.io/sfdbi/
Other
53 stars 0 forks source link

SFSQL and SFSQL_PG driver for lazy_tbl #2

Closed mdsumner closed 2 months ago

mdsumner commented 4 years ago

SFSQL() wraps the basic sql query to read_sf(), passing in the DSN argument in the usual way (file, url, connection string) with a dummy layer value.

SFSQL_PG() adds another level to allow input of host,dbname,user,password which is passed down as the 'PG:...' connection string to SFSQL().

There is a lot I haven't considered, readonly is ignored, security/obfuscation on the connection string is ignored, and no detail on the difference between DB-drivers and non-DB:

i.e. DB drivers have the SQL passed in direct, so it is engine-specific, otherwise for non-DB afaik:

Discussed here: https://github.com/mdsumner/RGDALSQL/issues/7

Other:

Examples, first pseudocode for postgresql

library(sfdbi)   ## mdsumner/sfdbi@ogrsql branch

## PSEUDOCODE
db <- dbConnect(SFSQL_PG(), 
                host = Sys.getenv("host"),
                dbname = "databasename",
                user = Sys.getenv("user"),
                password = Sys.getenv("password")
)

## that is equivalent to 
#db <- dbConnect(SFSQL(), "PG:host='..' dbname='..' user='..' password='..'")

library(dplyr)
tbl(db, "planes")

Examples with SHP and GPKG

gfile <- system.file("gpkg/nc.gpkg", package = "sf", mustWork = TRUE)
sfile <- system.file("shape/nc.shp", package = "sf", mustWork = TRUE)

gpkg <- tbl(dbConnect(SFSQL(), gfile), "nc.gpkg")
gpkg %>% 
  filter(NAME %LIKE% 'A%') %>% 
  mutate(abc = 1.3) %>% 
  select(abc, NAME, geom) %>% 
  arrange(desc(NAME))  ## %>% show_query()

## note geom column *name*
shp <- tbl(dbConnect(SFSQL(), sfile), "nc")
shp %>% 
  filter(NAME %LIKE% 'A%') %>% 
  mutate(abc = 1.3) %>% 
  select(abc, NAME, `_ogr_geometry_`) %>% 
  arrange(desc(NAME))  #%>% show_query()
mdsumner commented 4 years ago

note that non-DB drivers will keep the geometry column whether it's in the query or not

GPKG drops 'geom' as it's not selected, but SHP keeps it, I don't have a strong opinion on this ...

gpkg %>% 
  filter(NAME %LIKE% 'A%') %>% 
  mutate(abc = 1.3) %>% 
  select(abc, NAME)

image

etiennebr commented 4 years ago

Thanks @mdsumner! This looks great. I'll give it a serious look in the next few days.

mdsumner commented 4 years ago

I went and made this PR into a package, so you can either use that or crib the DBI stuff from it when you're ready:

https://github.com/mdsumner/lazysf

If I get more into DBs I might come back and look at the more general case here, but rn I need this to get on with more general outputs from the lazy query.