r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.35k stars 299 forks source link

st_write() not writing spatial metadata to sqlite database #2338

Closed bradleyswilson closed 9 months ago

bradleyswilson commented 9 months ago

I'm encountering some strange behavior with st_write() that I can't seem to sort out.

I can create/write to a spatialite database with st_write(), but it doesn't seem to be populating the proper geometry columns in the spatial metadata. As a result, other software doesn't recognize the data as valid geometries.

Strangely, reading the data back in with st_read() works just fine. Any ideas?

I've attached a simple reproducible example below/

library(sf)
library(DBI)

# new test database
con <- dbConnect(RSQLite::SQLite(), 
                 dbname = 'test.sqlite',
                 loadable.extensions=TRUE,
                 extended_types=TRUE)

# path to spatialite library
dbExecute(con, "SELECT InitSpatialMetadata(1);")

point_data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "B", "A"),
  lon = c(-99.1332, -99.1325, -99.134, -99.135, -99.136),
  lat = c(19.4326, 19.4328, 19.433, 19.4335, 19.434)
)

st_write(points_sf, dsn=con, layer="test", append=FALSE)

# this returns 0 rows
dbGetQuery(con, "SELECT * FROM geometry_columns WHERE f_table_name = 'test';") 
edzer commented 9 months ago

I get

library(sf)
# Linking to GEOS 3.12.1, GDAL 3.8.3, PROJ 9.3.1; sf_use_s2() is TRUE
library(DBI)

# new test database
con <- dbConnect(RSQLite::SQLite(), 
                 dbname = 'test.sqlite',
                 loadable.extensions=TRUE,
                 extended_types=TRUE)

# path to spatialite library
dbExecute(con, "SELECT InitSpatialMetadata(1);")
# Error: no such function: InitSpatialMetadata
# Execution halted
bradleyswilson commented 9 months ago

Do you have spatialite extension installed? That error suggests SQLite isn't finding the local spatialite extension.

With RSQLite.spatialite installed:

con <- dbConnect(RSQLite::SQLite(), 
                 dbname = 'test.sqlite',
                 loadable.extensions=TRUE,
                 extended_types=TRUE)

RSQLite.spatialite::init_spatialite(con)

dbExecute(con, "SELECT InitSpatialMetadata(1);")
edzer commented 9 months ago

That won't install; trying remotes::install_github("pschmied/RSQLite.spatialite") I get

Installing package into ‘/home/edzer/R/x86_64-pc-linux-gnu-library/4.3’
(as ‘lib’ is unspecified)
* installing *source* package ‘RSQLite.spatialite’ ...
** using staged installation
** libs
using C compiler: ‘gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’
gcc -I"/usr/share/R/include" -DNDEBUG -DTHREADSAFE=0 -I'/home/edzer/R/x86_64-pc-linux-gnu-library/4.3/RSQLite/include'    -DVERSION=\"3.0.1\" -D_LARGE_FILE=1 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE=1 -DSQLITE_ENABLE_RTREE=1 -DNDEBUG=1 -DSPL_AMALGAMATION=1 -DOMIT_ICONV    -DOMIT_FREEXL  -fpic  -g -O2 -ffile-prefix-map=/build/r-base-H0vbME/r-base-4.3.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2  -c spatialite.c -o spatialite.o
spatialite.c:106:10: fatal error: proj_api.h: No such file or directory
  106 | #include <proj_api.h>
      |          ^~~~~~~~~~~~
compilation terminated.
make: *** [/usr/lib/R/etc/Makeconf:191: spatialite.o] Error 1
ERROR: compilation failed for package ‘RSQLite.spatialite’

proj_api.h is very outdated.

bradleyswilson commented 9 months ago

Ah, dang, was trying to provide an easier way to reproduce. On my end I run this command against my spatialite lib location to get spatialite loaded properly

dbExecute(con, paste0("SELECT load_extension('",path_to_spatialite_lib,"');"))

I can close if you can't get a valid spatialite install working on your end, I'm not sure how to get to the core of my issue without it.

rsbivand commented 9 months ago

Please provide the output of sessionInfo() - which platform are you using? What are versions of packages in play? RSQLite.spatialite has not been touched since 2013, so should only be used with SQLite and compilers of that vintage. On Windows, GDAL is built with spatialite, so using that driver should work. My guess would be that your problems in sharing files with others is because your spatialite interface is outdated. Much other geospatial software uses GDAL, so writing with GDAL rather than unsupported outdated software is more likely to work.

On Windows:

> sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] C

time zone: Europe/Oslo
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sf_1.0-15

loaded via a namespace (and not attached):
 [1] utf8_1.2.4         R6_2.5.1           tidyselect_1.2.0   e1071_1.7-14      
 [5] magrittr_2.0.3     glue_1.7.0         tibble_3.2.1       KernSmooth_2.23-22
 [9] pkgconfig_2.0.3    generics_0.1.3     dplyr_1.1.4        lifecycle_1.0.4   
[13] classInt_0.4-10    cli_3.6.2          fansi_1.0.6        grid_4.3.2        
[17] vctrs_0.6.5        DBI_1.2.1          proxy_0.4-27       class_7.3-22      
[21] compiler_4.3.2     tools_4.3.2        pillar_1.9.0       Rcpp_1.0.12       
[25] rlang_1.1.3        units_0.8-5       
> 

With the specific driver variants (https://gdal.org/drivers/vector/sqlite.html#vector-sqlite):

> library(sf)
Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
> nc <- st_read(system.file("gpkg/nc.gpkg", package="sf"))
Reading layer `nc.gpkg' from data source 
  `C:\Users\RB\AppData\Local\R\win-library\4.3\sf\gpkg\nc.gpkg' 
  using driver `GPKG'
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS:  NAD27
> fn <- tempfile(fileext=".sqlite")
> st_write(nc, fn, driver="SQLite")
Writing layer `file19145ea92ef5' to data source 
  `C:\Users\RB\AppData\Local\Temp\Rtmpkdo7D1\file19145ea92ef5.sqlite' using driver `SQLite'
Writing 100 features with 14 fields and geometry type Multi Polygon.
> st_read(fn)
Reading layer `file19145ea92ef5' from data source 
  `C:\Users\RB\AppData\Local\Temp\Rtmpkdo7D1\file19145ea92ef5.sqlite' 
  using driver `SQLite'
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS:  NAD27
> fn1 <- tempfile(fileext=".sqlite")
> st_write(nc, fn1, driver="SQLite", dataset_options=c("METADATA=YES"))
options:        METADATA=YES 
Writing layer `file19142cfb61d9' to data source 
  `C:\Users\RB\AppData\Local\Temp\Rtmpkdo7D1\file19142cfb61d9.sqlite' using driver `SQLite'
Writing 100 features with 14 fields and geometry type Multi Polygon.
> st_read(fn1)
Reading layer `file19142cfb61d9' from data source 
  `C:\Users\RB\AppData\Local\Temp\Rtmpkdo7D1\file19142cfb61d9.sqlite' 
  using driver `SQLite'
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS:  NAD27
> fn2 <- tempfile(fileext=".sqlite")
> st_write(nc, fn2, driver="SQLite", dataset_options=c("SPATIALITE=YES"))
options:        SPATIALITE=YES 
Writing layer `file19143643784' to data source 
  `C:\Users\RB\AppData\Local\Temp\Rtmpkdo7D1\file19143643784.sqlite' using driver `SQLite'
Writing 100 features with 14 fields and geometry type Multi Polygon.
> st_read(fn2)
Reading layer `file19143643784' from data source 
  `C:\Users\RB\AppData\Local\Temp\Rtmpkdo7D1\file19143643784.sqlite' 
  using driver `SQLite'
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS:  NAD27
> 

Here are the three created files: Rtmpkdo7D1.zip Please consider checking whether your collaborators can open these correctly.

rsbivand commented 9 months ago

SQLite with METADATA is supported in the macOS CRAN sf binary, but not with SPATIALITE, as GDAL was not built against spatialite. macOS_sqlites.zip

bradleyswilson commented 9 months ago
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/POSIX/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DBI_1.2.0 sf_1.0-15

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5        cli_3.6.2          rlang_1.1.3        KernSmooth_2.23-22 generics_0.1.3     glue_1.7.0        
 [7] bit_4.0.5          e1071_1.7-14       hms_1.1.3          fansi_1.0.6        grid_4.3.2         classInt_0.4-10   
[13] tibble_3.2.1       fastmap_1.1.1      lifecycle_1.0.4    memoise_2.0.1      compiler_4.3.2     dplyr_1.1.4       
[19] RSQLite_2.3.5      blob_1.2.4         Rcpp_1.0.12        pkgconfig_2.0.3    rstudioapi_0.15.0  R6_2.5.1          
[25] class_7.3-22       tidyselect_1.2.0   utf8_1.2.4         pillar_1.9.0       magrittr_2.0.3     tools_4.3.2       
[31] proxy_0.4-27       bit64_4.0.5        units_0.8-5        cachem_1.0.8 

Here's my session info. I don't think it was clear --- I'm using the most recent version of spatialite downloaded through homebrew, which I downloaded after running into the same error Edzer describes above (# Error: no such function: InitSpatialMetadata. After pointing to the mod_spatialite path from the homebrew download though, I can get it to run spatialite commands just fine.

# new test database
con <- dbConnect(RSQLite::SQLite(), 
                 dbname = 'test.sqlite',
                 loadable.extensions=TRUE,
                 extended_types=TRUE)

# path to spatialite library
dbExecute(con, paste0("SELECT load_extension('",'/opt/homebrew/Cellar/libspatialite/5.1.0/lib/mod_spatialite',"');"))
dbExecute(con, "SELECT InitSpatialMetadata(1);")

When I run the example you've provided @rsbivand , I get a new error that doesn't show up when I try and write directly to an open DBI connection. This perhaps suggests that I need to rebuild GDAL with libspatialite support?

st_write(nc, fn2, driver="SQLite", dataset_options=c("SPATIALITE=YES"))
options:        SPATIALITE=YES 
Creating dataset /var/folders/k_/0xygfndx0_n142bd22svgbh40000gn/T//RtmpYo5OOz/file1110944a17852.sqlite failed.
Error: Creation failed.
In addition: Warning messages:
1: In CPL_write_ogr(obj, dsn, layer, driver, as.character(dataset_options),  :
  GDAL Message 6: driver SQLite does not support creation option SPATIALITE
2: In CPL_write_ogr(obj, dsn, layer, driver, as.character(dataset_options),  :
  GDAL Error 6: OGR was built without libspatialite support
... sorry, creating/writing any SpatiaLite DB is unsupported
rsbivand commented 9 months ago

As I said, the CRAN sf binary does not support spatialite because the GDAL static library that is built against is not built with spatialite, but does support the METADATA key for the SQLite driver. homebrew is not supported anyway, so go with either CRAN binaries or install from source with static libraries as CRAN. Why? Because the homebrew build train may use different compilers and/or compiler options from R, so causing bugs that are very hard to diagnose. Why is spatialite important?

bradleyswilson commented 9 months ago

I just reinstalled GDAL from source and rebuilt sf against the new libs and can get a version that works now. It still doesn't seem to work when I pass an open DBI connection instead of the file path to the SQLite file, but that's not a dealbreaker. I'll go ahead and close. SQlite/Spatialite seemed like an attractive option for my particular use case that requires database-like querying but not the overhead of Postgres/PostGIS. I may look into other options given the lack of support in the CRAN binary.