r-dbi / RMariaDB

An R interface to MariaDB
https://rmariadb.r-dbi.org
Other
130 stars 39 forks source link

cannot write sf dataframe geometry column to MariaDB table #297

Open frabau1 opened 1 year ago

frabau1 commented 1 year ago

I am trying to use MariaDB to store GIS data queried from Open Street Map using R, however the MariaDB driver is failing to write the geometry column from R to MariaDB. I get this error: "Error: Cannot get geometry object from data you send to the GEOMETRY field [1416]".

Geodatabase support in MariaDB .

I have tried other sf data frames obtained from other sources (imported shapefiles, files from the tigris package, etc.) and get the same error.

library(osmdata)
library(sf)
library(dplyr)
library(RMariaDB)

#OSM overpass query
q1 <- opq(bbox = c(-78.895569,38.236023,-75.050354,39.584524)) %>% 
  add_osm_feature(key = 'highway', value = c('path','footway')) %>%
  add_osm_feature(key = 'surface', value = c('dirt','unpaved','ground','rock', 'compacted')) %>%
  add_osm_feature(key = 'bicycle', value = c('designated','yes','permissive')) 

#fetch sf results
q1 <- q1 %>% 
  osmdata_sf()

#pull out lines
q1_lines <- q1$osm_lines

#pull out polygons - closed loop trails erroniously show as polygons
q1_poly <- q1$osm_polygons

#convert polygons to linestrings
q1_converted <- st_cast(q1_poly,"LINESTRING")

#recombine
q1 <- bind_rows(q1_lines, q1_converted)

#drop redundant column 
q1 <- q1 %>% 
  select(-fixme)

con <- dbConnect(
  drv = RMariaDB::MariaDB(), 
  dbname='dbname',
  username = 'user',
  password = 'pass', 
  host = 'ipaddress', 
  port = portnum
)

q2 <- st_as_sf(q1)

dbWriteTable(con, name="mtb_trails", q1_lines)
krlmlr commented 1 year ago

Thanks, confirming.

library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(RMariaDB)

# OSM overpass query
q1 <- opq(bbox = c(-78.895569, 38.236023, -75.050354, 39.584524)) %>%
  add_osm_feature(key = "highway", value = c("path", "footway")) %>%
  add_osm_feature(key = "surface", value = c("dirt", "unpaved", "ground", "rock", "compacted")) %>%
  add_osm_feature(key = "bicycle", value = c("designated", "yes", "permissive"))

# fetch sf results
q1 <- q1 %>%
  osmdata_sf()

# pull out lines
q1_lines <- q1$osm_lines

# pull out polygons - closed loop trails erroniously show as polygons
q1_poly <- q1$osm_polygons

# convert polygons to linestrings
q1_converted <- st_cast(q1_poly, "LINESTRING")
#> Warning in st_cast.sf(q1_poly, "LINESTRING"): repeating attributes for all
#> sub-geometries for which they may not be constant

# recombine
q1 <- bind_rows(q1_lines, q1_converted)

# drop redundant column
q1 <- q1 %>%
  select(-fixme, -FIXME)

con <- dbConnect(drv = RMariaDB::MariaDB(), dbname = "test")

q2 <- st_as_sf(q1)

options(error = rlang::entrace)
dbWriteTable(con, name = "mtb_trails", q2, overwrite = TRUE)
#> Note: method with signature 'DBIObject#sf' chosen for function 'dbDataType',
#>  target signature 'MariaDBConnection#sf'.
#>  "MariaDBConnection#ANY" would also be valid
#> Error: Cannot get geometry object from data you send to the GEOMETRY field [1416]

Created on 2023-05-19 with reprex v2.0.2

Looks like writing tables with geometries is a feature of the sf package. @edzer: who could help here?

krlmlr commented 7 months ago

@dpprdan: Would you know about the current status of sf vs. MariaDB?

dpprdan commented 7 months ago

@krlmlr I don't know anything about it right now, but I can take a look. Well, I will, because now I'm intrigued. It may take a couple of days until that happens, though.

dpprdan commented 7 months ago

{sf} does not have MariaDB-specific dbWriteTable() methods, yet.

library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.8.2, PROJ 9.3.1; sf_use_s2() is TRUE
methods(dbWriteTable)
#> [1] dbWriteTable,DBIConnection,Id,ANY-method             
#> [2] dbWriteTable,DBIObject,character,sf-method           
#> [3] dbWriteTable,PostgreSQLConnection,character,sf-method
#> see '?methods' for accessing help and source code

{sf}’s DBI-related functions reside in https://github.com/r-spatial/sf/blob/main/R/db.R. Only {RPostgres} and {RPostgreSQL} are supported at the moment. MariaDB support would have to be implemented there as well. (relatedly: https://github.com/ecohealthalliance/doltr/issues/60).

However, it is possible to use the GDAL-MySQL-driver to write to MariaDB from {sf}, at least in principle.

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

st_write(
  obj = nc,
  dsn = "MySQL:test,user=root,password=mysecretpassword",
  layer = "nc"
)
#> Writing layer `nc' to data source 
#>   `MySQL:test,user=root,password=mysecretpassword' using driver `MySQL'
#> Writing 100 features with 14 fields and geometry type Multi Polygon.

“In principle”, because a round-trip to and from the DB does not return the same object. The CRS is lost, for example.

nc_maria <-
  st_read(
    dsn = "MySQL:test,user=root,password=mysecretpassword",
    layer = "nc"
  )
#> Reading layer `nc' from data source 
#>   `MySQL:test,user=root,password=mysecretpassword' using driver `MySQL'
#> Simple feature collection with 100 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> CRS:           NA

all.equal(nc, nc_maria)
#> [1] "Names: 15 string mismatches"                                                                                           
#> [2] "Attributes: < Component \"agr\": Names: 14 string mismatches >"                                                        
#> [3] "Attributes: < Component \"sf_column\": 1 string mismatch >"                                                            
#> [4] "Component 15: Attributes: < Component \"crs\": Component \"input\": 'is.NA' value mismatch: 1 in current 0 in target >"
#> [5] "Component 15: Attributes: < Component \"crs\": Component \"wkt\": 'is.NA' value mismatch: 1 in current 0 in target >"

Looking at the DB table with QGIS, the CRS seems to get lost while reading it back to {sf}. Unless I am doing something wrong here?

Either way, this isn’t a RMariaDB issue, IMHO.