r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/

Support st_read GDAL/Ogr2ogr options #2382

Open latot opened 4 months ago

latot commented 4 months ago

Hi, I know st_read has an options argument, but after testing it, it does not seem to pass these on as GDAL read options, or something is not working.

I'm using the query param, which usually works, but some queries need GDAL's -dialect sqlite to work. I have tried options with c("-dialect", "sqlite"), c("dialect", "sqlite"), c("-dialect"="sqlite"), c("dialect"="sqlite") and "-dialect sqlite", but none of them seems to work.

Here is a sample:

file_path <- "test.shp"

a <- sf::st_sf(
  id = c(1, 3, 3),
  geometry = sf::st_sfc(
    sf::st_point(),
    sf::st_point(),
    sf::st_point()
  )
)

sf::st_write(a, file_path, layer = "layer")

sf::st_read(
    file_path,
    query = 'SELECT 1 FROM (SELECT COUNT(*) count FROM "layer" GROUP BY "id") x WHERE count > 1',
    options = c("-dialect", "sqlite")
)

This query fails without the sqlite dialect.

Thx!

rsbivand commented 4 months ago

Minimal reproducible example? Details of how the input file was created?

latot commented 4 months ago

@rsbivand I have updated the post with a full reprex; running this query in GDAL with -dialect sqlite works. The data must be saved to a shp file.

rsbivand commented 4 months ago

From https://gdal.org/user/ogr_sql_dialect.html, -dialect is an option of ogr2ogr. Could you try gdal_utils using "vectortranslate" to make the selection, then read the output file?

latot commented 4 months ago

Well, that is an option, but not ideal. The default query dialect in GDAL is very limited; for complex queries we need to be able to specify the dialect.

Also, if we want to avoid disk IO, we would need to write in memory, and each extra step makes it harder to use SQL. For a complex program, writing in memory increases the complexity even more.

edzer commented 4 months ago

The shapefile open options are found here.

latot commented 4 months ago

@edzer this issue is not about driver-specific options; it is about GDAL reading options, like the ogr2ogr ones.

Queries can run under different dialects, and we currently can't pass options to GDAL; doing so would allow us to use -dialect sqlite to run complex queries.

edzer commented 4 months ago

The st_read docs say: options: driver dependent dataset open options. I pointed you to these options; they do not include -dialect.
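To illustrate what the options argument of st_read does accept, here is a minimal sketch passing a driver open option as a "NAME=VALUE" string. ENCODING is an open option of the ESRI Shapefile driver; the point coordinates, the temporary paths, and the object names are assumptions made for this example only.

```r
library(sf)

# Toy data mirroring the reprex; the coordinates are made up for illustration.
pts <- st_sf(
  id = c(1, 3, 3),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)))
)

# Write to a fresh temporary directory so no existing file gets in the way.
shp_dir <- tempfile("shp_")
dir.create(shp_dir)
shp_path <- file.path(shp_dir, "test.shp")
st_write(pts, shp_path, quiet = TRUE)

# `options` carries driver open options as "NAME=VALUE" strings;
# ogr2ogr flags such as -dialect are not open options.
pts_read <- st_read(shp_path, options = "ENCODING=UTF-8", quiet = TRUE)
```

Open options are specific to each driver, so a name that one driver understands may mean nothing to another.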

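rsbivand commented 4 months ago

library(sf)
# Write the reprex object `a` to a shapefile; the layer is named after the file
file_path <- file.path(tempdir(), "layer.shp")
st_write(a, file_path)
st_layers(file_path)
st_read(file_path)
# Convert to SQLite, whose driver runs the query with its native SQL
sql_path <- file.path(tempdir(), "layer.sqlite")
gdal_utils("vectortranslate", file_path, sql_path, options = c("-of", "SQLITE"))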
st_layers(sql_path)
aa <- sf::st_read(sql_path, query = 'SELECT 1 FROM (SELECT COUNT(*) count FROM "layer" GROUP BY "id") x WHERE count > 1')
str(aa)

Your use of the layer name is fragile: initially you had test.shp, which has only one layer, called test:

> file_path <- file.path(tempdir(), "test.shp")
> sf::st_write(a, file_path)
writing: substituting ENGCRS["Undefined Cartesian SRS with unknown unit"] for missing CRS
Writing layer `test' to data source 
  `/tmp/RtmpBvumK2/test.shp' using driver `ESRI Shapefile'
Writing 3 features with 1 fields and geometry type Point.
> st_layers(file_path)
Driver: ESRI Shapefile 
Available layers:
  layer_name geometry_type features fields
1       test         Point        3      1
                                   crs_name
1 Undefined Cartesian SRS with unknown unit

Whether this is what you want is unknown. Do use gdal_utils("vectortranslate", ...); that is what it is there for, and it is the only place you can implement the choice of dialect. It is not a read option.
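A rough sketch of that suggestion: gdal_utils("vectortranslate", ...) takes the ogr2ogr flags -dialect and -sql, writes the selection to an intermediate file, and that file is then read back. The point coordinates, the CSV output format, the column alias n, and the object names are assumptions made for illustration, and the sqlite dialect requires a GDAL build with SQLite support.

```r
library(sf)

# Toy data mirroring the reprex; the coordinates are made up for illustration.
a <- st_sf(
  id = c(1, 3, 3),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)))
)

# A shapefile holds a single layer named after the file, here "test".
shp_dir <- tempfile("shp_")
dir.create(shp_dir)
shp_path <- file.path(shp_dir, "test.shp")
st_write(a, shp_path, quiet = TRUE)

# Find ids that occur more than once; the sqlite dialect handles the
# GROUP BY inside the subquery.
sql <- 'SELECT id, n FROM (SELECT id, COUNT(*) AS n FROM "test" GROUP BY id) x WHERE n > 1'

# Run the query during translation and write the attribute-only result as CSV.
csv_path <- tempfile(fileext = ".csv")
gdal_utils(
  "vectortranslate", shp_path, csv_path,
  options = c("-f", "CSV", "-dialect", "sqlite", "-sql", sql)
)

dup <- read.csv(csv_path)
dup
```

The cost of this route is the intermediate file, which is the overhead discussed earlier in the thread.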

latot commented 4 months ago

Hi, this is just a minimal reproducible example; it is not designed to be useful in every situation.

This issue is a request to support GDAL options in st_read. Since the function already has query as a parameter, I think it would be best for it to also support dialect; at the same time, I think implementing GDAL options in general could be more useful than just dialect.

edzer commented 4 months ago

... then please correct the issue title.

latot commented 4 months ago

Done! thx :D I was not sure whether it was a bug or not; I didn't know if options already supported the ogr2ogr options or only the driver options.

Title updated.

agila5 commented 4 months ago

Might be related to https://github.com/r-spatial/sf/pull/1646 but, unfortunately, I don't know how to fix the issue that prevented merging that PR...

latot commented 4 months ago

Well, I have a different opinion on the comment at https://github.com/r-spatial/sf/pull/1646#issuecomment-830583550.

GDAL has a lot of dependencies, and clearly you can't use a feature without them. For example, following that issue, -dialect sqlite needs spatialite, but why not include it? without spatialite we can't read gpkg files, so, following the same logic, sf should remove support for GPKG, because it is spatialite dependent.

Let me also add that all the logic about excluding/including features based on dependencies could also be applied to versions.

GDAL is big... I think the most usable approach is for sf to support everything and leave it to the user to control which dependencies and versions they want (yes, the same logic about dependencies could be applied to versions of GDAL and its dependencies).

edzer commented 4 months ago

without spatialite we can't read gpkg files

Without spatialite GDAL can read gpkg files; the only dependency is sqlite3.

The ogr sqlite dialect comes "preferably with spatialite support", as it needs that for the spatial operators; those, however, are the main reason to want it, I guess.

Using gpkg has become very common these days; GDAL installations with spatialite can be built but do not seem very common.