Closed ateucher closed 1 year ago
I think we need to make a somewhat major change in how we parse user input into filter()
to make this work smoothly. Right now we have some ugly code that tries to detect parts of a filter
call that should be evaluated locally (e.g., a call to an sf
function) and then modifying the call by wrapping it in local()
. partial_eval()
then processes that call and forces that evaluation of the local()
bits before turning the call into a CQL
string. But this is fragile and can't reliably detect all cases. The recent change to partial_eval
has made it stricter and it now throws errors on unevaluated calls that should have been evaluated and vice-versa.
The existing approach in bcdata
actually goes against the way dbplyr
was designed to be used - basically we should be asking the user to wrap parts of calls that should be evaluated locally in local()
, rather than trying to guess and do it for them.
I propose we get rid of the stuff in cql_translate()
that tries to figure out what should be locally evaluated, and ask the user to do it specifically. This is likely to break some people's code, so we will need to think about how to communicate the change and help people adjust.
Essentially, instead of something like:
bcdc_query_geodata("a-record") |>
filter(OBJ_AREA_SQM > st_area(a_local_sf_object)) # OBJ_AREA_SQM is a field in the wfs object
would need to be something like:
bcdc_query_geodata("a-record") |>
filter(OBJ_AREA_SQM > local(st_area(a_local_sf_object))) # OBJ_AREA_SQM is a field in the wfs object
Thoughts @boshek @stephhazlitt ?
@boshek @stephhazlitt I've made the change. Everything passes nicely locally with the dev version of dbplyr. Haven't started on the documentation yet. The updates to the test files illustrate the user-facing changes.
This looks great @ateucher 🙏. I think it is a fair trade off re: requiring users to use local()
in return for cleaner, more predictable+robust code. Pretty sure we have some use cases in the vignettes that will need updating with this change. We could think about some active advertising of the change, e.g. a tweet thread? Maybe even a new, short pkgdown article?
Thanks @stephhazlitt. I think that the vignettes are actually ok... I did a manual check and precompiled them with the dev version of bcdata and they worked. I think both your communication ideas are good - a tweet thread, and either a new vignette or just add a section to the efficiently_query_geospatial_data
vignette.
Ok, I've updated one of the vignettes with a simple example... let me know what you think. Unrelated, but I finally figured out how to clean up those spurious dbQuoteIdentifier
and dbQuoteString
messages we've been getting (https://github.com/bcgov/bcdata/pull/305/commits/92146dbcc75028602a59ab356ba43bb67b262adf)
/precompile
Good catch - and it's awful:
library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(bcdata)
#>
#> Attaching package: 'bcdata'
#> The following object is masked from 'package:stats':
#>
#> filter
the_geom <- st_sfc(st_point(c(1164434, 368738)),
st_point(c(1203023, 412959)),
crs = 3005)
bcdc_query_geodata("local-and-regional-greenspaces") %>%
filter(DWITHIN(st_buffer(the_geom, 10000, nQuadSegs = 2), 100, "meters"))
#> Authorizing with your stored API key
#> Warning: It is advised to use the permanent id ('6a2fea1b-0cc4-4fc2-8017-eaf755d516da') rather than the name of the record ('local-and-regional-greenspaces') to guard against future name changes.
#> Warning: Named arguments ignored for SQL st_buffer
#> Error in UseMethod("escape"): no applicable method for 'escape' applied to an object of class "c('sfc_POINT', 'sfc')"
Created on 2022-10-25 with reprex v2.0.2
It's really deep down in the stack and trying to preempt it might be really tricky. I think it might be best to just use a try
and create a custom error if the error looks like "no applicable method for 'escape' applied to..."
Yeah that's a great thought. I think that's a good idea and we can see how that works.
Thanks @stephhazlitt and @boshek!
Partially working but I still think it's more complex than need be and the built-in DBI/dbplyr function mapping is not working quite right...
304