UF5: The documentation for object conversion to a bounding box when using the spatial predicates could be improved.
The message the user sees:
The object is too large to perform exact spatial operations using bcdata.
To simplify the polygon, a bounding box was drawn around the polygon and all
features within the box will be returned. Options include further processing
with on the returned object or simplify the object.
does not say what size threshold was used, and how much it was exceeded by. If I would prefer to take the simplification option, assuming it is a genuine option, it would be nice to know what target I am aiming for and how close I am to it.
The first place I would look to demystify this would be the help file for the spatial predicates (cql_geom_predicates), but I think the help is underdone with respect to this functionality. It should probably mention the internal option bcdata.max_geom_pred_size and how the size is calculated. Alternately it could refer users to the efficiently-query-spatial... vignette for this information.
My recommendation would be to change this so that the size is calculated just on the data that will be translated to text via your sf::st_as_text call. This will make the code a bit more cumbersome, but s3 methods for size calculation might be used to mitigate that. With this change the sizing of objects is much more predictable, making it a simpler task to bring data below the threshold.
Users could then be told it's a case of utils::object.size(st_geometry(obj)) <= getOption("bcdata.max_geom_pred_size") or a bounding box will be used.
If you wanted to be really kind you could also expose a function that helps the user determine if a spatial object will be converted to a bounding box in their predicates.
If you don't take my recommendation, then I think the size calculation is surprising enough that it needs to be well documented.
A summary of UF5 suggestions:
Improve warning message with object size and size threshold.
Improve help file with details of how size is calculated and how to change the threshold
or refer users to vignette for this info
Restrict size calculation to spatial data only, or clearly explain all data is included in sizing objects with respect to threshold.
UF5: The documentation for object conversion to a bounding box when using the spatial predicates could be improved.
The message the user sees:
does not say what size threshold was used, and how much it was exceeded by. If I would prefer to take the simplification option, assuming it is a genuine option, it would be nice to know what target I am aiming for and how close I am to it.
The first place I would look to demystify this would be the help file for the spatial predicates (
cql_geom_predicates
), but I think the help is underdone with respect to this functionality. It should probably mention the internal optionbcdata.max_geom_pred_size
and how the size is calculated. Alternately it could refer users to theefficiently-query-spatial...
vignette for this information.On that point, I noticed in the source that the size is calculated on the entire spatial object, rather than just the geometry as I would have expected. See: https://github.com/bcgov/bcdata/blob/master/R/cql-geom-predicates.R#L114.
My recommendation would be to change this so that the size is calculated just on the data that will be translated to text via your
sf::st_as_text
call. This will make the code a bit more cumbersome, but s3 methods for size calculation might be used to mitigate that. With this change the sizing of objects is much more predictable, making it a simpler task to bring data below the threshold.Users could then be told it's a case of
utils::object.size(st_geometry(obj)) <= getOption("bcdata.max_geom_pred_size")
or a bounding box will be used.If you wanted to be really kind you could also expose a function that helps the user determine if a spatial object will be converted to a bounding box in their predicates.
If you don't take my recommendation, then I think the size calculation is surprising enough that it needs to be well documented.
A summary of UF5 suggestions:
Originally posted by @MilesMcBain in https://github.com/bcgov/bcdata/issues/239#issuecomment-762230962