Closed sebdalgarno closed 2 years ago
test_that("fwa_collection offset errors at 100,000", {
collectionid <- "whse_basemapping.fwa_stream_networks_sp"
expect_silent(fwa_collection(collectionid, offset = 99999, limit = 1))
expect_error(fwa_collection(collectionid, offset = 100000, limit = 1),
"Invalid value for parameter offset")
})
test_that("fwa_collection offset works at 100,001 !!!!", {
collectionid <- "whse_basemapping.fwa_stream_networks_sp"
expect_silent(fwa_collection(collectionid, offset = 100001, limit = 1))
})
@smnorris - what do you think of this behaviour
For now I've added a check on offset < 100000 because I'm not sure that the result returned with offset 100001 is valid!
Behaviour is weird! I'll see if I can debug.
Both of these seem to work? What are the exact requests that bail?
Well, they provide a response anyway. The requests aren't very useful because they have to be ordered by something for the offset to work reliably.
Ordering by linear_feature_id
should fix this... but the server returns a different feature each time below request is made, the orderBy
doesn't seem to be working:
This is the log output from pg_featureserv
INFO[0270] 127.0.0.1:36858 GET /collections/whse_basemapping.fwa_stream_networks_sp/items.json?orderBy=linear_feature_id&limit=1&offset=100000
DEBU[0270] Features query: SELECT ST_AsGeoJSON( ST_Transform( "geom", 4326) ) AS _geojson , "linear_feature_id","watershed_group_id","edge_type","blue_line_key","watershed_key","fwa_watershed_code"::text,"local_watershed_code"::text,"watershed_group_code"::text,"downstream_route_measure","length_metre","feature_source"::text,"gnis_id","gnis_name"::text,"left_right_tributary"::text,"stream_order","stream_magnitude","waterbody_key","blue_line_key_50k","watershed_code_50k"::text,"watershed_key_50k","watershed_group_code_50k"::text,"gradient","feature_code"::text,"wscode_ltree"::text,"localcode_ltree"::text,"upstream_route_measure","upstream_area_ha" FROM "whse_basemapping"."fwa_stream_networks_sp" LIMIT 1 OFFSET 10000;
DEBU[0270] Database query result: 1 rows in 383.705893ms
DEBU[0270] ---- Request complete in 385.779509ms
The documentation I was referring to (https://access.crunchydata.com/documentation/pg_featureserv/latest/usage/query_data/) refers to the wrong parameter name, it should be sortBy
, not orderBy
- this is getting closer:
The issue seems to be that the max offset value input to the query from the server is 10k. This is the log output from this request:
INFO[1020] 127.0.0.1:36904 GET /collections/whse_basemapping.fwa_stream_networks_sp/items.json?sortBy=linear_feature_id&limit=1&offset=11000
DEBU[1020] Features query: SELECT ST_AsGeoJSON( ST_Transform( "geom", 4326) ) AS _geojson , "linear_feature_id","watershed_group_id","edge_type","blue_line_key","watershed_key","fwa_watershed_code"::text,"local_watershed_code"::text,"watershed_group_code"::text,"downstream_route_measure","length_metre","feature_source"::text,"gnis_id","gnis_name"::text,"left_right_tributary"::text,"stream_order","stream_magnitude","waterbody_key","blue_line_key_50k","watershed_code_50k"::text,"watershed_key_50k","watershed_group_code_50k"::text,"gradient","feature_code"::text,"wscode_ltree"::text,"localcode_ltree"::text,"upstream_route_measure","upstream_area_ha" FROM "whse_basemapping"."fwa_stream_networks_sp" ORDER BY "linear_feature_id" LIMIT 1 OFFSET 10000;
DEBU[1020] Database query result: 1 rows in 427.282033ms
DEBU[1020] ---- Request complete in 431.749204ms
I don't think this is a config issue, these config paging params shouldn't affect the offset?
[Paging]
# The default number of features in a response
LimitDefault = 10000
# Maxium number of features in a response
LimitMax = 10000
These are the two queries: the first with an offset of 100001 returns an empty data set but the second with a smaller offset of exactly 100000 throws an error about the offset being too big.
I think the first query should also throw an error.
> fwa_query_collection(collection_id, offset = 100001)
Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams]
Date: 2021-09-17 23:51
Status: 200
Content-Type: application/json
Size: 1.18 kB
Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=10000&offset=100001]
Date: 2021-09-17 23:51
Status: 200
Content-Type: application/geo+json
Size: 452 B
<BINARY BODY>
Simple feature collection with 0 features and 0 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Geodetic CRS: WGS 84
# A data frame: 0 × 1
# … with 1 variable: geometry <GEOMETRY [°]>
> fwa_query_collection(collection_id, offset = 100000)
Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams]
Date: 2021-09-17 23:51
Status: 200
Content-Type: application/json
Size: 1.18 kB
Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=10000&offset=1e%2B05]
Date: 2021-09-17 23:51
Status: 400
Content-Type: text/plain; charset=utf-8
Size: 42 B
Invalid value for parameter offset: 1e+05
Error: API request failed [400]: Invalid value for parameter offset: 1e+05
.
Not totally sure what the issue is here - both these requests return an empty response:
Perhaps error noted above is an encoding issue in the request?
Error: API request failed [400]: Invalid value for parameter offset: 1e+05
Empty responses for these requests are expected, there are only 11,767 records in that table but pg_fs doesn't know that.
To make paging work better I can add a record count meta table that the R client can query (https://github.com/smnorris/fwapg/issues/56), but hopefully it won't be necessary in future versions of pg_fs
Hold off on the record count meta table for now - I'll let you know if I think we need it.
this was an issue on our end caused by conversion of numeric to character for the url string:
d> as.character(100000)
[1] "1e+05"
d> as.character(100001)
[1] "100001"
fixed now in pgfsr with format(x, scientific = FALSE)
Hilarious!
Yes this is from the API - I confirm its still happening with a test