poissonconsulting / fwapgr

An R Client for BC Freshwater Atlas Web API
https://poissonconsulting.github.io/fwapgr

behaviour of offset prevents pagination! #49

Closed joethorley closed 3 years ago

joethorley commented 3 years ago
  collection_id <- "whse_basemapping.fwa_named_streams"

  collection <- fwa_collection(collection_id, offset = 997, limit = 2)
  collection2 <- fwa_collection(collection_id, offset = 998, limit = 1)
  expect_false(identical(collection2$id, collection$id[2])) ## currently passes, but the two ids should be identical

smnorris commented 3 years ago

Sort by the unique id to get offset working:

https://www.hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=2&offset=997

https://www.hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=1&offset=998
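
A minimal sketch of the same idea in base R, assuming only the `sortBy`, `limit` and `offset` query parameters shown in the URLs above. The helper name `page_url()` is hypothetical and is not part of fwapgr; it only builds request URLs and makes no network calls.

```r
# Hypothetical helper (not part of fwapgr): build an items.json URL that
# sorts by a unique column so that limit/offset pages are stable.
page_url <- function(collection_id, sort_by, limit, offset,
                     base_url = "https://www.hillcrestgeo.ca/fwapg") {
  sprintf(
    "%s/collections/%s/items.json?sortBy=%s&limit=%d&offset=%d",
    base_url, collection_id, sort_by, as.integer(limit), as.integer(offset)
  )
}

collection_id <- "whse_basemapping.fwa_named_streams"
sort_by <- "fwa_stream_networks_label_id"

# The two requests from the original report, now with an explicit sort column.
url_two <- page_url(collection_id, sort_by, limit = 2, offset = 997)
url_one <- page_url(collection_id, sort_by, limit = 1, offset = 998)
```

With a unique sort column, the single feature behind `url_one` should be the second feature behind `url_two`.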

joethorley commented 3 years ago

Is there any way to automatically get the name of the unique id so that the user doesn't have to worry about it?

joethorley commented 3 years ago

It seems to always be the first column of the collection.

joethorley commented 3 years ago

When I do this for named streams it works up to offset = 10,000, but offset = 10,001 returns the wrong feature:

library(fwapgr)

collection_id <- "whse_basemapping.fwa_named_streams"

fwa_query_collection(collection_id, offset =  9999, limit = 3)
#> Simple feature collection with 3 features and 6 fields
#> Geometry type: MULTILINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: -126.6767 ymin: 50.04093 xmax: -120.7701 ymax: 54.79815
#> Geodetic CRS:  WGS 84
#> # A data frame: 3 × 7
#>   id    blue_line_key fwa_stream_networ… gnis_name stream_order watershed_group…
#>   <chr>         <int>              <int> <chr>            <int> <chr>           
#> 1 10000     360882493              10000 Stimson …            2 BABL            
#> 2 10001     356295575              10001 Stinson …            3 MAHD            
#> 3 10002     356155114              10002 Stirling…            2 LNIC            
#> # … with 1 more variable: geometry <MULTILINESTRING [°]>
fwa_query_collection(collection_id, offset = 10000, limit = 2)
#> Simple feature collection with 2 features and 6 fields
#> Geometry type: MULTILINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: -120.8317 ymin: 50.04093 xmax: -120.7701 ymax: 52.09135
#> Geodetic CRS:  WGS 84
#> # A data frame: 2 × 7
#>   id    blue_line_key fwa_stream_networ… gnis_name stream_order watershed_group…
#>   <chr>         <int>              <int> <chr>            <int> <chr>           
#> 1 10001     356295575              10001 Stinson …            3 MAHD            
#> 2 10002     356155114              10002 Stirling…            2 LNIC            
#> # … with 1 more variable: geometry <MULTILINESTRING [°]>
fwa_query_collection(collection_id, offset = 10001, limit = 1)
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: MULTILINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: -120.7949 ymin: 52.0704 xmax: -120.7868 ymax: 52.09135
#> Geodetic CRS:  WGS 84
#> # A data frame: 1 × 7
#>   id    blue_line_key fwa_stream_networ… gnis_name stream_order watershed_group…
#>   <chr>         <int>              <int> <chr>            <int> <chr>           
#> 1 10001     356295575              10001 Stinson …            3 MAHD            
#> # … with 1 more variable: geometry <MULTILINESTRING [°]>

Created on 2021-09-10 by the reprex package (v2.0.1)
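
The off-by-one in the output above can be checked mechanically. The sketch below uses only the ids printed in the reprex (no network calls), and the helper name `offset_consistent()` is hypothetical. It tests whether the first row of a page requested at a later offset matches the row at the same absolute position in a larger page requested at an earlier offset.

```r
# Hypothetical helper: does the first id of `page_b` (requested at `offset_b`)
# match the row at the same absolute position within `page_a`
# (requested at `offset_a`)?
offset_consistent <- function(page_a, offset_a, page_b, offset_b) {
  pos <- offset_b - offset_a + 1  # 1-based position of page_b's first row in page_a
  stopifnot(pos >= 1, pos <= length(page_a), length(page_b) >= 1)
  identical(page_a[pos], page_b[1])
}

# ids copied from the reprex output above
ids_9999  <- c("10000", "10001", "10002")  # offset = 9999,  limit = 3
ids_10000 <- c("10001", "10002")           # offset = 10000, limit = 2
ids_10001 <- c("10001")                    # offset = 10001, limit = 1

ok_10000 <- offset_consistent(ids_9999, 9999, ids_10000, 10000)  # TRUE
ok_10001 <- offset_consistent(ids_9999, 9999, ids_10001, 10001)  # FALSE: "10002" was expected
```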

joethorley commented 3 years ago

This is the URL for the last erroneous example (it should return id 10002):

https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?limit=1&offset=10001&sortBy=fwa_stream_networks_label_id

Changing the position of sortBy in the query string doesn't seem to make any difference:

https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=1&offset=10001

smnorris commented 3 years ago

Yes, the primary key should be the first column, generally the table name (or something very similar) plus _id. For source FWA columns, refer to the DataBC documentation for column descriptions; for the generated/value-added tables like named streams, I will add column descriptions. It is handy that pg_fs displays them by default on the collection page, but I am thinking I'll add the table definitions to the fwapg documentation.

Unfortunately, I don't think there is a way to automatically flag the pk in a request response. For WFS requests in bcdata I tried guessing, but I generally have to remember to look it up before making big requests: https://github.com/smnorris/bcdata/issues/57

That error is odd though... I'll start the server in debug mode and see what is going on.

smnorris commented 3 years ago

This is the issue: https://github.com/CrunchyData/pg_featureserv/issues/79. I can compile and install the latest code with the fix.

joethorley commented 2 years ago

I can confirm that it is fixed.