bcgov / bcdata

An R package for searching & retrieving data from the B.C. Data Catalogue
https://bcgov.github.io/bcdata
Apache License 2.0

Error: There was an issue sending this WFS request #330

Closed · gcperk closed 1 year ago

gcperk commented 1 year ago

Hello R nerds,

I built a package to estimate moose habitat, which extracts a series of spatial files through bcdata (VRI, water, fire, harvest, etc.).
The script runs through a series of smaller polygons, extracts and clips the data (within bcdata), then writes the outputs to a separate folder (get_basedata.r). The vignette and sample data are here.

I have started getting WFS errors when running the get_basedata.R function: image

I suspect these are some kind of timeout error or a setup issue on my machine, but they seem to occur cumulatively, i.e. they first appear with VRI and large files, then for any file I wish to download, to the point where I can't download any files no matter how small.

This may sound strange, but when I try the next day the downloads work again, and then the errors seem to build up until the same error is thrown on every download. Would there be a max limit on downloads, or on data extraction through the BC Data Catalogue?

Also, the rest of the bcdata package seems to work fine, i.e.: image

Thanks in advance - this one has me stumped!

boshek commented 1 year ago

I would not be surprised at all if your IP is restricted to some threshold number of calls.

Is it possible to distill this down to a bcdata-specific reprex? Something that can reproduce the issue reliably?

gcperk commented 1 year ago

Thanks @boshek, sorry, here is something more useful.

I am getting the error when I run this basic code:

districts <- bcdc_query_geodata("78ec5279-4534-49a1-97e8-9d315936f08b") %>%
  filter(SCHOOL_DISTRICT_NAME %in% c("Greater Victoria", "Prince George","Kamloops/Thompson")) %>%
  select(SCHOOL_DISTRICT_NAME) %>%
  collect()

image

I don't think I am being too crazy with the downloads (approx. 10 layers downloaded x 20 different AOIs), but could this be an internal BC Data Catalogue limit for the WFS service, i.e. outside the remit of bcdata? Thanks

UPDATE: I was working on this data set again today and the same issue started arising after about one hour of testing and downloading small chunks of data. Again, this was just normal code development, not big downloads. Do you think the BC Data Catalogue places restrictions on download limits / number of accesses? I estimate fewer than 50 data calls in total, with only small testing amounts. Just an idea?

stephhazlitt commented 1 year ago

I cannot reproduce the issue @gcperk:

library(bcdata)
#> 
#> Attaching package: 'bcdata'
#> The following object is masked from 'package:stats':
#> 
#>     filter

districts <-
  bcdc_query_geodata("78ec5279-4534-49a1-97e8-9d315936f08b") |>
  filter(
    SCHOOL_DISTRICT_NAME %in% c("Victoria", "Prince George", "Kamloops-Thompson")
  ) |> 
  select(SCHOOL_DISTRICT_NAME) |> 
  collect()

districts 
#> Simple feature collection with 3 features and 5 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 1126789 ymin: 362242.2 xmax: 1528155 ymax: 1224202
#> Projected CRS: NAD83 / BC Albers
#> # A tibble: 3 × 6
#>   id         ADMIN_AREA_SID SCHOOL_DISTRICT_NAME SCHOOL_DISTRICT_NUMBER OBJECTID
#> * <chr>               <int> <chr>                                 <int>    <int>
#> 1 WHSE_TANT…            328 Prince George                            57      562
#> 2 WHSE_TANT…            333 Victoria                                 61      567
#> 3 WHSE_TANT…            342 Kamloops-Thompson                        73      576
#> # ℹ 1 more variable: geometry <POLYGON [m]>

Created on 2023-09-26 with reprex v2.0.2

I cannot find anything related to rate limits in the BC Data Catalogue API documentation, or on the WMS service page. I think the next step might be to open a ticket with the Data System & Services team that owns these products/services and ask about API use limits? https://dpdd.atlassian.net/servicedesk/customer/portal/1

gcperk commented 1 year ago

Thanks @stephhazlitt, I continue to get the same error after around 1 hour of working with the data. Interestingly, this was not a problem when I was developing code a few weeks ago. This makes me think there was some change in WFS access during this period (or it could just be me!).

I will follow up with the Data System & Services team and report back any findings on this issue. Thanks for looking into this!

stephhazlitt commented 1 year ago

@gcperk just found this with some Google-fu; it contains details of rate limits with WFS: https://bcgov.github.io/bcwebmaps-options/assets/images/DataBCWebMapServices2020.pdf (slide 9).

gcperk commented 1 year ago

Thanks @stephhazlitt, I just lodged a request anyway as it seems like a new change. But that is helpful to know!

ateucher commented 1 year ago

I think those rate limits apply to each request, rather than being a daily limit, and bcdata does ensure that requests stay within that limit. That's not to say, though, that there isn't a daily limit; there may well be a new one imposed.

HTTP error code 503 means "service unavailable"... which in this case is a bit ambiguous.

rplewes commented 1 year ago

@gcperk I think I'm encountering a similar issue. I am getting the "Error: There was an issue sending this WFS request" for all bcdata queries that require paginated requests. Requests that do not require pagination work. I have put a reproducible example below:

big_wat <- bcdc_query_geodata(record = "97d8ef37-b8d2-4c3b-b772-6b25c1db13d0") %>% 
  filter(WATERSHED_GROUP_CODE == "PCEA" & GNIS_NAME_1 == "Carbon Creek") %>% 
  collect()  

small_wat <- big_wat %>% slice(1)  # only select 1 watershed

A non-paginated query works:

streams_small <- bcdc_query_geodata(record = "92344413-8035-4c08-b996-65a9b3f62fca") %>% 
  filter(INTERSECTS(small_wat)) %>% 
  collect() 

This paginated query does not work:

streams_big <- bcdc_query_geodata(record = "92344413-8035-4c08-b996-65a9b3f62fca") %>% 
  filter(INTERSECTS(big_wat)) %>% 
  collect()
ateucher commented 1 year ago

Cross-linking here - the default page size has been reduced to 500 records (from 10000), meaning that many queries will be sending A LOT of individual paged requests. I think this could well be related.

bcdata:::bcdc_single_download_limit()
#> [1] 500

Created on 2023-09-26 with reprex v2.0.2

smnorris commented 1 year ago

Thanks, I just got the 500-record limit heads-up from @mpaivinen-bcgov as well. I'm making the fix in the Python bcdata package, but finding that the service is really sluggish and may time out on basic requests.

ateucher commented 1 year ago

Thanks for the reprex @rplewes - that illustrates it nicely.

So we have two options, bcdata.chunk_limit and bcdata.single_download_limit, which I think is unnecessary, and they conflict. @boshek and @stephhazlitt, I am going to try to reconcile and simplify them... unless you have a memory that I'm missing and know why we did this?
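
In the meantime, here is a sketch of how to inspect the two overlapping knobs (option names are taken from this thread, and bcdc_single_download_limit() is internal, so all of this may change as we reconcile them):

# Sketch: inspect the two paging-related settings discussed above;
# values/defaults may differ by bcdata version
getOption("bcdata.chunk_limit")
getOption("bcdata.single_download_limit")
bcdata:::bcdc_single_download_limit()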

ateucher commented 1 year ago

#332 is a start, but while working on it I managed to hit what I think is the daily limit that @gcperk ran into... After testing things a few times, I stopped being able to send any requests, even non-paginated ones.

ateucher commented 1 year ago

I am getting a 503 error with the content:

HTTP Status 503 – Service Unavailable


Type Status Report

Message Requested timeout out while waiting to be executed, please lower your request rate

Description The server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay.


Apache Tomcat/9.0.17

Nothing in the response headers indicating a maximum request rate...

ateucher commented 1 year ago

It feels like something here has been implemented, but I don't see it in the response headers

ateucher commented 1 year ago
curl -d "SERVICE=WFS&VERSION=2.0.0&REQUEST=GetFeature&outputFormat=application%2Fjson&typeNames=WHSE_BASEMAPPING.FWA_ASSESSMENT_WATERSHEDS_POLY&SRSNAME=EPSG%3A3005&CQL_FILTER=%28%22WATERSHED_GROUP_CODE%22%20%3D%20%27PCEA%27%20AND%20%22GNIS_NAME_1%22%20%3D%20%27Carbon%20Creek%27%29" https://openmaps.gov.bc.ca/geo/pub/wfs -v
*   Trying 142.34.140.41:443...
* Connected to openmaps.gov.bc.ca (142.34.140.41) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=CA; ST=British Columbia; L=Victoria; O=Government of the Province of British Columbia; CN=*.data.gov.bc.ca
*  start date: Sep 28 22:33:08 2022 GMT
*  expire date: Oct 16 22:33:08 2023 GMT
*  subjectAltName: host "openmaps.gov.bc.ca" matched cert's "openmaps.gov.bc.ca"
*  issuer: C=US; O=Entrust, Inc.; OU=See www.entrust.net/legal-terms; OU=(c) 2012 Entrust, Inc. - for authorized use only; CN=Entrust Certification Authority - L1K
*  SSL certificate verify ok.
* using HTTP/1.1
> POST /geo/pub/wfs HTTP/1.1
> Host: openmaps.gov.bc.ca
> User-Agent: curl/8.1.2
> Accept: */*
> Content-Length: 272
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200
< Date: Tue, 26 Sep 2023 22:34:20 GMT
< Server: Apache
< X-Frame-Options: allow-from (null)
< X-Control-flow-delay-ms: 0
< Content-Disposition: inline; filename=geoserver-GetFeature.application
< Content-Type: application/json;charset=UTF-8
< Set-Cookie: GS_FLOW_CONTROL=GS_CFLOW_39c22f56:18ad3a02f18:19d
< Access-Control-Allow-Origin: (null)
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Methods: POST, GET, OPTIONS, HEAD
< Access-Control-Allow-Headers: X-Requested-With, Referer, Origin, Content-Type, SOAPAction, Authorization, Accept
< Access-Control-Max-Age: 1000
< X-Geo-Cache: miss
< Transfer-Encoding: chunked
<
{"type":"FeatureCollection","features":[{"type":"Feature","id":"WHSE_BASEMAPPING.FWA_ASSESSMENT_WATERSHEDS_POLY.13077","geometry":{"type":"Polygon","coordinates":[[[1211935.117,1225741.596],[1211913.934,1225772.707],[1211890.264,1225802.264],[1211867.834,1225828.75],[1211856.424,1225854.475],[1211839.312,1225881.187],[1211816.498,1225908.883],[1211786.671,1225937.065],[1211757.925,1225963.284],[1211741.258,1225982.096],[1211730.192,1226004.937],[1211724.727,1226031.805],[1211715.233,1226085.714],[1211698.639,1226144.788],[1211672.329,1226207.912],[1211649.579,1226252.451],[1211633.453,1226292.586],[1211627.789,1226308.405],[1211622.041,1226326.229],[1211611.223,1226351.086],[1211602.695,1226374.257],[1211596.884,1226401.445],[1211590.754,1226428.286],[1211584.109,1226459.453],[1211579.591,1226479.67],[1211577.201,1226488.937],[1211572.069,1226515.82],[1211567.679,1226548.755],[1211566.6,1226597.89],[1211566.814,1226640.053],[1211568.377,1226676.586],[1211572.157,1226710.536],
...
smnorris commented 1 year ago

Yes, I can make a single-page request repeatedly without issue now, but I was not able to 3 hours ago.

ateucher commented 1 year ago

Interesting, maybe they've made a change already!

gcperk commented 1 year ago

@ateucher - I just retried the reprex from @rplewes and I'm still getting a pagination error. Could it be how it is paged up? i.e. 1040 records = 2 pages... but should it now be 4 pages with the 500 limit?

image
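
(For reference, the expected page count is just a ceiling division over the page size; a quick sanity check, not bcdata internals:)

# 1040 records at the new 500-record page size vs the old 10000-record one
ceiling(1040 / 500)
#> [1] 3
ceiling(1040 / 10000)
#> [1] 1

So 1040 records should come through as 3 paginated requests with the new limit.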

stephhazlitt commented 1 year ago

@gcperk you need to install the branch to get the PR fix (until it is merged):

remotes::install_github("bcgov/bcdata", ref = "fix-download-chunk-limit")

That fixes the pagination, but I am still erroring on the large request 😔.

library(bcdata)
#> 
#> Attaching package: 'bcdata'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

big_wat <- bcdc_query_geodata(record = "97d8ef37-b8d2-4c3b-b772-6b25c1db13d0") |>
  filter(WATERSHED_GROUP_CODE == "PCEA" &
           GNIS_NAME_1 == "Carbon Creek") |>
  collect()

small_wat <- big_wat |> slice(1) # only select 1 watershed

streams_small <- bcdc_query_geodata(record = "92344413-8035-4c08-b996-65a9b3f62fca") |>
  filter(INTERSECTS(small_wat)) |>
  collect()

streams_big <- bcdc_query_geodata(record = "92344413-8035-4c08-b996-65a9b3f62fca") |>
  filter(INTERSECTS(big_wat)) |>
  collect()
#> This object has 1040 records and requires 3 paginated requests to complete.
#> Retrieving data
#> Error: There was an issue sending this WFS request
#> 3 paginated requests issued

Created on 2023-09-26 with reprex v2.0.2

gcperk commented 1 year ago

Cool, thanks @stephhazlitt, so it seems it is more than the pagination error?

bruce-tonkin commented 1 year ago

In the last two weeks the warehouse has been experiencing very high loads due to requests originating from GeoServer. The issue was identified when all the temporary space was being consumed by iterated requests downloading 10 000 records at a time. Unfortunately, the logs do not identify the clients making those requests. Further investigation identified a configuration error in GeoServer that was allowing these large requests... which should have been 500 from day one. This is potentially the cause of most, if not all, of your issues. My suggestions going forward:

  1. Read the capabilities document for where this is set (https://openmaps.gov.bc.ca/geo/ows?service=wfs&version=2.0.0&request=GetCapabilities), then
  2. Iterate through the dataset based on the Constraint Default Value (see the sketch after this list):
    
    <ows:Constraint name="CountDefault">
    <ows:NoValues/>
    <ows:DefaultValue>500</ows:DefaultValue>
    </ows:Constraint>
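
A minimal sketch of that approach in R (assuming the xml2 package; the ows namespace prefix is declared in the capabilities document itself):

library(xml2)

# Step 1: read the capabilities document and pull out the CountDefault value
caps <- read_xml("https://openmaps.gov.bc.ca/geo/ows?service=wfs&version=2.0.0&request=GetCapabilities")
count_default <- as.integer(xml_text(xml_find_first(
  caps, "//ows:Constraint[@name='CountDefault']/ows:DefaultValue"
)))

# Step 2: use count_default as the page size when iterating startIndex
count_default
#> [1] 500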
ateucher commented 1 year ago

Thanks @bruce-tonkin. We have implemented your suggestion; however, there appears to be a per-user rate limit set. 500 is a very low limit of features per request, requiring a lot of paging for even a moderate dataset. Our users (and I in my testing) are hitting some rate limit quite quickly (a 503 HTTP error), I think due to the high number of paged requests required. Looking at the GeoServer docs here, it looks like any rate limits should be communicated to the user via a response header (e.g., X-Rate-Limit-*); however, there is no indication in the response headers:

$ curl --head https://openmaps.gov.bc.ca/geo/pub/wfs
HTTP/1.1 400
Date: Wed, 27 Sep 2023 16:31:02 GMT
Server: Apache
X-Frame-Options: allow-from (null)
Content-Type: application/xml
Content-Length: 600
Access-Control-Allow-Origin: (null)
Access-Control-Allow-Credentials: true
Access-Control-Allow-Methods: POST, GET, OPTIONS, HEAD
Access-Control-Allow-Headers: X-Requested-With, Referer, Origin, Content-Type, SOAPAction, Authorization, Accept
Access-Control-Max-Age: 1000
X-Geo-Cache: miss
Connection: close
rustyjux commented 1 year ago

Hi @ateucher - another DataBC'er over here. We do indeed have the control flow module implemented on GeoServer.

I am looking into why the control-flow response headers are not being included. You can actually see them if you make requests to delivery.openmaps:

X-Frame-Options: allow-from (null)
X-Rate-Limit-Context: wfs.getfeature
X-Rate-Limit-Limit: 2
X-Rate-Limit-Remaining: 1
X-Rate-Limit-Reset: Wed, 27 Sep 2023 17:23:05 GMT
X-Rate-Limit-Action: Reject excess requests
X-Control-flow-delay-ms: 0

For your reference, this is the current configuration for production GeoServer (openmaps.gov.bc.ca/geo). For the specific request types, I believe these are the max requests per second:

# hold/delay requests for up to 60 s
timeout=60

# don't allow the execution of more than 10 requests total in parallel
ows.global=10

ows.wms.GetCapabilities=1
ows.wms.GetMap=3
ows.wms.GetFeatureInfo=3
ows.wms.DescribeLayer=3

ows.wfs.GetCapabilities=1
ows.wfs.DescribeFeatureType=3
ows.wfs.GetFeature=3

# don't allow more than 4 outputs with Excel output as it's memory bound
ows.wfs.getfeature.application/msexcel=2

user=5
ip=5

I can't really speak to the origin of this config or how set in stone it is.

bruce-tonkin commented 1 year ago

Hi @ateucher, I have just been on a call with @stephhazlitt and we are reviewing the contents of the control-flow properties file. However, based on the information above, you should have a clear picture of what is currently set.

smnorris commented 1 year ago

Thanks @bruce-tonkin @rustyjux - this is great info.

I'll hold off making this fix until we know what the limits will be going forward.

As I mentioned to @rustyjux on JIRA, a file-based option for automated downloads of many thousands of records would be great to minimize maintenance and system loads - I'd much prefer to no longer maintain bcdata (python) and make zero WFS requests.

stephhazlitt commented 1 year ago

I am trying to understand these lines:

user=5
ip=5

👆 These set the maximum number of requests a single user (or a single IP) can execute in parallel.

bcdata is not doing anything in parallel (IIUC), so these should not be throttling us.

ows.wfs.GetCapabilities=1
ows.wfs.DescribeFeatureType=3
ows.wfs.GetFeature=3

👆 Are these counts or maximums per second? I am trying to understand why we would see this new behaviour with the feature limit change, but had never seen it before.

ateucher commented 1 year ago

After issuing a few queries this morning with fewer than 500 records, I am now getting the 503 error, even issuing a query with resultType=hits (i.e., not even trying to download features).

Would it help to search your logs if you know the user-agent? We set it to "https://github.com/bcgov/bcdata/".

Also, we are generally issuing POST requests, if that information helps at all.
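
For anyone else testing, here is roughly the kind of minimal hits-only request I mean (an httr sketch with our user agent; not bcdata's exact internals, which use POST):

library(httr)

# resultType=hits asks the server for a count only, with no features returned,
# so any 503 here is pure throttling rather than payload size
resp <- GET(
  "https://openmaps.gov.bc.ca/geo/pub/wfs",
  query = list(
    service = "WFS",
    version = "2.0.0",
    request = "GetFeature",
    typeNames = "WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW",
    resultType = "hits"
  ),
  user_agent("https://github.com/bcgov/bcdata/")
)
status_code(resp)  # 503 when throttled, 200 otherwise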

ateucher commented 1 year ago

don't allow the execution of more than 10 requests total in parallel ows.global=10

I am naïve on this, but this setting seems low if it is truly global (i.e., only process 10 total requests of 500 records at once)? Could we be getting queued, and then timing out because the queue is long?

ateucher commented 1 year ago

And in support of the timeout hypothesis, my requests appear to be failing at just over 60s.

ateucher commented 1 year ago

@stephhazlitt gave me a query of exactly 500 records to try. I ran it a few times (capturing the timings); it failed after 60s sometimes and worked other times, fairly randomly:

$ curl -o /dev/null -v -s -w 'time_namelookup:  %{time_namelookup}s\ntime_connect:  %{time_connect}s\ntime_appconnect:  %{time_appconnect}s\ntime_pretransfer:  %{time_pretransfer}s\ntime_redirect:  %{time_redirect}s\ntime_starttransfer:  %{time_starttransfer}s\n----------\ntime_total:  %{time_total}s\n' https://openmaps.gov.bc.ca/geo/ows\?service\=WFS\&version\=2.0\&request\=GetFeature\&typeName\=pub:WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW\&outputFormat\=application/json\&srsname\=EPSG:4326\&startIndex\=0\&count\=500\&sortBy\=WELL_TAG_NUMBER

*   Trying 142.34.140.41:443...
* Connected to openmaps.gov.bc.ca (142.34.140.41) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [323 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [21 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4513 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=CA; ST=British Columbia; L=Victoria; O=Government of the Province of British Columbia; CN=*.data.gov.bc.ca
*  start date: Sep 28 22:33:08 2022 GMT
*  expire date: Oct 16 22:33:08 2023 GMT
*  subjectAltName: host "openmaps.gov.bc.ca" matched cert's "openmaps.gov.bc.ca"
*  issuer: C=US; O=Entrust, Inc.; OU=See www.entrust.net/legal-terms; OU=(c) 2012 Entrust, Inc. - for authorized use only; CN=Entrust Certification Authority - L1K
*  SSL certificate verify ok.
* using HTTP/1.1
> GET /geo/ows?service=WFS&version=2.0&request=GetFeature&typeName=pub:WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW&outputFormat=application/json&srsname=EPSG:4326&startIndex=0&count=500&sortBy=WELL_TAG_NUMBER HTTP/1.1
> Host: openmaps.gov.bc.ca
> User-Agent: curl/8.1.2
> Accept: */*
>
< HTTP/1.1 503
< Date: Wed, 27 Sep 2023 20:24:24 GMT
< Server: Apache
< X-Frame-Options: allow-from (null)
< X-Control-flow-delay-ms: 59999
< Content-Type: text/html;charset=utf-8
< Content-Language: en
< Content-Length: 1201
< Access-Control-Allow-Origin: (null)
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Methods: POST, GET, OPTIONS, HEAD
< Access-Control-Allow-Headers: X-Requested-With, Referer, Origin, Content-Type, SOAPAction, Authorization, Accept
< Access-Control-Max-Age: 1000
< X-Geo-Cache: miss
< Connection: close
<
{ [1201 bytes data]
* Closing connection 0
time_namelookup:  0.005707s
time_connect:  0.022221s
time_appconnect:  0.043487s
time_pretransfer:  0.043531s
time_redirect:  0.000000s
time_starttransfer:  60.063323s
----------
time_total:  60.063967s
stephhazlitt commented 1 year ago

I have a hypothesis. I wonder if dropping the feature limit from 10,000 to 500 means that the same number of users+queries (or general traffic) now takes much longer, with many more requests per job, which means we are hitting this global threshold ows.global=10 when we were not before. And if the queue is longer than 60s (timeout=60), the query fails with a 503.
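
Back-of-the-envelope, the page-size change alone is a ~20x amplification in request volume for the same workload:

# requests needed to fetch 10000 records, before vs after the change
10000 / 10000  # before: 1 request
#> [1] 1
10000 / 500    # after: 20 requests
#> [1] 20

So the same client traffic now means roughly 20 times as many requests competing for the ows.global=10 slots.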

ateucher commented 1 year ago

Here was a successful one (same query):

$ curl -o /dev/null -v -s -w 'time_namelookup:  %{time_namelookup}s\ntime_connect:  %{time_connect}s\ntime_appconnect:  %{time_appconnect}s\ntime_pretransfer:  %{time_pretransfer}s\ntime_redirect:  %{time_redirect}s\ntime_starttransfer:  %{time_starttransfer}s\n----------\ntime_total:  %{time_total}s\n' https://openmaps.gov.bc.ca/geo/ows\?service\=WFS\&version\=2.0\&request\=GetFeature\&typeName\=pub:WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW\&outputFormat\=application/json\&srsname\=EPSG:4326\&startIndex\=0\&count\=500\&sortBy\=WELL_TAG_NUMBER

*   Trying 142.34.140.41:443...
* Connected to openmaps.gov.bc.ca (142.34.140.41) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [323 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [21 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [4513 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=CA; ST=British Columbia; L=Victoria; O=Government of the Province of British Columbia; CN=*.data.gov.bc.ca
*  start date: Sep 28 22:33:08 2022 GMT
*  expire date: Oct 16 22:33:08 2023 GMT
*  subjectAltName: host "openmaps.gov.bc.ca" matched cert's "openmaps.gov.bc.ca"
*  issuer: C=US; O=Entrust, Inc.; OU=See www.entrust.net/legal-terms; OU=(c) 2012 Entrust, Inc. - for authorized use only; CN=Entrust Certification Authority - L1K
*  SSL certificate verify ok.
* using HTTP/1.1
> GET /geo/ows?service=WFS&version=2.0&request=GetFeature&typeName=pub:WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW&outputFormat=application/json&srsname=EPSG:4326&startIndex=0&count=500&sortBy=WELL_TAG_NUMBER HTTP/1.1
> Host: openmaps.gov.bc.ca
> User-Agent: curl/8.1.2
> Accept: */*
>
< HTTP/1.1 200
< Date: Wed, 27 Sep 2023 20:22:56 GMT
< Server: Apache
< X-Frame-Options: allow-from (null)
< X-Control-flow-delay-ms: 0
< Content-Disposition: inline; filename=geoserver-GetFeature.application
< Content-Type: application/json;charset=UTF-8
< Set-Cookie: GS_FLOW_CONTROL=GS_CFLOW_47974d1c:18ad84b04f5:7452
< Access-Control-Allow-Origin: (null)
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Methods: POST, GET, OPTIONS, HEAD
< Access-Control-Allow-Headers: X-Requested-With, Referer, Origin, Content-Type, SOAPAction, Authorization, Accept
< Access-Control-Max-Age: 1000
< X-Geo-Cache: miss
< Transfer-Encoding: chunked
<
{ [7552 bytes data]
* Connection #0 to host openmaps.gov.bc.ca left intact
time_namelookup:  0.003802s
time_connect:  0.018224s
time_appconnect:  0.035458s
time_pretransfer:  0.035504s
time_redirect:  0.000000s
time_starttransfer:  0.534079s
----------
time_total:  1.081071s
bruce-tonkin commented 1 year ago

Looks like all the activity discussed here has GeoServer in a pickle. It has recycled a number of times since noon today. If the number is rolled back, can I ask all those on this thread to give it a workout and determine if @stephhazlitt's synopsis is correct?

stephhazlitt commented 1 year ago

Yep, same here; it took 3 tries for success. I am putting my eggs in the basket of my comment above re: the issue. This is somewhat supported by the fact that the example docs use ows.global=100, so 10 seems very low imo.

shazlitt@NHM500154 ~ % curl -o /dev/null -v -s -w 'time_namelookup:  %{time_namelookup}s\ntime_connect:  %{time_connect}s\ntime_appconnect:  %{time_appconnect}s\ntime_pretransfer:  %{time_pretransfer}s\ntime_redirect:  %{time_redirect}s\ntime_starttransfer:  %{time_starttransfer}s\n----------\ntime_total:  %{time_total}s\n' https://openmaps.gov.bc.ca/geo/ows\?service\=WFS\&version\=2.0\&request\=GetFeature\&typeName\=pub:WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW\&outputFormat\=application/json\&srsname\=EPSG:4326\&startIndex\=0\&count\=500\&sortBy\=WELL_TAG_NUMBER
*   Trying 142.34.140.41:443...
* Connected to openmaps.gov.bc.ca (142.34.140.41) port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* [CONN-0-0][CF-SSL] (304) (OUT), TLS handshake, Client hello (1):
} [323 bytes data]
* [CONN-0-0][CF-SSL] (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* [CONN-0-0][CF-SSL] (304) (IN), TLS handshake, Unknown (8):
{ [21 bytes data]
* [CONN-0-0][CF-SSL] (304) (IN), TLS handshake, Certificate (11):
{ [4513 bytes data]
* [CONN-0-0][CF-SSL] (304) (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* [CONN-0-0][CF-SSL] (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* [CONN-0-0][CF-SSL] (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=CA; ST=British Columbia; L=Victoria; O=Government of the Province of British Columbia; CN=*.data.gov.bc.ca
*  start date: Sep 28 22:33:08 2022 GMT
*  expire date: Oct 16 22:33:08 2023 GMT
*  subjectAltName: host "openmaps.gov.bc.ca" matched cert's "openmaps.gov.bc.ca"
*  issuer: C=US; O=Entrust, Inc.; OU=See www.entrust.net/legal-terms; OU=(c) 2012 Entrust, Inc. - for authorized use only; CN=Entrust Certification Authority - L1K
*  SSL certificate verify ok.
> GET /geo/ows?service=WFS&version=2.0&request=GetFeature&typeName=pub:WHSE_WATER_MANAGEMENT.GW_WATER_WELLS_WRBC_SVW&outputFormat=application/json&srsname=EPSG:4326&startIndex=0&count=500&sortBy=WELL_TAG_NUMBER HTTP/1.1
> Host: openmaps.gov.bc.ca
> User-Agent: curl/7.87.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 
< Date: Wed, 27 Sep 2023 20:38:40 GMT
< Server: Apache
< X-Frame-Options: allow-from (null)
< X-Control-flow-delay-ms: 0
< Content-Disposition: inline; filename=geoserver-GetFeature.application
< Content-Type: application/json;charset=UTF-8
< Set-Cookie: GS_FLOW_CONTROL=GS_CFLOW_-f74eee5:18ad85be68c:-41e4
< Access-Control-Allow-Origin: (null)
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Methods: POST, GET, OPTIONS, HEAD
< Access-Control-Allow-Headers: X-Requested-With, Referer, Origin, Content-Type, SOAPAction, Authorization, Accept
< Access-Control-Max-Age: 1000
< X-Geo-Cache: miss
< Transfer-Encoding: chunked
< 
{ [7551 bytes data]
* Connection #0 to host openmaps.gov.bc.ca left intact
time_namelookup:  0.008287s
time_connect:  0.208800s
time_appconnect:  0.462946s
time_pretransfer:  0.254334s
time_redirect:  0.000000s
time_starttransfer:  1.131660s
----------
time_total:  1.746482s
Bevann commented 1 year ago

Also running into these problems, even when trying for a single record from parks. What is a little strange is that Shiny apps that use bcdata are still working.

rustyjux commented 1 year ago

Yeah, I see that we used to have ows.global = 100 and I'm not sure why it was dropped (server resources maxing out, I'd wager). These configs, I think, apply to each GeoServer instance (there are 4 for public), but when the requests are chopped up so small, I can see how 40 still isn't a lot.

Maybe go easy on the 'workout' - it would be nice to see a return to basic functionality before jumping into load testing.

smnorris commented 1 year ago

If the default total request load could be up to 4M features (4 instances × 100 concurrent users × 10k features) direct to BCGW, with no caching layer somewhere, I can see this issue being a bit tough to manage 😳

stephhazlitt commented 1 year ago

Also running into these problems, even when trying for a single record from parks. What is a little strange is that Shiny apps that use bcdata are still working.

@Bevann There could be some session caching maybe 🤷‍♀️ Or Shiny is only pulling data with a user click and you have been lucky so far 🤞

rustyjux commented 1 year ago

WFS has been up and down for the last two hours, so it could have been luck.

ateucher commented 1 year ago

Woot! Certainly looking better. Just ran three queries with up to 1040 features; they worked fast and without failure.

It didn't require paging, so you must have bumped the feature limit?
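
A quick way to check (assuming bcdc_single_download_limit() reflects the server's CountDefault; it is internal, so subject to change):

bcdata:::bcdc_single_download_limit()  # 10000 again if the limit was restored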

Maybe go easy on the 'workout' - it would be nice to see a return to basic functionality before jumping into load testing.

Oops, sorry, saw this 👆 too late!

smnorris commented 1 year ago

All Python bcdata tests currently pass, and run in the expected timeframe.

gcperk commented 1 year ago

I did a small test and all is looking great (and super fast!!). I will hold off on any more downloads for a little bit! HUGE thanks to @ateucher, @stephhazlitt, @bruce-tonkin, @rustyjux, and @smnorris for the quick troubleshooting.

Perhaps a basic question which may have been answered above (sorry... all the ows code was above my head!) - is there still a daily limit on downloads per user? i.e. should I expect to run into more daily timeout 503 errors?

THANK_YOU!

ateucher commented 1 year ago

is there still a daily limit on downloads per user? i.e. should I expect to run into more daily timeout 503 errors?

I don't think there ever was, it was simply that the server was getting overwhelmed and causing timeouts from time to time.

bruce-tonkin commented 1 year ago

The feature limit has been reset back to the original value of 10000. However, if we run into database problems again, we will have to revisit this issue.

rustyjux commented 1 year ago

There are no daily limits - I think the limits are all per second.

stephhazlitt commented 1 year ago

Big thanks @bruce-tonkin and @rustyjux 🙌.

ateucher commented 1 year ago

Thanks so much @bruce-tonkin @rustyjux for your engagement on this - very much appreciated!

I'm going to close this now! 🍾 Thanks All!

bruce-tonkin commented 1 year ago

I have my fingers crossed. Hope this works.