gbif / gbif-api

Support very long (≥8k) search requests #35

Open MattBlissett opened 5 years ago

MattBlissett commented 5 years ago

Users would like to search using long polygons or many taxon keys. To support this with the current search API, a long (>8k character) URL must pass through:

  1. The user's web browser or other client
  2. Potentially a not-very-good proxy (corporate or education filter etc)
  3. Varnish
  4. a. occurrence-ws
     b. vectortile-ws / mapnik-server
  5. SOLR
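To make the size problem concrete, here is a minimal sketch that builds an occurrence search URL with repeated taxonKey parameters and checks it against the 8 KiB threshold mentioned in this issue. The taxon keys are made-up sequential numbers, and the helper names `build_search_url` and `exceeds_limit` are hypothetical, not part of any GBIF client.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
BASE_URL = "https://api.gbif.org/v1/occurrence/search"
# 8 KiB threshold discussed in this issue (assumed here to apply to the whole URL).
URL_LIMIT = 8 * 1024


def build_search_url(taxon_keys, geometry=None):
    """Build a search URL with one taxonKey parameter per key (hypothetical helper)."""
    params = [("taxonKey", key) for key in taxon_keys]
    if geometry is not None:
        # A WKT polygon is percent-encoded, which makes it even longer.
        params.append(("geometry", geometry))
    return f"{BASE_URL}?{urlencode(params)}"


def exceeds_limit(url, limit=URL_LIMIT):
    """Return True if the encoded URL is longer than the limit in bytes."""
    return len(url.encode("utf-8")) > limit


# Made-up keys for illustration only: 10 keys versus 1000 keys.
short_url = build_search_url(range(1000000, 1000010))
long_url = build_search_url(range(1000000, 1001000))

print(len(short_url), exceeds_limit(short_url))
print(len(long_url), exceeds_limit(long_url))
```

With seven-digit keys, each extra key adds 17 characters to the URL, so a few hundred keys are enough to cross the threshold before any polygon is added.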

4.a. is easily fixed for gbif-microservice; 4.b. can be fixed for Dropwizard with

  applicationConnectors:
    - type: http
      port: 7001
      maxRequestHeaderSize: 1MiB

although there are then issues somewhere in Jersey's regex handling.

Item 2 is probably OK, since using HTTPS should prevent most proxies from modifying the request.

Item 3 requires regexes in Varnish to use .*? rather than .* for the maps rules, and there's a related note in Varnish saying that needing this is "madness".

That leaves 1. There is a concern from a Jetty developer suggesting all of this is a bad idea, for compatibility and security.

So we need some way to communicate the search terms without using more than 8 KiB, at least for the website and API. We could:

MortenHofft commented 4 years ago

I too have looked at c and d and concluded they wouldn't work for us. I ended up thinking that b was the only reasonable way to do so: