archesproject / arches-koop

Arches Koop application

Improved ESRI client query support #19

Closed aj-he closed 2 years ago

aj-he commented 2 years ago

As it stands, the Koop service requests all data from the Arches GeoJSON API, which can lead to long wait times or timeouts when requesting data from models with a lot of geometries.

The service needs to be able to pass more of the geoservice query parameters through to the Arches GeoJSON API so that the data can be streamed back as required.

njkim commented 2 years ago

Have you tried adding the parameters as key-value pairs in development.json? https://github.com/archesproject/arches-koop/blob/master/README.md?plain=1#L18 It "should" work with all the JSON endpoint parameters.

aj-he commented 2 years ago

Hi @njkim. Yes, we are using a config to define the nodeid and nodegroups to pull back a subset of node data. One of our datasets has ~57,000 geometries, and generating the GeoJSON object to contain this data takes 4m55s on average on a high-spec dev box with no other processes running on it.

While we have configured the cache, it takes so long to generate the cache in the first place (or refresh it if it has expired) that the users are left waiting an unacceptable amount of time.

Also, for ArcGIS Pro projects with all configured layers in place, if you open the project after the cache has expired, the project layers will fail.

If they were non-dynamic layers then I could regenerate the cache nightly with no problem, but users are editing the layers using the Arches ESRI add-in and will want to see the mapping update in the client within a relatively short space of time.

I am working on supporting a number of the ESRI client request parameters via pass-through, rather than relying on a complete cache of the data.
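To illustrate what such a pass-through could look like, here is a minimal sketch in Python (not the actual arches-koop code, which is a Node application). It maps three real Geoservices query parameters (`geometry`, `resultOffset`, `resultRecordCount`) onto upstream parameters. The upstream names `bbox`, `offset` and `limit`, the function names, and the example URL are all assumptions made for illustration; the Arches GeoJSON API may not accept them as written.

```python
from urllib.parse import urlencode


def translate_esri_query(esri_params: dict) -> dict:
    """Map a subset of Geoservices query parameters to upstream parameters.

    The upstream names (bbox, offset, limit) are hypothetical.
    """
    upstream = {}

    # Only the simple envelope syntax "xmin,ymin,xmax,ymax" is handled here;
    # the JSON geometry form is out of scope for this sketch.
    geometry = esri_params.get("geometry")
    geometry_type = esri_params.get("geometryType", "esriGeometryEnvelope")
    if geometry and geometry_type == "esriGeometryEnvelope":
        parts = [p.strip() for p in geometry.split(",")]
        if len(parts) != 4:
            raise ValueError("expected envelope as xmin,ymin,xmax,ymax")
        for p in parts:
            float(p)  # raises ValueError if a coordinate is not numeric
        upstream["bbox"] = ",".join(parts)

    # Paging parameters sent by ESRI clients, passed on as integers.
    if "resultOffset" in esri_params:
        upstream["offset"] = int(esri_params["resultOffset"])
    if "resultRecordCount" in esri_params:
        upstream["limit"] = int(esri_params["resultRecordCount"])

    return upstream


def build_upstream_url(base_url: str, fixed_params: dict, esri_params: dict) -> str:
    """Combine configured parameters (e.g. nodeid) with translated client parameters."""
    merged = {**fixed_params, **translate_esri_query(esri_params)}
    return f"{base_url}?{urlencode(merged)}"
```

With this approach each client request would fetch only the extent and page it asks for, instead of waiting for the full dataset to be generated and cached.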

aj-he commented 2 years ago

I found that, to make things more performant, we needed to use pass-through and avoid the cache capability. This meant that a lot of functionality had to be written into the GeoJSON API, and this became very complex.

To simplify things, I am instead going to investigate pg_featureserv as an alternative for large datasets. Closing this ticket for now.