koopjs / winnow

Deprecated
Apache License 2.0
90 stars 18 forks source link

Avoid ObjectID collision in subsequent requests from a given client. #67

Closed rgwozdz closed 6 years ago

rgwozdz commented 6 years ago

In response to https://github.com/koopjs/FeatureServer/issues/82

If a feature does not already have an id field, winnow assigns one. It has, up to this point, used the iterator from a loop for this assignment. The problem with this is that two different features may get assigned the same ID on separate requests (e.g., panning the map which filters with bounding box). And since clients may cache data according to IDs, this can lead to faulty rendering (e.g., point doesn't get rendered because it has an ID present in the cache).

The alternative presented here is to create a random numeric prefix to that iterator on each request. ArcGIS Object IDs have a limit of 2147483647. So to create a prefix, I trim digits from that limit based on the length of the maximum iterator value, subtract one from the result, and use that value as a range to select a random number. The result become a prefix to the loop's iterator, which is assigned as the value of the feature's Object ID.

jptheripper commented 6 years ago

Be careful with this approach. Clients, such as WAB, use objectid to make requests when using the attribute table. So if you assign an objectid to a feature, you have to be able to retrieve that feature by objectid at a latter time.

To address this i used farmhash32 to hash the feature. That generates a 32bit number that is ESRI compatible. I then have a json object that stores the hash and feature key in a key:value pair so i can look up later.

rgwozdz commented 6 years ago

@jptheripper - thanks for the comment. That sounds like a problem. Given that, this isn't ready

rgwozdz commented 6 years ago

@jptheripper - Is this what you are referring to? https://www.npmjs.com/package/farmhash. Unfortuntately that produces a 32bit unsigned integer, and as such the value can be larger then ArcGIS allows for OBJECTID (2,147,483,648).

jptheripper commented 6 years ago

yes, i was using the hash32 method which returns a 32bit int. The max on a 32bit int should be 2,147,483,647 (2^31 − 1) if it were signed, which is my mistake. i wonder if that is causing the problem. thanks for catching that

rgwozdz commented 6 years ago

hash32 won't return a negative value because it returns unsigned 32bit ints. Unfortunately, that means it has a range of 0 - 4,294,967,295, which is greater than the ArcGIS's OBJECTID limit of 2,147,483,647 (see bottom of page at http://desktop.arcgis.com/en/arcmap/10.3/manage-data/tables/fundamentals-of-objectid-fields.htm).

jptheripper commented 6 years ago

I have been thinking about this. A random number or sequence would be perfectly fine as long as you store the number as a key and have the ability to look up the actual value. For cases with no unique value in the data the api would need a way to look up by something to get a singular object. And koop seems to need either Json object that can handle thousands to millions of keys or a lightweight database. That could even avoid the random conflict of hashing where two complicated objects generated the same hash value.

On Fri, Apr 6, 2018, 4:58 PM Rich Gwozdz notifications@github.com wrote:

hash32 won't return a negative value because it returns unsigned 32bit ints. Unfortunately, that means it has a range of 0 - 4,294,967,295, which is greater than the ArcGIS's OBJECTID limit of 2,147,483,647 (see bottom of page at http://desktop.arcgis.com/en/arcmap/10.3/manage-data/tables/fundamentals-of-objectid-fields.htm ).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/koopjs/winnow/pull/67#issuecomment-379376467, or mute the thread https://github.com/notifications/unsubscribe-auth/AKEroNCUheOsqxcAwAWF_1xijUM0Pfy3ks5tl9cKgaJpZM4TKHVh .

jptheripper commented 6 years ago

Sequence works. fixes issue in Pro, now on to next issue

rgwozdz commented 6 years ago

Cool, glad to here it. I've been thinking this through as well. I'm thinking of creating the sequence for datasources that need it, but doing it here-ish: https://github.com/koopjs/koop-core/blob/master/src/models/index.js#L12.

dmfenton commented 6 years ago

@rgwozdz If you're going to implement this in core, the solution needs to be generalized beyond what geoservices requires. I'd like to know more about your rationale.

rgwozdz commented 6 years ago

@dmfenton definitely hoping for input here. My thinking was we need to store OBJECTIDs in the cached data to support Pro features like "identify" and the attribute table's "zoom to" or "select." The "identify" feature uses OBJECTID from a feature on the map in a WHERE query back to the server. For that to work, the OBJECTIDs have to be persisted rather than issued anew on every request. For the "zoom to" feature to work on a table-row click, the OBJECTID of the row has to be found on the map. But unless the same feature-OBJECTID pairings are sent across requests, this might not happen (e.g., table gets loaded from a query with one set of IDs, but map gets loaded with a different query which has a different set of IDs).

Creating a feature hash as an OBJECTID would be an elegant way to solve this, but it can create out-of-range OBJECTIDs (signed vs unsigned 32bit ints).

So one alternative is to create the ArcGIS Pro-compliant OBJECTIDs prior to the cache upsert operation. This would mean that follow-up queries from clients could use WHERE filters on OBJECTIDs and that the attribute table would stay in sync with the map for as long and the cache was valid. You could possibly even maintain OBJECTIDs across different upserts with some sophisticated logic.

As you note, the downside is that this approach creates and uses OBJECTIDs in koop-core, where many koop implementations might not need them. So perhaps there is a configuration and generalization approach that could be applied? Like an option to "create signed-int compliant feature IDs on upsert"?

UPDATE: Might be a better option to add the OBJECTID creation inside the pull callback function of koop-output-geoservices; would just need to add logic and flags to only execute it when the cache is about to get refreshed.

jptheripper commented 6 years ago

Thank goodness you guys are handling this. Because my attempt managed to produce a Koop service that crashes Pro.

https://www.mymanatee.org/giskoop/mapillary/detections/FeatureServer/0

On Mon, Apr 9, 2018 at 9:08 AM, Rich Gwozdz notifications@github.com wrote:

@dmfenton https://github.com/dmfenton definitely hoping for input here. My thinking was we need to store OBJECTIDs in the cached data to support Pro features like "identify" and the attribute table's "zoom to" or "select." The "identify" feature uses OBJECTID from a feature on the map in a WHERE query back to the server. For that to work, the OBJECTIDs have to be persisted rather than issued anew on every request. For the "zoom to" feature to work on a table-row click, the OBJECTID of the row has to be found on the map. But unless the same feature-OBJECTID pairings are sent across requests, this might not happen (e.g., table gets loaded from a query with one set of IDs, but map gets loaded with a different query which has a different set of IDs).

Creating a feature hash as an OBJECTID would be an elegant way to solve this, but it can create out-of-range OBJECTIDs (signed vs unsigned 32bit ints).

So one alternative is to create the ArcGIS Pro-compliant OBJECTIDs prior to the cache upsert operation. This would mean that follow-up queries from clients could use WHERE filters on OBJECTIDs and that the attribute table would stay in sync with the map for as long and the cache was valid. You could possibly even maintain OBJECTIDs across different upserts with some sophisticated logic.

As you note, the downside is that this approach creates and uses OBJECTIDs in koop-core, where many koop implementations might not need them. So perhaps there is a configuration and generalization approach that could be applied? Like an option to "create signed-int compliant feature IDs on upsert"?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/koopjs/winnow/pull/67#issuecomment-379745705, or mute the thread https://github.com/notifications/unsubscribe-auth/AKEroAzx24RKWwjf7XG0KW3G0wc0uFqlks5tm11ngaJpZM4TKHVh .

jptheripper commented 6 years ago

disregard my wild goose chase.

jptheripper commented 6 years ago

I can confirm this solves 95% of my issues. Using a sequential Objectid with a lookup table, and disabling the cache, fixes rendering in Pro. The Attribute table also works

The functionality I am currently missing is popups/labels. Clicking on a feature generates a query, koop returns a single feature in a featurecollection, but i never see the popup. And i cannot seem to label. But massive progress