digidem / osm-p2p-server

Peer-to-peer OpenStreetMap API v0.6 Server for osm-p2p-db
BSD 2-Clause "Simplified" License
87 stars 10 forks source link

Fix a "ghost points" scenario #27

Closed hackergrrl closed 7 years ago

hackergrrl commented 7 years ago

This happens where a map query is made with 'forks' disabled, meaning only the latest version of each OSM element is returned.

In some cases an OSM 'way' that is excluded will reference OSM nodes that are not present in the way that is ultimately returned. Prior to this commit, those ways' unreferenced nodes would be returned regardless, resulting in the final map being peppered with unlabeled "ghost points", with no way referencing them.

This fixes the issue by tracking which nodes are referenced to by ways that are excluded, and filtering them out of the final result.

hackergrrl commented 7 years ago
  1. Moved the filter piece to the API layer.
  2. Moved the filter logic into its own module in lib/.
  3. Added unit tests for (2).
  4. Simplified filtering logic to hopefully be more performant.
gmaclennan commented 7 years ago

I don't know that much, other than having a PR comment from MapBox for including delete. The issue I think is with using it in a loop, so performance impacts can add up. There is this:

http://stackoverflow.com/questions/27397034/why-is-delete-slow-in-javascript

And this with good discussion:

https://github.com/google/google-api-nodejs-client/issues/375

I think because this object could be very large for a large forked way, and could be looped hundreds of times, this is a time where it is worth considering this kind of performance

On Jan 11, 2017, at 12:35 AM, Stephen Whitmore notifications@github.com wrote:

@noffle commented on this pull request.

In routes/misc_map.js https://github.com/digidem/osm-p2p-server/pull/27:

+

  • // Note that all of the nodes referenced by this way should be culled.
  • if (element.type === 'way') {
  • element.nodes.forEach(function (ref) {
  • excludeNodeRefs[ref] = true
  • })
  • }
  • return false
  • }
  • })
  • // Remove excluded entries that appear in the keep entries.
  • Object.keys(keepNodeRefs).forEach(function (ref) {
  • if (excludeNodeRefs[ref]) {
  • delete excludeNodeRefs[ref]

Removed; though I'm still curious to understand the performance concerns.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/digidem/osm-p2p-server/pull/27, or mute the thread https://github.com/notifications/unsubscribe-auth/AARumRcRCLyXXTHMH2iqafh4_sGAC9npks5rRCO6gaJpZM4Le9US .

hackergrrl commented 7 years ago

I added some more tests to cover the scenario where there are two forks to a way: one modifies a point, one deletes a point.

gmaclennan commented 7 years ago

Sorry to report bad news... Unfortunately there seems to be some kind of race condition here, and test/ghost_points.js fails about 1 in 3 times:

not ok 19 should be equal
  ---
    operator: equal
    expected: 2
    actual:   3
    at: ConcatStream.<anonymous> (/Users/gregor/Dev/DdDev/osm-p2p-server/node_modules/concat-stream/index.js:36:43)
  ...
not ok 20 should be equivalent
  ---
    operator: deepEqual
    expected: |-
      [ '2864043637183826736', '9003328637408641236' ]
    actual: |-
      [ '2864043637183826736', '5274879986653545291', '9003328637408641236' ]
    at: ConcatStream.<anonymous> (/Users/gregor/Dev/DdDev/osm-p2p-server/node_modules/concat-stream/index.js:36:43)
  ...
ok 21 undefined

Unfortunately setting env OSM_P2P_DB_DELAY=200 does not seem to reliably reproduce this bug, which it is supposed to.

gmaclennan commented 7 years ago

I added slowdb() to the tests but still not reliably reproducing. This race condition may be coming from somewhere else, maybe in the test setup itself.

gmaclennan commented 7 years ago

Still getting random errors...

Without fully parsing your code, I suspect that this may be due to unstable sorting of forks because you are creating points directly (i.e. not with the osm-server api) without timestamps, so the sorting will be based on version numbers rather than timestamps.

gmaclennan commented 7 years ago

Ok, adding timestamps fixes this. I'm going to merge and cut a release so we can work with this today, but @noffle can you review this test once more to check I'm not missing anything with this fix?