adamfranco / curvature

Find roads that are the most curvy or twisty based on Open Street Map (OSM) data.
http://roadcurvature.com/

Refactor stage 1: collect and "Join" #21

Closed adamfranco closed 8 years ago

adamfranco commented 8 years ago

This is a sub-task of #20, Refactor collecting, preprocessing, and curvature calculation.

Stage 1: Collect and "join"

This first stage would be pretty much the raw OSM data, but with the coordinates in-lined and the joined ways ordered in a collection that can be examined as a unit. Filtering on highway type would probably be the main way to avoid including buildings, boundaries, sidewalks, hiking paths, etc. Here's an example of what this data might look like:

    [ # Overall collection/stream of items, order doesn't matter.
      [ # One item, an ordered sequence of joined ways where the last
        # node/point of each way is the same as the first node/point of the next.
        {
          'id': 19730334,
          'tags': {
            'name': 'Northwood Drive',
            'surface': 'paved',
            'type': 'residential',
            # ... other tags on the way ...
            'county': 'Washington, VT',
          },
          'coords': [
            {
              'id': 1000,
              'lat': 43.78561799999995,
              'lon': -72.29485699999982,
            },
            # ...
            {
              'id': 1020,
              'lat': 43.78561800000000,
              'lon': -72.2948550000000,
            },
          ]
        },
        {
          'id': 19730336,
          'tags': {
            'name': 'Northwood Drive',
            'surface': 'unknown',
            'type': 'residential',
            # ... other tags on the way ...
            'county': 'Washington, VT',
          },
          'coords': [
            {
              'id': 1020,
              'lat': 43.78561800000000,
              'lon': -72.2948550000000,
            },
            {
              'id': 1021,
              'lat': 43.78561800000012,
              'lon': -72.2948550000011,
            },
            # ...
          ]
        },
        # ...
      ],
      # ...
    ]
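
As a rough illustration of the joining step on data shaped like the example above, here is a minimal sketch. It assumes ways are only joined head-to-tail in their stored direction, ignores tags such as the name, and simply takes the first candidate at a fork. `join_ways` is a hypothetical name, not the project's actual function.

```python
def join_ways(ways):
    """Greedily chain ways whose last node id equals the next way's first node id.

    Assumption: each way is a dict with 'id' and a non-empty 'coords' list of
    dicts that carry a node 'id'. Ways are never reversed in this sketch.
    """
    ways = list(ways)

    # Index ways by the id of their first node.
    by_start = {}
    for way in ways:
        by_start.setdefault(way['coords'][0]['id'], []).append(way)

    # Node ids at which some way ends; a way starting there is not a chain head.
    end_ids = {way['coords'][-1]['id'] for way in ways}
    heads = [w for w in ways if w['coords'][0]['id'] not in end_ids]

    used = set()
    collection = []
    # Visit chain heads first, then sweep up leftovers such as closed loops.
    for start in heads + ways:
        if start['id'] in used:
            continue
        chain = [start]
        used.add(start['id'])
        while True:
            last_node = chain[-1]['coords'][-1]['id']
            candidates = [w for w in by_start.get(last_node, [])
                          if w['id'] not in used]
            if not candidates:
                break
            # At a fork the choice is arbitrary here: take the first candidate.
            chain.append(candidates[0])
            used.add(candidates[0]['id'])
        collection.append(chain)
    return collection


way_a = {'id': 19730334, 'coords': [{'id': 1000}, {'id': 1020}]}
way_b = {'id': 19730336, 'coords': [{'id': 1020}, {'id': 1021}]}
joined = join_ways([way_b, way_a])  # input order should not matter
```

With the two example ways, `joined` holds a single item containing both ways in travel order, regardless of the order they were supplied in.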
Fonsan commented 8 years ago

I can have a go at this, but perhaps this feature is "clean" enough to warrant extracting into a separate project eventually; I am guessing many projects could benefit from this functionality. It would be extremely cool if the output could be an ordered .pbf file in addition to msgpack.

adamfranco commented 8 years ago

I was thinking the same thing, Eric. 😉 That said, the "joining" aspect of this is a little more particular to the curvature use-case even if in-lining coord-data would be broadly useful. My preference right now is to get it working with msgpack so that the rest of the processing chain can be refactored, then later on extract it to a stand-alone thing if that is useful.

adamfranco commented 8 years ago

This is now working in a basic way; however, I'd like to add unit tests that validate the operation of the algorithm a bit more, so that I can feel confident that changes aren't breaking anything.

There are also a few edge cases (like indeterminate joining at forks, circular ways) that would be good to test.
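
To make those edge cases concrete for tests, helpers that flag them might look something like the sketch below. `is_circular` and `fork_nodes` are hypothetical names, and the definition of a fork used here (a node that is an endpoint of more than two ways) is an assumption rather than the collector's actual rule.

```python
from collections import Counter


def is_circular(way):
    """True if the way starts and ends on the same node (a closed loop)."""
    coords = way['coords']
    return len(coords) > 1 and coords[0]['id'] == coords[-1]['id']


def fork_nodes(ways):
    """Return the ids of nodes that are an endpoint of more than two ways.

    At such a node there is more than one possible continuation, so the
    order in which ways get joined is indeterminate.
    """
    counts = Counter()
    for way in ways:
        # A set, so a circular way counts its shared endpoint only once.
        counts.update({way['coords'][0]['id'], way['coords'][-1]['id']})
    return {node for node, n in counts.items() if n > 2}


# Three ways meeting at node 2, plus one closed loop.
a = {'id': 1, 'coords': [{'id': 1}, {'id': 2}]}
b = {'id': 2, 'coords': [{'id': 2}, {'id': 3}]}
c = {'id': 3, 'coords': [{'id': 2}, {'id': 4}]}
loop = {'id': 4, 'coords': [{'id': 5}, {'id': 6}, {'id': 5}]}
```

Fixtures like these could feed unit tests that assert whichever behavior is chosen at forks and loops stays stable across refactoring.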

Fonsan commented 8 years ago

I suggest you create a pull request for refactor-collector, which would enable us to give feedback on the changes.

adamfranco commented 8 years ago

👍 See #27 for the PR.

adamfranco commented 8 years ago

Note to self. To make the collector more testable, the OSMParser instantiation should be reworked as an injected dependency so that we can pass it a mocked parser.
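
A minimal sketch of what that injection could look like follows. The `Collector` and `FakeParser` classes, the `ways_callback`/`coords_callback` keyword arguments, and the tuple shapes passed to the callbacks are all assumptions made for illustration. They are not a confirmed description of the OSMParser API or of the collector's real structure.

```python
class Collector:
    """Hypothetical collector that receives its parser class as a dependency."""

    def __init__(self, parser_class):
        # The parser class is injected rather than imported and hard-coded.
        self.parser_class = parser_class
        self.ways = []
        self.coords = {}

    def load(self, filename):
        parser = self.parser_class(
            ways_callback=self.on_ways,
            coords_callback=self.on_coords,
        )
        parser.parse(filename)
        return self.inlined()

    def on_ways(self, ways):
        # Assumed shape: (id, tags, list of node ids).
        for osm_id, tags, refs in ways:
            self.ways.append({'id': osm_id, 'tags': tags, 'refs': refs})

    def on_coords(self, coords):
        # Assumed shape: (id, lon, lat).
        for osm_id, lon, lat in coords:
            self.coords[osm_id] = {'id': osm_id, 'lat': lat, 'lon': lon}

    def inlined(self):
        # Replace node references with the coordinates themselves.
        return [
            {'id': w['id'], 'tags': w['tags'],
             'coords': [self.coords[r] for r in w['refs']]}
            for w in self.ways
        ]


class FakeParser:
    """Test double that feeds canned data instead of reading a file."""

    def __init__(self, ways_callback, coords_callback):
        self.ways_callback = ways_callback
        self.coords_callback = coords_callback

    def parse(self, filename):
        # The filename is ignored; no file is opened.
        self.coords_callback([(1000, -72.294857, 43.785618),
                              (1020, -72.294855, 43.785618)])
        self.ways_callback([(19730334, {'name': 'Northwood Drive'},
                             [1000, 1020])])


ways = Collector(FakeParser).load('ignored.osm.pbf')
```

In production the real parser class would be passed in the same position, so tests never need to touch an actual extract.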

adamfranco commented 8 years ago

As of d2ef660 the collector is now testable and has been validated with the Vermont extract from geofabrik.de.