Simplify graph representation

hiposfer / kamal

A routing engine service using Open Street Maps and GTFS as data source

GNU Lesser General Public License v3.0

11 stars 1 forks source link

Currently our graph representation is very rudimentary. It takes the nodes and ways of an OSM file and simply creates edges between nodes according to the sequence in the way elements. The problem with this approach is that it is very inefficient since lots of nodes exists only to represent geographical coordinates.

A better approach would be one like the one used in OSRM: https://github.com/Project-OSRM/osrm-backend/wiki/Graph-representation

Although I dont completely agree with the fully expanded graph representation that they use. I think that simplifying the connections between nodes would greatly improve the current algorithm both in speed and memory. It has the added benefit that it would be easier to couple with the changes necessary for #28 since geo-coordinates dont need network metadata as opposed to nodes in the graph.

a simple data exploration shows that most nodes in our current saarland graph have either one or two edges

(def graph (time (alg/biggest-component (time (osm/osm->graph "resources/osm/saarland.osm")))))
(into (sorted-map) (frequencies (map (comp #(apply + %)
                                           #(map count %)
                                           (juxt rp/predecessors rp/successors)
                                           second)
                                     graph)))
;; absolute values of {number-of-edges frequency}
;; => {1 14394, 2 380223, 3 64308, 4 8600, 5 455, 6 84, 7 25, 8 10, 9 2}
(into (sorted-map)
      (map (fn [[k v]] [k (int (* 100 (/ v (count graph))))])
           (frequencies (map (comp #(apply + %)
                                   #(map count %)
                                   (juxt rp/predecessors rp/successors)
                                   second)
                             graph))))
;; percentage values of {number-of-edges percentage}
;; => {1 3, 2 81, 3 13, 4 1, 5 0, 6 0, 7 0, 8 0, 9 0}

A single edge is mainly due to a way in one direction
a double edge can simply be a connection between a geographical point with its neighbours or between an actual node in the graph to other geographical points

hiposfer / kamal

Simplify graph representation #35