Closed carocad closed 7 years ago
a simple data exploration shows that most nodes in our current saarland graph have either one or two edges
(def graph (time (alg/biggest-component (time (osm/osm->graph "resources/osm/saarland.osm")))))
(into (sorted-map) (frequencies (map (comp #(apply + %)
#(map count %)
(juxt rp/predecessors rp/successors)
second)
graph)))
;; absolute values of {number-of-edges frequency}
;; => {1 14394, 2 380223, 3 64308, 4 8600, 5 455, 6 84, 7 25, 8 10, 9 2}
(into (sorted-map)
(map (fn [[k v]] [k (int (* 100 (/ v (count graph))))])
(frequencies (map (comp #(apply + %)
#(map count %)
(juxt rp/predecessors rp/successors)
second)
graph))))
;; percentage values of {number-of-edges percentage}
;; => {1 3, 2 81, 3 13, 4 1, 5 0, 6 0, 7 0, 8 0, 9 0}
Currently our graph representation is very rudimentary. It takes the nodes and ways of an OSM file and simply creates edges between nodes according to the sequence in the
way
elements. The problem with this approach is that it is very inefficient since lots of nodes exists only to represent geographical coordinates.A better approach would be one like the one used in OSRM: https://github.com/Project-OSRM/osrm-backend/wiki/Graph-representation
Although I dont completely agree with the fully expanded graph representation that they use. I think that simplifying the connections between nodes would greatly improve the current algorithm both in speed and memory. It has the added benefit that it would be easier to couple with the changes necessary for #28 since geo-coordinates dont need network metadata as opposed to nodes in the graph.