adamfranco / curvature

Find roads that are the most curvy or twisty based on Open Street Map (OSM) data.
http://roadcurvature.com/

Generators vs Callbacks #33

Open Fonsan opened 8 years ago

Fonsan commented 8 years ago

I mentioned before in https://github.com/adamfranco/curvature/pull/13#issuecomment-216120687 and https://github.com/adamfranco/curvature/pull/13#issuecomment-216122647 that using Python's generators and the itertools module has some drawbacks because of their "pull" design. Essentially, when fanning out with itertools.tee there is no flow control: one stream will get ahead of the other, causing itertools.tee to keep the not-yet-yielded objects in memory. However, if one does not fan out, the code comes together rather nicely.
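A minimal sketch of the itertools.tee behaviour described above, written for Python 3. The `numbers` generator and the `produced` list are hypothetical helpers used only to observe how far the source has been pulled; they are not part of curvature.

```python
import itertools


def numbers(count, produced):
    """Hypothetical source: records each item it hands out, then yields it."""
    for i in range(count):
        produced.append(i)
        yield i


produced = []
branch_a, branch_b = itertools.tee(numbers(1000, produced))

# Drain the first branch completely before touching the second one.
drained = list(branch_a)

# The source is now exhausted although branch_b has not yielded anything,
# so tee has to hold every item in memory until branch_b catches up.
source_items_pulled = len(produced)
first_item_of_b = next(branch_b)
```

With a real OSM extract the lagging branch would force tee to buffer every collection, which is the memory problem in question.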

The alternative is using callbacks, as the OSMParser does: OSMParser.parse(ways_callback=self.ways_callback). Callbacks are more flexible because they allow messages to be passed back as return values, which could prove useful. Callbacks differ from generators in that they have a "push" design: you need to set up your full chain before calling parse. It is still possible to express the same chains using a callback approach.
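To illustrate the "push" design and the return-value channel, here is a small sketch in Python 3. `make_filter` and `make_collector` are hypothetical helpers invented for this example, not curvature or OSMParser APIs.

```python
def make_collector(results):
    """Terminal callback: stores the item and returns the running count."""
    def on_item(item):
        results.append(item)
        return len(results)
    return on_item


def make_filter(predicate, callback):
    """Forwards matching items downstream and passes the reply back upstream."""
    def on_item(item):
        if predicate(item):
            return callback(item)
        return None
    return on_item


# The whole chain has to exist before the first item is pushed into it.
results = []
chain = make_filter(lambda n: n % 2 == 0, make_collector(results))

# The caller drives the flow and sees whatever the chain returns.
replies = [chain(n) for n in range(6)]
```

Each call pushes one item all the way through the chain before the next one is produced, so nothing needs to be buffered between stages.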

I will experiment with constructing a nice callback chain.

Fonsan commented 8 years ago

I have added an example: https://github.com/Fonsan/curvature/blob/3c0f62d03d0e60d89ec7e0b2b290ec654ab9416b/processing_chains/callback_chain_example.py

The next step is to express a tree with multiple branches. Callbacks support this with a minimal memory footprint, whereas the current iterator chain would be unusable for it because of memory usage.

Fonsan commented 8 years ago

Here is a working example of a fan-out solution:


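import sys

import msgpack

# The post processors used below (FilterOutWaysWithTag, AddSegments, etc.)
# are assumed to be imported from the curvature code base.

class CallbackedProcessor(object):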
  def __init__(self, processor, callback):
    self.processor = processor
    self.callback = callback

  def input(self, arg):
    # Since processors do not have a process_item interface we are mocking it for now
    for output in self.processor.process([arg]):
      self.callback(output)

class IDOutputPrinter(object):
  def __init__(self, id):
    self.id = id

  def process(self, iterable):
    for collection in iterable:
      for way in collection:
        print self.id, way['id']
    return []

class FanOut(object):
  """Pushes every incoming item to each registered callback."""
  def __init__(self):
    super(FanOut, self).__init__()
    self.callbacks = []

  def callback(self, arg):
    for cb in self.callbacks:
      cb(arg)

# Wrap the processors from last to first so the returned callback feeds the first one.
def link_callbacks(chain, callback=lambda collection: collection):
  for processor in reversed(chain):
    callback = CallbackedProcessor(processor, callback).input
  return callback

normal_roads = FilterOutWaysWithTag('service', ['driveway', 'parking_aisle', 'drive-through', 'parking', 'bus', 'emergency_access'])
soft_roads = FilterOnlyWaysWithTag('surface', ['unpaved','dirt','gravel','fine_gravel','sand','grass','ground','pebblestone','mud','clay','dirt/sand','soil'])
non_soft_roads = FilterOutWaysWithTag('surface', ['unpaved','dirt','gravel','fine_gravel','sand','grass','ground','pebblestone','mud','clay','dirt/sand','soil'])

chain = [
  AddSegments(),
  AddSegmentLengthAndRadius(),
  AddSegmentCurvature(),
  FilterSegmentDeflections(),
  SplitCollectionsOnStraightSegments(2414),
  AddWayLength(),
  AddWayCurvature(),
  FilterCollectionsByCurvature(min=300),
]

f1 = FanOut()

start_callback = link_callbacks([normal_roads], f1.callback)

soft_chain = [
  soft_roads,
] + chain + [
  IDOutputPrinter('soft')
]

non_soft_chain = [
  non_soft_roads,
] + chain + [
  IDOutputPrinter('non_soft') 
]

f1.callbacks.append(link_callbacks(soft_chain))
f1.callbacks.append(link_callbacks(non_soft_chain))

for collection in msgpack.Unpacker(sys.stdin, use_list=True):
  start_callback(collection)

Fonsan commented 8 years ago

As I realised here https://github.com/adamfranco/curvature/pull/34#issuecomment-219293457, post processors such as Head will not work when moving to callbacks unless there is a proper abort-signal convention, for example explicitly returning some value like True.
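One possible shape for such a convention, as a Python 3 sketch only: a callback returns True to ask the producer to stop. The `Head` class and `push_all` function below are hypothetical stand-ins written for this example, not the existing curvature Head post processor.

```python
class Head(object):
    """Hypothetical callback-style head: forwards the first `limit` items."""

    def __init__(self, limit, callback):
        self.limit = limit
        self.seen = 0
        self.callback = callback

    def input(self, item):
        if self.seen >= self.limit:
            return True  # already full, ask the producer to stop
        self.seen += 1
        self.callback(item)
        # Signal an abort as soon as the limit has been reached.
        return self.seen >= self.limit


def push_all(source, callback):
    """Hypothetical producer: stops pushing once a callback returns True."""
    pushed = 0
    for item in source:
        pushed += 1
        if callback(item) is True:
            break
    return pushed


received = []
head = Head(3, received.append)
pushed = push_all(iter(range(1000)), head.input)
```

Without the returned signal the producer would push all 1000 items even though only the first three are kept. In a longer chain every intermediate callback would also have to pass the signal back upstream.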