Have a proper graph solver

CJ-Wright commented 5 years ago

Currently we have half of a graph solver via the linker function. The linker function takes in a namespace, hands it to a pipeline chunk function, appends the output to the current namespace, and repeats. This lets us solve the graph, attaching all the nodes together, provided the links are made in the correct order. If the order is incorrect we run into problems, because we try to use parts of the namespace which don't exist yet. This is sub-optimal: it requires that we always know the order, which might not be true (e.g. if we start composing pipelines dynamically we might not know the order ahead of time). The proper solution requires knowing, for each pipeline chunk, its required inputs and its outputs. The inputs are easy: they are the args to the function, so we can go and look them up. The outputs are less so. Since we use the locals() function to avoid writing the explicit outputs, we don't have any access to them until the function is run.
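The order dependence described above can be reproduced with a minimal sketch. `link_in_order`, `load`, and `normalize` are hypothetical names, and plain integers stand in for streams; this is not the rapidz linker itself, only an illustration of the loop it describes.

```python
import inspect


def link_in_order(chunks, **namespace):
    """Run chunks in the given order, growing the namespace as we go.

    Hypothetical stand-in for the linker described above.
    """
    ns = dict(namespace)
    for chunk in chunks:
        # Look up each argument of the chunk by name in the namespace.
        arg_names = list(inspect.signature(chunk).parameters)
        kwargs = {name: ns[name] for name in arg_names}  # KeyError if missing
        # The chunk returns its local namespace, which we merge back in.
        ns.update(chunk(**kwargs))
    return ns


def load(raw):
    data = raw + 1
    return locals()


def normalize(data):
    normalized = data * 2
    return locals()


# Correct order: `load` creates `data` before `normalize` needs it.
ns = link_in_order([load, normalize], raw=1)

# Wrong order: `normalize` asks for `data` before it exists.
try:
    link_in_order([normalize, load], raw=1)
    failed = False
except KeyError:
    failed = True
```

In the second call the lookup fails because nothing has produced `data` yet, which is the problem a solver that knows each chunk's inputs and outputs would avoid.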

A potential solution to this problem is to spoof the namespace. By passing in dummy Stream objects we can see exactly what the chunk returns, which gives us the output part of the chunk. With both the chunk inputs (via inspect.getfullargspec; the older inspect.getargspec was removed in Python 3.11) and the outputs, we can then work out the correct order in which the chunks need to be run to create the nodes.

e.g.

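import inspect

def inspect_chunk(chunk):
    # Inputs are the chunk's argument names.
    args = inspect.getfullargspec(chunk).args
    # Call the chunk with dummy Streams; it returns its locals() namespace.
    ns = chunk(**{k: Stream() for k in args})
    # Outputs are the returned Streams that were not inputs.
    streams = [k for k in ns if isinstance(ns[k], Stream) and k not in args]
    return args, streams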
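Once each chunk's inputs and outputs are known, ordering the chunks is a topological sort. The following is a minimal sketch under stated assumptions: `Stream` here is a small stand-in class rather than the rapidz one, `solve_order` and the three example chunks are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
import inspect
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Stream:
    """Minimal stand-in for a stream node (not the rapidz class)."""

    def map(self, func, *others):
        return Stream()


def inspect_chunk(chunk):
    """Return (input names, output names) by calling the chunk with dummies."""
    args = list(inspect.signature(chunk).parameters)
    ns = chunk(**{k: Stream() for k in args})
    outputs = [k for k in ns if isinstance(ns[k], Stream) and k not in args]
    return args, outputs


def solve_order(chunks):
    """Order chunks so each runs after the chunks that produce its inputs."""
    info = {chunk: inspect_chunk(chunk) for chunk in chunks}
    # Map every output name to the chunk that produces it.
    producer = {}
    for chunk, (_, outputs) in info.items():
        for name in outputs:
            producer[name] = chunk
    # A chunk depends on the producers of its inputs; inputs with no
    # producer are treated as externally supplied.
    graph = {
        chunk: {producer[name] for name in args if name in producer}
        for chunk, (args, _) in info.items()
    }
    return list(TopologicalSorter(graph).static_order())


def background(raw):
    corrected = raw.map(lambda x: x)
    return locals()


def integrate(corrected):
    pattern = corrected.map(lambda x: x)
    return locals()


def fit(pattern, corrected):
    result = pattern.map(lambda x, y: x, corrected)
    return locals()


# The chunks are given out of order; the solver recovers a valid one.
order = solve_order([fit, integrate, background])
```

Here `order` comes back as `background`, `integrate`, `fit`. If two chunks depended on each other, `static_order()` would raise `graphlib.CycleError`, which seems like a reasonable way to report an unsolvable graph.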

Note that one could potentially call these chunks metanodes.

Note that this means each input/output node needs to have a unique name, which might not be a bad thing. (This will produce problems for the tomo system, but we were going to have problems on that front anyway, since it really is a metanode factory-like thing.)
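The unique-name requirement can be checked before solving. This is a small sketch, assuming the outputs of each chunk have already been collected into a dict; `find_name_collisions` and the chunk names are hypothetical.

```python
from collections import defaultdict


def find_name_collisions(chunk_outputs):
    """Return {output name: [chunk names]} for names produced more than once.

    `chunk_outputs` maps a chunk name to the list of output names it creates.
    """
    producers = defaultdict(list)
    for chunk_name, outputs in chunk_outputs.items():
        for name in outputs:
            producers[name].append(chunk_name)
    return {name: names for name, names in producers.items() if len(names) > 1}


collisions = find_name_collisions(
    {
        "background": ["corrected"],
        "integrate": ["pattern"],
        "tomo_a": ["pattern", "slice"],
    }
)
```

Any name that shows up in `collisions` has more than one producer, so the solver could not tell which chunk a downstream input refers to.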

CJ-Wright commented 5 years ago

Note that this assumes that all the nasty bits of the pipeline management (e.g. order of operations) are inside each chunk, since we can't inspect those from the outside.

sbillinge commented 5 years ago

this seems very elegant. :+1:


CJ-Wright commented 5 years ago

I think this is a thing for the middle of next cycle. We might not need it to get the pipeline dispatch system off the ground to begin with (we'll be very strict about keeping the pipeline order). But once this is in, we will be able to be looser about the order and just let the computer figure it out.

xpdAcq / rapidz

Have a proper graph solver #18