tf-encrypted / moose

Secure distributed dataflow framework for encrypted machine learning and data processing
Apache License 2.0

PyMoose: better names for outputs? #1088

Closed mortendahl closed 2 years ago

mortendahl commented 2 years ago

A computation like the following currently results in an output op named op_3:

import numpy as np
import pymoose as pm


@pm.computation
def my_computation():
    alice = pm.host_placement("alice")
    bob = pm.host_placement("bob")
    carole = pm.host_placement("carole")

    with alice:
        x = pm.constant(np.array([1., 2.], dtype=np.float64))

    with bob:
        y = pm.constant(np.array([3., 4.], dtype=np.float64))

    with carole:
        z = pm.add(x, y)

    return z

Is there any way we could provide better names for outputs, or at least ensure ordering?

For instance:

@pm.computation(pm.Output("z"))
def my_computation():
    ...
    return z

or

@pm.computation
def my_computation():
    ...
    return pm.output(z, "z")

mortendahl commented 2 years ago

Since computations are rewritten and operations renamed, this might require adding an attribute on output operations that can be used to reconstruct, e.g., the order specified in PyMoose.
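
A minimal sketch of that idea in plain Python, not the actual Moose compiler: each output op carries a user-facing tag and a position index as attributes, so a pass that renames ops cannot lose them. `Op`, `rename_pass`, `ordered_outputs`, and the `tag`/`index` attribute keys are hypothetical names made up for illustration and are not part of PyMoose.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Op:
    name: str  # compiler-assigned, may change between passes
    kind: str  # e.g. "Constant", "Add", "Output"
    attrs: dict = field(default_factory=dict)  # hypothetical attribute bag


def rename_pass(ops):
    """Stand-in for a rewrite pass: every op gets a fresh name, attrs are kept."""
    return [replace(op, name=f"op_{i}") for i, op in enumerate(ops)]


def ordered_outputs(ops):
    """Recover the user-specified output tags in order, ignoring op names."""
    outputs = [op for op in ops if op.kind == "Output"]
    outputs.sort(key=lambda op: op.attrs["index"])
    return [op.attrs["tag"] for op in outputs]


# Outputs are deliberately listed out of order to show that only the
# attributes, not the list position or the op name, determine the result.
traced = [
    Op("x", "Constant"),
    Op("y", "Constant"),
    Op("output_w", "Output", {"tag": "w", "index": 1}),
    Op("z", "Add"),
    Op("output_z", "Output", {"tag": "z", "index": 0}),
]

rewritten = rename_pass(traced)
print([op.name for op in rewritten])
print(ordered_outputs(rewritten))
```

After renaming, every op is called `op_<n>`, yet the tags still come back in the order the user asked for, because that order lives in the attributes.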

jvmncs commented 2 years ago

I think the "right" way to preserve output ordering would be to preserve the op names from the ASTTracer. This could be achieved by implementing name scoping for ops, so that lowered ops keep a record in their names of the higher-level ops they came from. That would be consistent with what we saw in TensorFlow/TF Encrypted. For replicated addition, for example, the logical-level op add_0 might turn into a subgraph of ops with names like add_0/share_0/ring_add_0, add_0/replicated_add_0/ring_add_0, etc. (These examples are not meant to be correct; they are purely illustrative.)
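
To make the name-scoping idea concrete, here is a rough sketch in plain Python. It only manipulates strings; `lower` and `logical_origin` are hypothetical helpers, not Moose or PyMoose functions, and the sub-op names are as made up as the ones above.

```python
def lower(op_name, sub_ops):
    """Prefix each lowered op with the name of the logical op it came from."""
    return [f"{op_name}/{sub}" for sub in sub_ops]


def logical_origin(lowered_name):
    """Return the top-level logical op encoded in a scoped name."""
    return lowered_name.split("/", 1)[0]


# Illustrative only: these sub-op names do not reflect a real lowering.
lowered = lower("add_0", ["share_0/ring_add_0", "replicated_add_0/ring_add_0"])
print(lowered)
print({logical_origin(name) for name in lowered})
```

With names scoped like this, the logical op behind any lowered op can be read straight off the name, which is what would let the traced output names survive lowering.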

jvmncs commented 2 years ago

But I admit name scoping might be an ambitious goal for right now, so I think your suggestion for an explicit pm.output makes sense.