tf-encrypted / moose

Secure distributed dataflow framework for encrypted machine learning and data processing
Apache License 2.0

Discussion: Usefulness of untyped computation format #743

Open mortendahl opened 2 years ago

mortendahl commented 2 years ago

We currently have two computation formats: moose::Computation with its typed operators, and the textual format defined in moose::textual. The former could be seen as primarily our in-memory representation, although we currently use it for more than that. We also have the (legacy?) format used when ingesting a computation from the eDSL.

This issue is about discussing whether we need an extra format with untyped operators that is closer to ONNX and TensorFlow graphs (and along the lines of how MsgPack serializes computations from the eDSL). Concretely, instead of having a separate struct for each operator, we could have a single struct with e.g. a type field and a Map<String, String> of attributes. Likewise, it would use untyped placements. The motivation is that such a format would be easier to represent in other languages and systems (GraphQL, Go, Python, JavaScript, etc.), for purposes such as visualization and external analysis.
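To make the idea concrete, here is a minimal sketch of what such an untyped format could look like. Everything in it is hypothetical and not taken from the moose codebase: the names `UntypedOperation`, `UntypedComputation`, `example_computation` and `to_dot`, the `name`/`inputs` fields, and the example operator kinds and attributes. The type field is called `kind` because `type` is a reserved word in Rust.

```rust
use std::collections::BTreeMap;

/// Hypothetical untyped operator: one struct covers every operator kind.
#[derive(Debug, Clone, PartialEq)]
pub struct UntypedOperation {
    pub name: String,
    /// The "type" field, e.g. "Input" or "Add" (illustrative values only).
    pub kind: String,
    /// Names of the operations whose outputs feed this one.
    pub inputs: Vec<String>,
    /// Untyped placement: just a string, with no host/replicated distinction.
    pub placement: String,
    /// Free-form attributes, all encoded as strings.
    pub attributes: BTreeMap<String, String>,
}

/// Hypothetical container for a whole graph.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct UntypedComputation {
    pub operations: Vec<UntypedOperation>,
}

/// Small helper to build an operation without repeating boilerplate.
fn op(name: &str, kind: &str, inputs: &[&str], placement: &str) -> UntypedOperation {
    let mut attributes = BTreeMap::new();
    // Assumed attribute, used here only to show the string-to-string map.
    attributes.insert("dtype".to_string(), "float64".to_string());
    UntypedOperation {
        name: name.to_string(),
        kind: kind.to_string(),
        inputs: inputs.iter().map(|s| s.to_string()).collect(),
        placement: placement.to_string(),
        attributes,
    }
}

/// Builds a three-node example graph: z = x + y.
pub fn example_computation() -> UntypedComputation {
    UntypedComputation {
        operations: vec![
            op("x", "Input", &[], "alice"),
            op("y", "Input", &[], "bob"),
            op("z", "Add", &["x", "y"], "carole"),
        ],
    }
}

/// Renders the graph as Graphviz DOT, as one example of external tooling
/// that needs no knowledge of the individual operator types.
pub fn to_dot(comp: &UntypedComputation) -> String {
    let mut out = String::from("digraph computation {\n");
    for o in &comp.operations {
        out.push_str(&format!(
            "  \"{}\" [label=\"{}: {} @{}\"];\n",
            o.name, o.name, o.kind, o.placement
        ));
        for input in &o.inputs {
            out.push_str(&format!("  \"{}\" -> \"{}\";\n", input, o.name));
        }
    }
    out.push_str("}\n");
    out
}
```

Because every field is a string, list, or string map, the same shape would translate directly to a GraphQL type, a Go struct, or a Python dict, which is the portability argument made above.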

Whether we'd want to both export and import with this format is open for discussion. Also open is how this format fits together with the other formats, and whether it could be an intermediate step in importing e.g. ONNX. My first impression is that it makes sense to keep moose::Computation around as at least our in-memory format, since it allows rustc to perform consistency checks of supported operator instantiations.
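As a rough illustration of the export/import question, the sketch below keeps a typed enum as the in-memory form and converts to and from an untyped struct. The operator set (`Constant`, `Sum`), the attribute names, and the functions `export` and `import` are all made up for this example and are not the actual moose operators or API. The point is the asymmetry: exporting always succeeds, whereas importing has to validate at runtime what rustc would otherwise check at compile time.

```rust
use std::collections::BTreeMap;

/// Hypothetical typed operators; the real moose::Computation differs.
#[derive(Debug, Clone, PartialEq)]
pub enum TypedOperator {
    Constant { value: f64 },
    Sum { axis: Option<u32> },
}

/// Hypothetical untyped counterpart: a kind plus string attributes.
#[derive(Debug, Clone, PartialEq)]
pub struct UntypedOperator {
    pub kind: String,
    pub attributes: BTreeMap<String, String>,
}

/// Export is total: every typed operator has an untyped encoding.
pub fn export(op: &TypedOperator) -> UntypedOperator {
    let mut attributes = BTreeMap::new();
    let kind = match op {
        TypedOperator::Constant { value } => {
            attributes.insert("value".to_string(), value.to_string());
            "Constant"
        }
        TypedOperator::Sum { axis } => {
            // An absent axis is encoded by omitting the attribute.
            if let Some(axis) = axis {
                attributes.insert("axis".to_string(), axis.to_string());
            }
            "Sum"
        }
    };
    UntypedOperator { kind: kind.to_string(), attributes }
}

/// Import is partial: unknown kinds and malformed attributes are rejected.
pub fn import(op: &UntypedOperator) -> Result<TypedOperator, String> {
    match op.kind.as_str() {
        "Constant" => {
            let raw = op
                .attributes
                .get("value")
                .ok_or_else(|| "Constant is missing the `value` attribute".to_string())?;
            let value = raw
                .parse::<f64>()
                .map_err(|e| format!("invalid `value` {:?}: {}", raw, e))?;
            Ok(TypedOperator::Constant { value })
        }
        "Sum" => {
            let axis = match op.attributes.get("axis") {
                Some(raw) => Some(
                    raw.parse::<u32>()
                        .map_err(|e| format!("invalid `axis` {:?}: {}", raw, e))?,
                ),
                None => None,
            };
            Ok(TypedOperator::Sum { axis })
        }
        other => Err(format!("unknown operator kind: {}", other)),
    }
}
```

If we only ever export, the `import` half and its error handling disappear entirely, which may be an argument for starting with export-only.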

mortendahl commented 2 years ago

Any thoughts on this @voronaam?