joschu / cgt

Computation Graph Toolkit

Ability to run execution graph from an external application #31

Open dm0 opened 9 years ago

dm0 commented 9 years ago

Please add the ability to run a CGT graph compiled with the native backend from a C or C++ application.

Intended pipeline:

  1. Create a computation graph with CGT.
  2. Build it using the native backend.
  3. Link the resulting object files with an external application.
  4. Execute the graph.

Thank you.

dm0 commented 9 years ago

I guess this is already possible somehow, since you do communicate with the native binary from Python. Could you point me to the code that handles this communication? If it is possible to save the CGT-generated C/C++ sources, that would probably help me understand how to run a graph from an external application.

Thanks.

hojonathanho commented 9 years ago

We're working on this, stay tuned.

hojonathanho commented 9 years ago

@joschu What do you think about cross-language execution graph serialization, with JSON or protobuf? Right now execution graphs are given to the C++ interpreter via Cython reading Python objects, but for this case we want a standalone C++ library to be able to read a serialized execution graph. This will introduce a dependency into the C++ side (for reading the serialized representation with a JSON library, for example), and might also introduce it to the Python side (if we use protobuf, which is not installed by default).
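To make the idea concrete, here is a minimal sketch of a JSON round trip for a toy execution graph. The instruction kinds (`load_arg`, `call`), the field names, and the numbered memory locations are all hypothetical; this is not CGT's actual instruction set or any agreed-upon format, only an illustration of the shape such a serialized graph could take.

```python
import json

# Hypothetical, simplified execution graph: a flat list of instructions that
# read from and write to numbered memory locations. NOT CGT's real format.
exec_graph = {
    "n_locs": 3,
    "instructions": [
        {"kind": "load_arg", "arg_index": 0, "write_loc": 0},
        {"kind": "load_arg", "arg_index": 1, "write_loc": 1},
        {"kind": "call", "op": "add", "read_locs": [0, 1], "write_loc": 2},
    ],
    "output_locs": [2],
}


def dump_graph(graph):
    """Serialize the toy graph to a JSON string."""
    return json.dumps(graph, sort_keys=True)


def load_graph(text):
    """Parse a JSON string back into the toy graph structure."""
    return json.loads(text)


def run_graph(graph, args):
    """Tiny reference interpreter for the toy format above."""
    ops = {"add": lambda a, b: a + b}  # hypothetical op table
    locs = [None] * graph["n_locs"]
    for instr in graph["instructions"]:
        if instr["kind"] == "load_arg":
            locs[instr["write_loc"]] = args[instr["arg_index"]]
        elif instr["kind"] == "call":
            inputs = [locs[i] for i in instr["read_locs"]]
            locs[instr["write_loc"]] = ops[instr["op"]](*inputs)
        else:
            raise ValueError("unknown instruction kind: %r" % instr["kind"])
    return [locs[i] for i in graph["output_locs"]]
```

A standalone reader in another language would only need a JSON parser and an interpreter loop like `run_graph` to consume such a file.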

dm0 commented 9 years ago

I'm sorry to bother you... Here are my two cents:

MessagePack (http://msgpack.org/) also looks quite promising for binary serialization, though I don't have any experience with it.

joschu commented 9 years ago

Jonathan, I like the idea of having cross-language serialization using some standard format.

One constraint is that we'll need to serialize numerical array data. This would be kludgy in JSON. Protobuf would be an OK choice, though it might be a bit too heavyweight.

If I remember correctly, @pcmoritz also recommended msgpack. At first glance, it looks like a good choice to me.
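A small sketch of why array data is awkward in JSON, using only the Python standard library. The `dtype`/`shape`/`data` field names are assumptions made for illustration. JSON has no byte-string type, so the raw buffer has to be wrapped in something like base64, whereas MessagePack defines a native binary type.

```python
import base64
import json
import struct

# Three float32 values as 12 bytes of little-endian raw data.
values = [0.5, -1.25, 3.0]
raw = struct.pack("<3f", *values)

# The json module refuses raw bytes outright.
try:
    json.dumps({"data": raw})
    json_accepts_bytes = True
except TypeError:
    json_accepts_bytes = False


def array_to_json(raw_bytes, dtype, shape):
    """Wrap raw array bytes in base64 so they fit in a JSON string."""
    return json.dumps({
        "dtype": dtype,            # hypothetical field names
        "shape": list(shape),
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    })


def array_from_json(text):
    """Inverse of array_to_json: returns (raw_bytes, dtype, shape)."""
    obj = json.loads(text)
    return base64.b64decode(obj["data"]), obj["dtype"], tuple(obj["shape"])


encoded = array_to_json(raw, "f4", (3,))
```

Base64 grows the payload by about a third, and every reader has to implement the same wrapping convention, which is the kludge in question.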

delip commented 9 years ago

HDF5 is a good option and is already used in many Python projects.

hojonathanho commented 9 years ago

I was just referring to serialization of instructions/execution graph. This shouldn't involve serializing binary data, right?

joschu commented 9 years ago

The ReturnByRef and ReturnByVal instructions have to store the closure data for the Ops they're associated with. And a Constant Op has a value associated with it, so it'll be necessary to serialize the data.

hojonathanho commented 9 years ago

I have started implementing serialization for execution graphs here: https://github.com/hojonathanho/cgt/tree/serialization. It's incomplete for now, but cgtArrays can be serialized.

I chose a C++-only serialization framework because I think that with the current way things are set up, it's best for serialization/deserialization to happen in C++/Cython only -- we should never be constructing execution graphs in Python anyway. It's also a lot faster to not have to worry about cross-language compatibility, since we can effectively directly serialize the bits in structs. Let me know if you think this is the right way to go.
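As an illustration of what "directly serializing the bits" can mean for an array, here is a byte-layout sketch written in Python with the `struct` module. The header layout (4-byte dtype code, `uint32` rank, then one `uint64` per dimension) is a hypothetical choice made for this example and is not the format used in the branch above. A C++ reader could consume the same bytes as long as both sides agree on endianness and field widths.

```python
import struct

# Hypothetical fixed header: 4-byte dtype code + uint32 number of dimensions,
# little-endian, no padding. Followed by ndim uint64 dims, then the raw data.
HEADER = struct.Struct("<4sI")


def pack_array(dtype_code, shape, raw):
    """Lay out header, dimensions, and raw data contiguously."""
    header = HEADER.pack(dtype_code, len(shape))
    dims = struct.pack("<%dQ" % len(shape), *shape)
    return header + dims + raw


def unpack_array(buf):
    """Read back (dtype_code, shape, raw) from the layout above."""
    dtype_code, ndim = HEADER.unpack_from(buf, 0)
    offset = HEADER.size
    shape = struct.unpack_from("<%dQ" % ndim, buf, offset)
    offset += 8 * ndim
    # "4s" pads short codes with NUL bytes; strip them on the way out.
    return dtype_code.rstrip(b"\x00"), shape, bytes(buf[offset:])
```

The trade-off is that the format is tied to one fixed layout: nothing in the bytes describes the fields, so any change to the header needs a versioning scheme.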

joschu commented 9 years ago

Agreed that it makes sense to do the serialization in C++. But the closure data for the Ops is created in Python. How do you plan to serialize each piece of closure data?