IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art Reinforcement Learning algorithms
https://intellabs.github.io/coach/
Apache License 2.0
2.32k stars 461 forks

Using coach with graph_nets #331

Closed timokau closed 5 years ago

timokau commented 5 years ago

I'm trying to use coach to train a DQN using a Graph Neural Network as its Q-network. Such a network takes a graph (with node and edge attributes) as its input and returns a new graph (same nodes and edges, but different attributes) as its output.

In my case each edge is supposed to represent an action, with the Q-value being encoded as an edge attribute.

Reading through the documentation and source, I've discovered that coach makes several assumptions, notably fixed-size observation/action spaces and fixed-size input embedders, that seem incompatible with this idea.

So what I want is: a graph comes in as an observation, a graph comes out, and the edge with the biggest Q-value is chosen as the next action (in greedy mode).
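To make the desired behavior concrete, here is a minimal sketch of greedy action selection when each edge is an action and its Q-value is an edge attribute. The names (`edges`, `edge_q_values`, `greedy_edge_action`) are illustrative only, not part of coach's or graph_nets' API:

```python
import numpy as np

def greedy_edge_action(edges, edge_q_values):
    """Pick the edge (action) with the highest predicted Q-value.

    edges:         list of (u, v) node-index pairs, one per possible action
    edge_q_values: 1-D array of Q-values, aligned with `edges`
    """
    best = int(np.argmax(edge_q_values))
    return edges[best], float(edge_q_values[best])

# Example: a tiny 3-edge graph
edges = [(0, 1), (1, 2), (0, 2)]
q = np.array([0.3, 1.7, -0.5])
action, value = greedy_edge_action(edges, q)
# action is (1, 2), the edge with the largest Q-value
```

The point of friction is that the number of edges (and hence actions) varies per graph, which is exactly what a fixed-size action space does not allow.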

Before I invest more time in this: Do you think it is feasible to use coach with graph_nets in its current architecture? Can I somehow work around the fixed size spaces and embedders?
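One generic workaround for the fixed-size requirement (a sketch of a standard masking trick, not something proposed in this thread): pad the variable number of edges up to a fixed maximum and fill the padding slots with `-inf` so greedy selection can never pick them. `MAX_EDGES` and `pad_and_mask` are hypothetical names:

```python
import numpy as np

MAX_EDGES = 8  # fixed action-space size assumed by the framework

def pad_and_mask(edge_q_values, max_edges=MAX_EDGES):
    """Return a fixed-size Q-vector; padded (invalid) slots get -inf."""
    q = np.full(max_edges, -np.inf)
    q[: len(edge_q_values)] = edge_q_values
    return q

# A 3-edge graph padded to the fixed 8-action space:
q = pad_and_mask([0.3, 1.7, -0.5])
# np.argmax(q) still lands on a real edge, never on padding
```

This only addresses the action-space side; the fixed-size input embedders would still need their own workaround.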

timokau commented 5 years ago

For the record: it is at least somewhat possible, although not without significant hacks. I'm not sure I would recommend it to my 7-days-ago self.

timokau commented 5 years ago

If someone with the same issue reads this in the future: I had much more success implementing this on top of baselines instead of coach. Here's how I did it: https://github.com/timokau/wsn-embedding-rl/blob/8afa770dfe0c419e0b79dc796b4e4e3fcfa6548d/dqn_agent.py#L32 (with this baselines PR applied).