wmayner / pyphi

A toolbox for integrated information theory.
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006343

Deep Networks #17

Closed: LuCeHe closed this issue 6 years ago

LuCeHe commented 6 years ago

Hi,

Do you have an easy application of this measure to a deep NN during training, in TensorFlow or Keras for example? I'd like to see how phi evolves over the course of training: whether it steadily increases, and what absolute value it reaches.

I'd love to play with it, and I bet it would have a huge impact on the spread of interest in IIT.

Thanks for the good work, Luca

wmayner commented 6 years ago

Hi @actla,

We don't have a ready-made example of applying 𝚽 in deep learning. In principle it shouldn't be too difficult to use PyPhi with something like TensorFlow (with the caveat that I'm not very familiar with the TensorFlow API). Since in PyPhi's implementation 𝚽 is a function of a transition probability matrix (TPM) and a binary network state, you'd need to get the transition probabilities from the DNN weights and then discretize each node to 'ON' / 'OFF' states.
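Roughly, the conversion could look like the sketch below. This assumes logistic units, so each unit's activation can be read as its probability of being ON; the names `W` (recurrent weight matrix) and `b` (bias vector) are hypothetical, not part of any particular framework's API.

```python
import numpy as np


def tpm_from_weights(W, b):
    """Sketch: build a state-by-node TPM for n logistic units with
    recurrent weight matrix W (n x n) and bias vector b (n,).

    Row i gives each unit's probability of being ON at t+1, given the
    binary state encoded by i at time t. States are enumerated
    little-endian (node 0 varies fastest), matching PyPhi's convention.
    """
    n = len(b)
    tpm = np.zeros((2 ** n, n))
    for i in range(2 ** n):
        state = np.array([(i >> j) & 1 for j in range(n)])
        tpm[i] = 1 / (1 + np.exp(-(W @ state + b)))  # sigmoid -> P(ON)
    return tpm
```

Getting the binary network state itself is a separate step: you'd have to discretize the units' current activations (more on that below).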

That said, there are two major issues to consider:

  1. As I understand it, many neural networks used in machine learning have feed-forward architectures. The 𝚽 value for any feed-forward system is necessarily zero, so analyzing such networks is trivial and uninteresting. If you're working with recurrent networks, analyzing 𝚽 could be more interesting.
  2. Most useful/interesting networks have more than 10 nodes. PyPhi implements an exact algorithm for 𝚽 which is superexponential in the number of nodes, so analyzing systems of more than ~10–12 nodes is infeasible. There are some approximations available as settings in the config module that might be worth trying, but even just representing a network quickly becomes impractical because the size of the TPM grows exponentially.
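To put the second point in perspective, here is a quick back-of-the-envelope calculation of the memory needed just to store a dense state-by-node TPM:

```python
# 2**n rows x n columns of 8-byte floats for n binary nodes.
for n in (10, 20, 30):
    mib = 2 ** n * n * 8 / 2 ** 20
    print(f"{n:2d} nodes: {mib:12,.1f} MiB")
# ~0.1 MiB at 10 nodes, ~160 MiB at 20, ~240 GiB at 30.
```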

So this is why there are no straightforward applications of PyPhi to deep learning readily available; but you might be able to do something with very small recurrent networks. 𝚽 has previously been analyzed in small systems that evolve via a genetic algorithm (see e.g. this paper).
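For a small recurrent network, the PyPhi calculation itself is short. A minimal sketch, using a made-up probabilistic TPM for three nodes (like one the conversion sketched above might produce):

```python
import numpy as np
import pyphi

# Made-up state-by-node TPM for a 3-node recurrent network: row i gives
# each node's probability of being ON after past state i (little-endian:
# node 0 varies fastest, as PyPhi expects).
tpm = np.array([
    [0.1, 0.2, 0.1],
    [0.9, 0.2, 0.8],
    [0.3, 0.9, 0.1],
    [0.9, 0.9, 0.8],
    [0.1, 0.7, 0.9],
    [0.9, 0.7, 0.9],
    [0.3, 0.9, 0.9],
    [0.9, 0.9, 0.9],
])

network = pyphi.Network(tpm)
state = (1, 0, 0)  # current binary state of the three nodes
subsystem = pyphi.Subsystem(network, state, range(network.size))
print(pyphi.compute.phi(subsystem))  # system-level big phi
```

Recomputing this after each training step would give the 𝚽-over-training curve asked about above, for networks small enough to be tractable.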

In the literature there are much more practical approximations of 𝚽 that PyPhi does not implement; these might be more useful to you. Here are some relevant papers:

LuCeHe commented 6 years ago

That's very interesting, thanks a lot for the references.

About the "discretize each node to 'ON' / 'OFF' states" part: is there no formal way yet to handle continuous activations, or discrete activations other than binary? (A sigmoid, for example, could be simplified to 'OFF', 'transition', 'ON'.)

wmayner commented 6 years ago

Yes, that's correct—PyPhi can only handle binary states at this point. The formulation of IIT in the literature is agnostic about the number of discrete node states, so more than two could be used (e.g. here). By contrast, to my knowledge IIT has not yet been formalized in the continuous setting.
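For what it's worth, a common ad hoc workaround is simply thresholding the continuous activations. A minimal sketch, with the caveat that the threshold is an arbitrary modeling choice rather than anything IIT prescribes:

```python
def binarize(activations, threshold=0.5):
    """Map continuous activations (e.g. sigmoid outputs in [0, 1]) to
    the binary ON/OFF states PyPhi expects. The threshold is arbitrary."""
    return tuple(int(a > threshold) for a in activations)

# e.g. binarize([0.9, 0.2, 0.7]) -> (1, 0, 1), usable as a PyPhi state
```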