wmayner / pyphi

A toolbox for integrated information theory.
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006343

how to use `subsystem.evaluate_partition` #11

Closed AjayTalati closed 7 years ago

AjayTalati commented 7 years ago

Hi, I wonder if you can help me? I'm a bit new to this field and the API, so this might be a silly question.

I've installed the latest development version, and I'm trying to calculate small phi over different partitions. I run this code,

import pyphi

from pyphi.subsystem import mip_bipartitions

network = pyphi.examples.fig6()
state = (1, 0, 0)
subsystem = pyphi.Subsystem(network, state, range(network.size))
A,B,C = subsystem.node_indices

mechanism = subsystem.node_indices 
purview   = subsystem.node_indices

all_bipartitions = mip_bipartitions( mechanism, purview )

partition = all_bipartitions[1]
direction = "|past|" 

Can you help me use the following methods, please? I don't understand why I'm getting these errors.

>>> subsystem.evaluate_partition( direction, (A,), (A, B, C) , partition )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ajay/anaconda3/lib/python3.6/site-packages/pyphi/subsystem.py", line 532, in evaluate_partition
    partitioned_repertoire = self.partitioned_repertoire(direction, partition)
  File "/home/ajay/anaconda3/lib/python3.6/site-packages/pyphi/subsystem.py", line 434, in partitioned_repertoire
    return part1rep * part2rep
TypeError: unsupported operand type(s) for *: 'NoneType' and 'NoneType'
>>> 
>>> subsystem.find_mip( direction, (A,) , (A,B,C) )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ajay/anaconda3/lib/python3.6/site-packages/pyphi/subsystem.py", line 587, in find_mip
    unpartitioned_repertoire=unpartitioned_repertoire)
  File "/home/ajay/anaconda3/lib/python3.6/site-packages/pyphi/subsystem.py", line 532, in evaluate_partition
    partitioned_repertoire = self.partitioned_repertoire(direction, partition)
  File "/home/ajay/anaconda3/lib/python3.6/site-packages/pyphi/subsystem.py", line 434, in partitioned_repertoire
    return part1rep * part2rep
TypeError: unsupported operand type(s) for *: 'NoneType' and 'NoneType'

I guess my install is OK, as the functions seem to be there:

>>> subsystem.evaluate_partition
<bound method Subsystem.evaluate_partition of Subsystem((n0, n1, n2))>
>>> subsystem.find_mip
<bound method Subsystem.find_mip of Subsystem((n0, n1, n2))>

Thank you very much :)

AjayTalati commented 7 years ago

Also, can I ask if there's been a change in the way Big_phi is calculated? For the example system of fig12 from the 2014 paper, I get

>>> big_mip = pyphi.compute.big_mip(subsystem)
>>> big_mip.phi
1.999997
>>> big_mip.cut
Cut (0, 1) --//--> (2,)

Whereas in the master version, the master docs, and the 2014 paper it's

>>> big_mip.phi
1.916665 

Is this just an update in the calculations/theory, or is my install wrong?

AjayTalati commented 7 years ago

Now I'm really confused. How do I define a partition vs. a bipartition? If I use the following as a partition,

partition = ((0,), (1, 2))

I get a different error message:

>>> partition
((0,), (1, 2))
>>> subsystem.evaluate_partition( direction, (A,), (A, B, C) , partition )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/site-packages/pyphi/subsystem.py", line 532, in evaluate_partition
    partitioned_repertoire = self.partitioned_repertoire(direction, partition)
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/site-packages/pyphi/subsystem.py", line 429, in partitioned_repertoire
    part1rep = self._repertoire(direction, partition[0].mechanism,
AttributeError: 'tuple' object has no attribute 'mechanism'

Any chance you could write up/document how to use this function, please? Thank you very much.

wmayner commented 7 years ago

Hi @AjayTalati,

The documentation is unfortunately slightly confusing and in some cases out of date. For your first problem, the issue is that we use Sphinx's substitution feature, which uses the separator character | to define a substitution like so: |substitute_this|. This helps to generate the online version of the documentation, but is clearly problematic if you're using help or IPython's ?. I'll fix this soon.

What you want is the following:

from pyphi.constants import DIRECTIONS, PAST, FUTURE
direction = DIRECTIONS[PAST]

This is reflected in the online documentation here.

As for your second problem, there are two issues. Firstly, the docs are simply out of date—the partition argument should be a Bipartition, not a tuple of ints. The second issue is conceptual: when evaluating small-phi, the (mechanism / purview) pair is partitioned into two (mechanism / purview) pairs. So rather than giving a partition of the nodes in the subsystem, you need to provide a partition of the mechanism and a partition of the purview. In PyPhi this is represented by a Bipartition, which is a pair of Parts, and each Part is in turn a (mechanism / purview) pair.

So, for example, you could write:

from pyphi.models import Bipartition, Part
partition = Bipartition(Part((0,), ()), Part((), (0, 1, 2)))
#                            ^     ^         ^   ^
#                            |     |         |   |
#                            |     |         |   purview of second part
#                            |     |         mechanism of second part
#                            |     purview of first part
#                            mechanism of first part

Then printing partition will display

0     []
-- X -----
[]   0,1,2
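
Putting the two pieces together, your original call would then look something like this (just a sketch; per the dev source, evaluate_partition returns the phi value along with the partitioned repertoire):

from pyphi.constants import DIRECTIONS, PAST
from pyphi.models import Bipartition, Part

# Evaluate the given partition of mechanism (A,) over purview (A, B, C).
direction = DIRECTIONS[PAST]
partition = Bipartition(Part((0,), ()), Part((), (0, 1, 2)))
phi, partitioned_repertoire = subsystem.evaluate_partition(
    direction, (A,), (A, B, C), partition)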

Sorry about the confusion, and please let me know if any of this isn't clear. I'll update the documentation to fix these problems soon.

AjayTalati commented 7 years ago

Hi @wmayner

Wow, thank you so much for the quick and detailed feedback :) Yes, that makes a lot of sense - like seeing daylight after the darkness of night :+1:

No worries about the docs and code; for a very specialised and technical subject like this they're awesome :+1: they've helped immeasurably!

What I'm trying to do is experiment with a "guessing" algorithm, which takes as input tpm, state, cm, mechanism, purview, direction, and maybe other variables such as time scale. Its output is, say, the 10 minimal bipartitions which give roughly the same small phi as the optimal bipartition (as you would get by ranking EMDs in an exhaustive search over all possible bipartitions using subsystem.find_mip). The EMD could then be calculated exactly and averaged over these 10 best approximations to the optimal bipartition.
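
Roughly, I'm imagining something like this sketch (approximate_mips, n_samples, and k are just placeholder names I made up, not anything in PyPhi):

import random
from pyphi.subsystem import mip_bipartitions

def approximate_mips(subsystem, direction, mechanism, purview,
                     n_samples=1000, k=10):
    # Sample candidate bipartitions instead of looping over all of them.
    candidates = mip_bipartitions(mechanism, purview)
    sample = random.sample(candidates, min(n_samples, len(candidates)))
    # Score each sampled partition by its small phi (the EMD).
    scored = [(subsystem.evaluate_partition(direction, mechanism,
                                            purview, partition)[0],
               partition)
              for partition in sample]
    # The k lowest-phi partitions approximate the true MIP.
    return sorted(scored, key=lambda pair: pair[0])[:k]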

Out of curiosity, to understand IIT better, I coded up some search algorithms to find the maximal Big_Phi over possible (tpm, cm, state) tuples - for only 4 nodes it took 14 hours to converge!!! It found a max Big_Phi of 7.57 with 15 concepts (using the master, not dev, code). I don't know if that's any good, but it was the best it could do (I can send you the JSON if you're curious).

Basically, I'm very interested now in looking for ways to speed up the calculation/approximation of subsystem.find_mip for more realistically sized systems. Some modern machine learning algorithms are very fast, so they could complement exactly calculating a linear program/EMD over all possible bipartitions to get small_phi. If this sounds like it might be helpful for your work, I'm more than happy to share it!

Thanks once again for your awesome work in creating this impressive system; it's the most interesting code I've worked on for a very long time.

With best wishes,

Ajay

wmayner commented 7 years ago

Hi @AjayTalati ,

Glad I could help, and glad you're interested in experimenting with PyPhi! By the way, the documentation has been updated, and temporal directions are now specified with pyphi.constants.Direction, which is an Enum (introduced in Python 3.4). So now you would write:

from pyphi.constants import Direction
direction = Direction.PAST
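
So, for example, the evaluate_partition call from my earlier comment would now be written along these lines (same sketch as before, with only the direction constant changed):

from pyphi.constants import Direction
from pyphi.models import Bipartition, Part

direction = Direction.PAST
partition = Bipartition(Part((0,), ()), Part((), (0, 1, 2)))
phi, partitioned_repertoire = subsystem.evaluate_partition(
    direction, (A,), (A, B, C), partition)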

I'm not sure I follow you with regard to the guessing algorithm. My intuition is that in general, the average small-phi value for randomly-chosen cuts will not approximate the small-phi value for the optimal cut. But heuristics and approximations are always welcome, since the exact algorithm is so intractable, so thank you for working on it. Some sort of random-sampling approach could indeed be fruitful.

I would definitely be interested in taking a look at the 4-node network you found (I'm assuming the search was over deterministic TPMs, i.e. those with 1 or 0 entries?)

Applying machine learning algorithms to the calculation sounds very interesting as well! If you'd like to elaborate on that and share the network you found, you can find my contact information here.

AjayTalati commented 7 years ago

Hi @wmayner,

Great to hear from you. Sure thing, always happy to share - I couldn't attach the net to this GitHub page, so here's a Dropbox link to the JSON:

https://www.dropbox.com/s/yj1n6r0wwbxns8d/network.json?dl=0

If you download it (I don't think you need a Dropbox account to do that, just click Download in the top right corner) and import it into your visualisation page, you should be able to confirm it gets big_phi = 7.56975 and 15 concepts, the maximum possible for a 4-node net.

Yep, I can confirm it's deterministic. I'm reading the recent paper, Marshall et al. (2016); it's given me a lot of guidance and ideas :)

Note that I did the search for this net using the master/stable version of PyPhi. The alterations to the theory and code in the new dev version seem to make the space of all possible tpms much harder to search. I'm rerunning the search, and it's been running for 35 hours so far!!! It still hasn't converged - I'm getting really bored!

[Screenshot: max Big_Phi search over 4 nodes using dev PyPhi]

About the machine learning: I'm very happy to collaborate and openly share/discuss, no need to do stuff by private email. But on the other hand, the issues page of a GitHub project doesn't seem to be the right place to do it either? Perhaps you could set up a public Google group for PyPhi developers when you have some spare time - I think it's pretty straightforward to do? It would be great to make things more open, lower the entry barrier, and get more people involved from diverse backgrounds/skillsets.

Anyhow, the ML is VERY EXPERIMENTAL (i.e. I try a lot of stuff that usually doesn't work), and not so easy to explain; at least, I'm not so good at explaining it :( So rather than possibly confusing you with stuff that I haven't confirmed works, maybe it's best I get back to you when I've got some concrete results? The entry barrier to ML can be quite uncomfortable, so it's probably best to wait until someone (i.e. me) has proved they can get it to work for your particular use case - collaboration between an ML researcher and a domain specialist is usually how these things evolve.

Basically, at the moment there are two ways I see it being useful to you:

i) It's pretty good for optimising stuff in high dimensions, where a genetic algo doesn't do so well - so it's promising as a complementary alternative, or another way to do animats-type experiments. A sort of proof of principle/sanity check is the 4-node network posted above.

ii) It's good at cutting down a search space over partitions, especially when things get big. Basically, it could look at a (tpm, mechanism, purview) tuple and then suggest a small space of bipartitions over which the EMD should be calculated, which is what I tried to say above. For example, here are some back-of-the-envelope numbers: if you have n = 10 elements/nodes, and

from pyphi.subsystem import mip_bipartitions

# mip_bipartitions only needs the index tuples, so for n = 10:
mechanism = tuple(range(10))
purview = tuple(range(10))
all_bipartitions = mip_bipartitions(mechanism, purview)
num_bipartitions = len(all_bipartitions)

then num_bipartitions = 524287 (that's 2^19 - 1), pretty funky!!! If I understand correctly (and that's a big IF), doesn't subsystem.find_mip involve a loop over all bipartitions? From the dev code base, pyphi/subsystem.py#L581:

# Loop over possible MIP bipartitions
for partition in mip_bipartitions(mechanism, purview):
    # Find the distance between the unpartitioned and partitioned
    # repertoire.
    phi, partitioned_repertoire = self.evaluate_partition(
        direction, mechanism, purview, partition,
        unpartitioned_repertoire=unpartitioned_repertoire)

    # Return immediately if mechanism is reducible.
    if phi == 0:
        return _mip(0.0, partition, partitioned_repertoire)

    # Update MIP if it's more minimal.
    if phi < phi_min:
        phi_min = phi
        mip = _mip(phi, partition, partitioned_repertoire)

Now, since evaluate_partition calculates an EMD, which is basically a linear program, and when n gets reasonably big even with pyemd it's slooooow - that seems totally nuts!!! Also, for n = 10 there are 2^10 - 1 = 1023 possible mechanisms, and likewise for purviews, and that's just for one tpm. And recall I want to search to maximize big_phi over the space of all possible tpms, which for n = 10 are Markov matrices of size 1024x1024. Let's just say that, to my naive mind, there must be some symmetries we can learn that will cut down the number of possible bipartitions, if we learn over the space of all possible tpms. Quite a mouthful, but that's what I want! So this is where I think I can contribute?

Put another way, an ML algorithm would basically be a FAST!!! function which you could drop into your code around pyphi/subsystem.py#L581, with a signature something like this:

def cut_down_bipartition_space(tpm, all_bipartitions):
    # ... do ML stuff here ...
    # return, say, the 100 most promising bipartitions
    return promising_bipartitions

small_space_bipartitions = cut_down_bipartition_space(
    tpm, mip_bipartitions(mechanism, purview))

# Loop over smaller space of possible MIP bipartitions
for partition in small_space_bipartitions:
    # Find the distance between the unpartitioned and partitioned
    # repertoire.
    phi, partitioned_repertoire = self.evaluate_partition(
        direction, mechanism, purview, partition,
        unpartitioned_repertoire=unpartitioned_repertoire)

Something like this, anyway? Since this is a pretty fundamental calculation, even if the IIT theory changes (and I'm sure it will), or you want to experiment with alternative methods of doing the calculations, for big_mip or whatever, the speed of subsystem.find_mip for reasonably realistic-sized nets, say n > 10, seems like it's always going to be the bottleneck?

Well, I hope this hasn't been too confusing? In any event, I'm always happy to hear from you; my email's on my GitHub page, ajaytalati.

Best,

Ajay

AjayTalati commented 7 years ago

Well, it converged in the end, after 42 hours!!!! The best it got was big_phi = 7.632.

[Screenshot: convergence of the 4-node search using dev PyPhi]

To be honest, I haven't really spent enough time studying the latest theory and the subsequent changes to the code from the stable to the latest dev version, but it seems to give roughly the same maximum value for big_phi (at least, the best I can find).

JSON just in case you're curious,

https://www.dropbox.com/s/705c2ne4gpj13qo/network_4_nodes_dev_pyphi.json?dl=0