jcrozum / pystablemotifs

Python library for attractor identification and control in Boolean networks
MIT License
28 stars 7 forks source link

maxOutput in PyBoolNet and a potential associated error #30

Closed jgtz closed 3 years ago

jgtz commented 4 years ago

When calling the ASP solver from PyBoolNet, there seems to be a threshold for the max number of solutions that it will output.

https://github.com/hklarner/PyBoolNet/blob/master/PyBoolNet/AspSolver.py

We probably want to remove that (or have some warning telling us that this is where it might get stuck). It also seems to be behind an error.

Code

N=1000
K=3
p=sm.RandomBooleanNetworks.get_criticality_p_Kauffman(K)[0] 
N_ensemble=10
seed=1000
rbn_ensemble=sm.RandomBooleanNetworks.Random_Boolean_Network_Ensemble_Kauffman(N,K,p,N_ensemble,seed=seed,write_Boolean_network=True)

start=default_timer()
rules=rbn_ensemble[4]
rules = sm.Format.booleannet2bnet(rules)
primes = PyBoolNet.FileExchange.bnet2primes(rules)
PyBoolNet.PrimeImplicants._percolation(primes,True)
end=default_timer()
print("Time (s) creating reduced networks:",end-start)
print("Reduced network size: ",str(len(primes)))
#sm.Format.pretty_print_prime_rules(primes)

start=default_timer()
diag = sm.Succession.build_succession_diagram(primes)
end=default_timer()
print("Time (s) finding atttactors:",end-start)
print("Number of attractors: ",str(len(diag.attractor_fixed_nodes_list)))
diag.attractor_candidate_summary()

I looked at the error and it first calls this ASP threshold, but then also gives an error

Time (s) creating reduced networks: 0.8301923440000039
Reduced network size:  412
There are possibly more than 1000 trap spaces.
Increase MaxOutput to find out.
b''
b'/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n...  !n720 | n722 | n723 | !n726 | !n733 | n734 | n735 | n740 | n742 | !n745 ...\n    ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n...  !n720 | n722 | n723 | !n726 | !n733 | n734 | n735 | n740 | n742 | !n745 ...\n    ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n...  !n720 | n722 | n723 | !n726 | !n733 | n734 | n735 | n740 | n742 | !n745 ...\n    ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n...  !n720 | n722 | n723 | !n726 | !n733 | n734 | n735 | n740 | n742 | !n745 ...\n    ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n...  !n720 | n722 | n723 | !n726 | !n733 | n734 | n735 | n740 | n742 | !n745 ...\n    ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n... \x00 | n768 | n778 | n795 | !n8 | n820 | !n823 | n824 | n826 | n863 | n866  ...\n                                                                                                ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n... \x00 | n768 | n778 | n795 | !n8 | n820 | !n823 | n824 | n826 | n863 | n866  ...\n                                                                                                                                                                                                                                                                                                   ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n... \x00 | n768 | n778 | n795 | !n8 | n820 | !n823 | n824 | n826 | n863 | n866  ...\n                                                                                                                                                                                                                                                                                                                                         ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n... \x00 | n768 | n778 | n795 | !n8 | n820 | !n823 | n824 | n826 | n863 | n866  ...\n                                                                                                                                                                                                                                                                                                                                                                                             ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: 0: too many inputs\n... \x00 | n768 | n778 | n795 | !n8 | n820 | !n823 | n824 | n826 | n863 | n866  ...\n                                                                                                                                                                                                                                                                                                                                                                                                          ^\n/Users/jgtz/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/Dependencies/eqntott/eqntott_mac64: Too many errors.\n'

Call to "eqntott" resulted in return code 1
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-4-6cfb9d348159> in <module>
     10 
     11 start=default_timer()
---> 12 diag = sm.Succession.build_succession_diagram(primes)
     13 end=default_timer()
     14 print("Time (s) finding atttactors:",end-start)

~/anaconda3/envs/py37_SM/lib/python3.7/site-packages/StableMotifs/Succession.py in build_succession_diagram(primes, fixed, motif_history, diagram, merge_equivalent_motifs, max_simulate_size, prioritize_source_motifs)
    585     if fixed is None:
    586         fixed = {}
--> 587     myMotifReduction=sm_reduction.MotifReduction(motif_history,fixed.copy(),primes,max_simulate_size=max_simulate_size,prioritize_source_motifs=prioritize_source_motifs)
    588     if diagram is None:
    589         diagram = SuccessionDiagram()

~/anaconda3/envs/py37_SM/lib/python3.7/site-packages/StableMotifs/Reduction.py in __init__(self, motif_history, fixed, reduced_primes, max_simulate_size, prioritize_source_motifs)
    254                     break
    255             if self.terminal == "possible":
--> 256                 self.rspace_constraint = sm_format.pretty_print_rspace(self.rspace)
    257                 self.reduced_rspace_constraint = sm_rspace.reduce_rspace_string(self.rspace_constraint,self.fixed_rspace_nodes)
    258                 self.rspace_update_primes = reduce_primes(self.fixed_rspace_nodes,self.reduced_primes)[0]

~/anaconda3/envs/py37_SM/lib/python3.7/site-packages/StableMotifs/Format.py in pretty_print_rspace(L, simplify, silent)
    164     if simplify:
    165         print(s)
--> 166         s = PyBoolNet.BooleanLogic.minimize_espresso(s)
    167     if not silent:
    168         print(s)

~/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/BooleanLogic.py in minimize_espresso(Expression, Outputfile, Merge, Equiv, Exact, Reduce)
    152         eqntott_cmd += ['/dev/stdin']
    153         eqntott_in = Expression
--> 154         eqntott_out = run_eqntott(eqntott_cmd, eqntott_in)
    155 
    156         if int(re.search(r'\.p\s\d+', eqntott_out).group().strip(".p ")) != 0:

~/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/BooleanLogic.py in run_eqntott(eqntott_cmd, eqntott_in)
     75         eqntott_out, eqntott_err = eqntott.communicate(input=eqntott_in.encode())
     76         eqntott.stdin.close()
---> 77         _eqntott_error(eqntott, eqntott_out, eqntott_err)
     78         return(eqntott_out.decode())
     79 

~/anaconda3/envs/py37_SM/lib/python3.7/site-packages/PyBoolNet/BooleanLogic.py in _eqntott_error(eqntott, eqntott_out, eqntott_err)
     57                 print(eqntott_err)
     58                 print('\nCall to "eqntott" resulted in return code %i'%eqntott.returncode)
---> 59                 raise Exception
     60 
     61 

Exception:

I looked at the function just before the error (attached, it is very large, I added a line jump after each closing parenthesis CNF_error.txt ). I noticed that it is a conjunctive normal form, except in the first few terms:

( !n721 & n842 ) & ( !n721 & !n778 & n95 | !n721 & n778 & !n95 ) & (!n129 | !n136 | ...) & ... CNterms

Are these maybe the ones causing the error?

jcrozum commented 4 years ago

You are probably right about the lines causing the error. You could try adding simplify=False to the on line 256 of Reduction.py. That's a temporary fix at best. I'll think of ways around it. I don't think we really need to compute the string representation anyway.

I'm running into a different OS specific error on my end. PyBoolNet always checks to see if your input is pointing to a file, but if the input is too long, it causes a crash in Windows. These issues taken together suggest that we should really try to avoid espresso passes.

Regarding the MaxOutput, maybe this should be an optional argument for the diagram builder.

jcrozum commented 4 years ago

I've added a new branch to play around with the rspace representations. It seems to work so far.

I'm still running into problems with your example because the deletion reduction is taking way too long to compute. It gets slower as we increase the in-degree of deleted nodes or if the deleted nodes appear in a complicated way. I'll see if I can find a way to speed it up.

jcrozum commented 4 years ago

I added the MaxOutput argument (it's called max_stable_motifs for us). Still having trouble with the K=3,N=1000 example though. I've sped up some of the other examples by quite a bit, but I think the performance of the VC method on relatively dense networks is inherently bad. Maybe we can think of a way to do some of the work directly on the expanded network to save time.

jgtz commented 4 years ago

I had 2 out of 10 of the K=3, N=1000 examples give me trouble and took too long (stuck in the reduction step, like you say). I agree with you that working in the expanded network would be the ideal, but would require some extra figuring out. Maybe an intermediate alternative that would not require extra figuring out is to work on the wiring diagram as much as possible, and only send it to the VC method once that is no longer possible.

So:

  1. From wiring diagram only, choose a set of nodes to reduce out at once. The only restriction would be wiring diagram self-loops. The final function would just be a composition of functions that can be calculated in a single multi-composition step.
  2. VC reduction of 1. Create new wiring diagram.
  3. If any self-loops are removed in (2), go back to (1).

This way we would only do a VC reduction every couple of times a new functional self-loop is removed. We would also do multiple node reductions at once instead of one node at a time (which is what is done, right?).

What do you think?

jcrozum commented 4 years ago

I'm not sure I'm following you. The restrictions in step 1 are more than just self-loops. For example, A=B, B=A will not allow us to reduce by deleting both A and B. Maybe we can make it work by looking at the relationship between successors and predecessors of selected nodes, but I don't see how deleting several nodes at once gives us any speedup.

I'm 99% sure that the "slow" step in the VC reduction is simplifying the rules you get after plugging in f_X for X when you delete X. I'm wondering if we can speed things up by doing this pseudo-algebraically (i.e., with the expanded network directly).

jcrozum commented 4 years ago

I am going to try using sympy to speed things up. Not sure if it will work, but it's worth a shot.

jgtz commented 4 years ago

I agree with you on what the slow step is. The "deleting several nodes at once" would allow us to simplify the rules less times (once instead of once per node deleted). I do agree that there should be a faster way using the expanded network. What I was trying to propose is to reduce the number of times that rule simplification happens taking advantage of the graph structure. Let me see if I can quickly implement what I am thinking,

jcrozum commented 4 years ago

Sounds good. Let me know what you find.

By the way, I think a big part of the problem is that PyBoolNet recalculates truth tables when you import rules after plugging things in. But we already have partial information about the DNF and the DNF of the negations before plugging in the f_X after deletion. Maybe we can build our own importer that can take these hints and can avoid passes to espresso altogether.

The idea would be to use sympy to simplify the rules and their negations seperately, then pass those directly to our custom importer, which would just read off the minterms. I'll work on this angle and we'll combine whatever progress we make later.

jcrozum commented 4 years ago

The sympy approach seems to be working well. I only ran it on the example above, but it finished the succession diagram in ~100 seconds. Much better than before! I was never patient enough to wait for it to finish before, so I don't know how much better.

I pushed this to the rspace-representation-tweaks branch (I probably should have started a separate branch, but it's too late).

jgtz commented 4 years ago

Great! I was also never patient enough to let it finish. I probably waited 10 mins at some point, so I think we can safely say it was a >10x improvement for that case. Currently, the "rbn_ensemble[7]" case is the only one taking a while too finish (although it does keep outputting things to console).

I also profiled the "rbn_ensemble[4]" case (now that it finished). It does seem like the rule simplification part is the one taking the bulk of the simulation:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      752    0.080    0.000  254.863    0.339 Reduction.py:48(delete_node)
     2202    0.032    0.000  218.973    0.099 Reduction.py:93(simplify_using_expression_and_negation)
        2    0.071    0.035  216.911  108.456 Reduction.py:117(deletion_reduction)

You can see the whole profiler info for >30 seconds cumulative below

Fri Apr 24 16:24:53 2020    restats

         383758577 function calls (372192620 primitive calls) in 269.846 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   2203/1    0.084    0.000  270.500  270.500 {built-in method builtins.exec}
        1    0.002    0.002  270.354  270.354 <ipython-input-21-4c174d0693c7>:44(testf3)
      2/1    0.003    0.001  269.375  269.375 Succession.py:572(build_succession_diagram)
        2    0.006    0.003  269.171  134.585 Reduction.py:220(__init__)
      752    0.080    0.000  254.863    0.339 Reduction.py:48(delete_node)
     2202    0.032    0.000  218.973    0.099 Reduction.py:93(simplify_using_expression_and_negation)
        2    0.071    0.035  216.911  108.456 Reduction.py:117(deletion_reduction)
     2124    0.011    0.000  193.247    0.091 boolalg.py:436(simplify)
35298/2202    0.191    0.000  193.232    0.088 simplify.py:393(simplify)
3463/1664    0.023    0.000  191.161    0.115 boolalg.py:847(_eval_simplify)
17419/2124    0.101    0.000  151.410    0.071 boolalg.py:431(_eval_simplify)
   484746    2.365    0.000  134.240    0.000 operations.py:410(__new__)
     2674    0.286    0.000  100.910    0.038 boolalg.py:2581(simplify_patterns_or)
    17419    0.739    0.000   89.605    0.005 boolalg.py:2318(simplify_logic)
17419/2124    0.074    0.000   86.205    0.041 boolalg.py:433(<listcomp>)
   205208    1.323    0.000   85.964    0.000 boolalg.py:803(_new_args_filter)
4889/4158    0.043    0.000   81.378    0.020 boolalg.py:702(_eval_simplify)
   495442    2.466    0.000   59.524    0.000 boolalg.py:453(binary_check_and_simplify)
   213920    1.848    0.000   55.926    0.000 relational.py:189(canonical)
   166997    0.140    0.000   45.360    0.000 basic.py:1130(xreplace)
1533315/166997    3.929    0.000   45.220    0.000 basic.py:1195(_xreplace)
        2    0.038    0.019   39.009   19.505 Reduction.py:149(mediator_reduction)
     2322   34.399    0.015   36.257    0.016 InteractionGraphs.py:42(primes2igraph)
   279538    1.264    0.000   35.115    0.000 boolalg.py:665(_new_args_filter)
    32442    9.573    0.000   34.108    0.001 boolalg.py:2035(_simplified_pairs)
        4    0.034    0.008   33.522    8.381 Reduction.py:102(remove_outdag)
   427840    1.041    0.000   32.990    0.000 expr.py:2364(could_extract_minus_sign)
jgtz commented 4 years ago

An update on this part. My effort to do a "deleting several nodes at once" approach seems to have failed miserably. Either it (1) generates an insanely long Boolean function that takes too long to simplify or (2) if done step-by-step but taking advantage of the induced topological order, seems to take longer than what we have. I might try it tomorrow, but this seems like a dead end for now.

jcrozum commented 3 years ago

The rspace-representation-tweaks branch has been merged with the main branch for some time now. More belated issue closing here.