I would suggest two changes:
1. use the integer identifiers of BDD nodes as cache keys, instead of Function instances, as done in the function dd._copy._copy_bdd:
https://github.com/tulip-control/dd/blob/24ec9eba335c173e1fe0367a9ac25ec201dee1b5/dd/_copy.py#L60
2. use the module shelve, as done in the function dd._copy.dump_json:
https://github.com/tulip-control/dd/blob/24ec9eba335c173e1fe0367a9ac25ec201dee1b5/dd/_copy.py#L125-L139
Please note that str(k) is needed, where k is the integer identifier of the BDD node, as demonstrated in the function dd._copy._dump_bdd:
https://github.com/tulip-control/dd/blob/24ec9eba335c173e1fe0367a9ac25ec201dee1b5/dd/_copy.py#L167-L190
The string conversion is needed because shelve supports only strings as keys, quoting its documentation:
A “shelf” is a persistent, dictionary-like object. The difference with “dbm” databases is that the values (not the keys!) in a shelf can be essentially arbitrary Python objects — anything that the pickle module can handle. This includes most class instances, recursive data types, and objects containing lots of shared sub-objects. The keys are ordinary strings.
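For illustration, a minimal sketch of keying a shelf with string-converted integers (the file name, identifier, and stored values here are hypothetical):

import shelve

# shelve keys must be ordinary strings, so an integer node
# identifier is converted with str(); the stored values can be
# any picklable objects, e.g. a tuple of floats
with shelve.open('example_shelf') as cache:  # hypothetical file name
    k = 42  # hypothetical integer identifier of a BDD node
    cache[str(k)] = (0.5, 0.1)
    r, v = cache[str(k)]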
Is (1) above a prerequisite for (2) or are they independent? Here's how I would do (1). Can you give some guidance on how to implement (2) for the function below?
from dd.autoref import BDD

def recurse_prob(u, var_prob, mem):
    # constant node ?
    if u.var is None:
        return (0,0) if u.negated else (1,0)
    z = rectify(u)
    k = int(z)
    # memoized ?
    if k in mem:
        (r,v) = mem[k]
        (r,v) = flip(r,v, u)
        return (r,v)
    # recurse
    p = recurse_prob(u.low, var_prob, mem)
    q = recurse_prob(u.high, var_prob, mem)
    r = p[0] + var_prob[u.var] * (q[0] - p[0])
    v=(1-var_prob[u.var])*failr[u.var]*(q[0]-p[0])+(1-var_prob[u.var])*p[1]+var_prob[u.var]*q[1]
    # memoize
    mem[k] = (r,v)
    (r,v) = flip(r,v, u)
    return (r,v)

def rectify(u):
    return ~ u if u.negated else u

def flip(r,v, u):
    if u.negated:
        r = 1 - r
        v = 1 - v
    return (r,v)
I would say that item 2 above includes item 1, because if u is an instance of dd.cudd.Function (or dd.autoref.Function), then str(u) is of the form '@' followed by the digits of an integer, so the results of str(int(u)) and str(u) are almost the same (the difference is in the @ character).
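As a small illustration (the exact identifiers printed depend on the state of the BDD manager, and the expression here is arbitrary):

from dd.autoref import BDD

bdd = BDD()
bdd.declare('x', 'y')
u = bdd.add_expr(r'x /\ y')
# str(u) is '@' followed by the digits of an integer,
# e.g. '@5' (or '@-5' for a complemented edge),
# while str(int(u)) is the same digits without the '@'
print(str(u))
print(str(int(u)))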
The implementation above (https://github.com/tulip-control/dd/issues/67#issuecomment-733123560) of item 1 from above (https://github.com/tulip-control/dd/issues/67#issuecomment-731902452) looks good.
About implementing item 2 from above (https://github.com/tulip-control/dd/issues/67#issuecomment-731902452): at the calling site of the function recurse_prob, the argument mem would be a shelve, instead of a dict. For example, similarly to the body of the function dd._copy.dump_json:
# Copyright 2016-2018 by California Institute of Technology
# All rights reserved. Licensed under 3-clause BSD.
#
import os
import shelve
import shutil


SHELVE_DIR = '__shelve__'


def call_recurse_prob(u, var_prob):
    tmp_fname = os.path.join(SHELVE_DIR, 'temporary_shelf')
    os.makedirs(SHELVE_DIR)
    try:
        cache = shelve.open(tmp_fname)
        r, v = recurse_prob(u, var_prob, cache)
        cache.close()
    finally:
        shutil.rmtree(SHELVE_DIR)
    return r, v
and, in the body of the function recurse_prob, a change to k = str(int(z)).
I would also recommend spreading long expressions over multiple lines, which is possible by using parentheses, for example:

x = (a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9 + a10 +
     a11 + a12 + a13 + a14 + a15 + a16 + a17 + a18 + a19 + a20)

Parentheses allow implicit line continuation, without the need for an explicit line continuation character \. I would recommend following PEP 8, for example two blank lines between functions, a space after each comma, and spaces around infix operators.
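For instance, the expression for v in the function above could be spread over lines as:

v = ((1 - var_prob[u.var]) * failr[u.var] * (q[0] - p[0])
     + (1 - var_prob[u.var]) * p[1]
     + var_prob[u.var] * q[1])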
Thank you for the suggestions. Yes, I can follow PEP 8, including line breaks (something I learnt today!).
Let me try the "shelving" now.
I tried the shelve cache as shown below. It produces the same results as the prior version, but the kernel still crashes when using more than 1000 nodes or so. I think I'm missing something, because I still use the dictionary in the original function recurse_prob. Is recurse_prob how it should be?
import os
import shelve
import shutil

SHELVE_DIR = '__shelve__'


def recurse_prob(u, var_prob, mem):
    # constant node ?
    if u.var is None:
        return (0,0) if u.negated else (1,0)
    z = rectify(u)
    k = str(int(z))
    # memoized ?
    if k in mem:
        (r,v) = mem[k]
        (r,v) = flip(r,v, u)
        return (r,v)
    # recurse
    p = recurse_prob(u.low, var_prob, mem)
    q = recurse_prob(u.high, var_prob, mem)
    r = p[0] + var_prob[u.var] * (q[0] - p[0])
    v=((1-var_prob[u.var])*failr[u.var]*(q[0]-p[0])+
       (1-var_prob[u.var])*p[1]+var_prob[u.var]*q[1])
    # memoize
    mem[k] = (r,v)
    (r,v) = flip(r,v, u)
    return (r,v)


def rectify(u):
    return ~ u if u.negated else u


def flip(r,v, u):
    if u.negated:
        r = 1 - r
        v = 1 - v
    return (r,v)


def call_recurse_prob(u, var_prob):
    tmp_fname = os.path.join(SHELVE_DIR, 'temporary_shelf')
    os.makedirs(SHELVE_DIR)
    try:
        cache = shelve.open(tmp_fname)
        r, v = recurse_prob(u, var_prob, cache)
        cache.close()
    finally:
        shutil.rmtree(SHELVE_DIR)
    return r, v


pBDD = call_recurse_prob(u, vars_prob)
The implementation appears to be correct (the only remaining dict used is var_prob, which is unchanged through the recursion; mem is now a shelve, i.e., a dictionary-like object on the disk). A del z could be added after the line k = str(int(z)) to reduce the Function instances that are present in the depth-first traversal.
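That is, the relevant lines of recurse_prob would become:

z = rectify(u)
k = str(int(z))
del z  # drop the reference to the Function instance before recursing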
Other points to consider when using dd.cudd are increasing the total memory for the manager, for example bdd = dd.cudd.BDD(3 * cudd.GB), and obtaining statistics using the method bdd.statistics. The module dd.cudd can work with millions of nodes in the BDD manager.
If using dd.autoref, then dynamic reordering is disabled by default, and can be enabled with bdd.configure(reordering=True), where bdd is an instance of the class dd.autoref.BDD.
The size of a particular BDD (not the total number of nodes in the BDD manager), i.e., the number of nodes that are reachable from a given node, can be read from the attribute dag_size, for example print(u.dag_size) before calling the function call_recurse_prob.
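For illustration, a minimal sketch combining these points (assuming dd.cudd is installed; the constructor call follows the example above, and the expression is arbitrary):

import dd.cudd

bdd = dd.cudd.BDD(3 * dd.cudd.GB)  # larger memory estimate for the CUDD manager
bdd.declare('x', 'y')
u = bdd.add_expr(r'x /\ y')
print(u.dag_size)        # nodes reachable from u, not the total nodes in the manager
print(bdd.statistics())  # statistics from the CUDD manager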
Also, I do not know whether this code is run in a virtual machine, but if so it might have reduced resources compared to the actual machine.
Added del z. The kernel still crashes. I'm not using a virtual machine.
Regarding dd.cudd, I am unable to import it; I get a ModuleNotFoundError. Should I open a separate issue for that?
Thanks, Gui
Usage of dd with a large number of nodes is intended with dd.cudd. A ModuleNotFoundError about dd.cudd means that it is not installed. The file README.md contains information about how to install dd.cudd (https://github.com/tulip-control/dd#wheel-files-with-compiled-cudd and https://github.com/tulip-control/dd#dd-fetching-cudd). Recently dd == 0.5.6 has been released, and the PyPI files include a manylinux_x86_64 wheel for Python 3.9 that contains the module dd.cudd (i.e., compiled). So on Linux and Python 3.9, pip install dd is expected to install this wheel file, and so dd.cudd too. On other Python versions and operating systems, python setup.py install --fetch --cudd, as described in https://github.com/tulip-control/dd#dd-fetching-cudd, is expected to build and install dd.cudd (running this inside either the repository or an unpacked source distribution from PyPI).
Also, even if dd.cudd is installed, running python or ipython within the top directory of this repository, or of the source distribution of dd, and attempting import dd.cudd, will result in a ModuleNotFoundError, because Python will attempt to import dd from its source directory, instead of the dd that is installed under site-packages. So running python or ipython outside the source distribution of dd is needed. This observation could be useful immediately after completing an installation of dd, because trying to check that dd.cudd has been installed (e.g., by running python -c "import dd.cudd") would fail if run in the same directory (a change of directory is needed after completing the installation, and before checking that dd.cudd has been installed successfully).
When installing with python setup.py (instead of pip install), it may be useful to first run pip uninstall -y dd to remove any existing installation of dd.
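For example, a quick check after installation, run from a directory outside the dd source tree:

# if this import succeeds, dd.cudd is installed; the printed path
# is expected to point under site-packages, not the source tree
import dd.cudd
print(dd.cudd.__file__)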
OK, I will continue this thread on a separate issue, thanks.
Hi @johnyf, the recursive function you helped me write is working with some fairly large BDDs, up to 5K nodes, but our use cases require even larger ones. The biggest constraint with the recursive function isn't speed but memory use. We have already included memoization. What other solutions to optimize memory use could you suggest? Is it possible to use hard-disk space as virtual RAM, for example? Thanks, G