tulip-control / dd

Binary Decision Diagrams (BDDs) in pure Python and Cython wrappers of CUDD, Sylvan, and BuDDy
https://pypi.org/project/dd

bdd._next_free_int throws exception when using CPython implementation #16

Closed sean-reed closed 8 years ago

sean-reed commented 8 years ago

Arguments to the xrange function are limited to the size of a native C long when using CPython, which can cause the bdd._next_free_int method to throw an exception (e.g., when self.max_nodes is set to its default of sys.maxsize). I suggest using a simple while loop that increments on each iteration until a free node is found, or itertools.count if a generator is preferred.
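A minimal sketch of the suggested while-loop approach (a standalone function with hypothetical names, not the dd API):

```python
def next_free_int(succ, start, max_nodes):
    """Return the smallest integer >= `start` not used as a node.

    `succ` is any container supporting `in` (e.g. the dict of nodes).
    Sketch only; the name and signature are hypothetical.
    """
    i = start
    while i <= max_nodes:
        if i not in succ:
            return i
        i += 1
    raise RuntimeError('full: reached `max_nodes` nodes')
```

For example, next_free_int({1, 2, 3, 5}, 2, 10) returns 4, the smallest unused integer at or above the starting point.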

johnyf commented 8 years ago

Thank you for the observation. The good property of xrange is that it has an upper bound by definition. The CPython limitation of xrange cited above is documented here:

"CPython implementation detail: xrange() is intended to be simple and fast. Implementations may impose restrictions to achieve this. The C implementation of Python restricts all arguments to native C longs (“short” Python integers), and also requires that the number of elements fit in a native C long. If a larger range is needed, an alternate version can be crafted using the itertools module: islice(count(start, step), (stop-start+step-1+2*(step<0))//step)."
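The islice recipe from the quoted docs can be checked directly (a small self-contained example, not code from dd):

```python
from itertools import count, islice

# xrange(start, stop, step) equivalent that avoids the C long limit,
# using the recipe quoted from the Python 2 docs above
start, stop, step = 0, 10, 3
n = (stop - start + step - 1 + 2 * (step < 0)) // step  # number of elements
assert list(islice(count(start, step), n)) == [0, 3, 6, 9]
```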

xrange(sys.maxsize) instantiates fine, whereas xrange(sys.maxsize + 1) raises an OverflowError: Python int too large to convert to C long. Is this the error that you get? Can you please note the values of start, i, self.max_nodes, len(self._succ) in the call of dd.bdd.BDD._next_free_int that raises the exception?

Based on the description below, _succ cannot become larger than sys.maxsize, so xrange is not expected to raise an OverflowError. Another comment: on the machine I am writing from, sys.maxsize == 9223372036854775807. I do not expect even CUDD to work with this many nodes (it works with at most a few dozen million). So, something else may be happening, for example a memory error.

If it is a memory error due to insufficient space, then there are (at least) two solutions:

  1. try calling the reordering function dd.bdd.reorder (in the future, it will be called automatically #9)
  2. try using dd.cudd, with CUDD installed (likely only 1 line in user code needs to change: the import statement).

New node allocation in dd.bdd.BDD

For context, creation of new nodes in dd.bdd.BDD works by maintaining, in the attribute BDD._min_free, the smallest integer not assigned to any node. The invariant is that all integers < _min_free are currently nodes, and _min_free is not a node. This invariant is preserved as follows:

  1. _min_free initialized to 2
  2. _min_free increased to a free integer returned by BDD._next_free_int, in order to create a new node
  3. _min_free reduced to the integer of a node being deleted, if that integer is smaller than _min_free.

So, the integers 1 .. _min_free - 1 are nodes, _min_free is not a node, and there may exist nodes that are integers larger than _min_free.
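The invariant described above can be modeled with a short pure-Python sketch (the class and method names are hypothetical, not the dd.bdd API):

```python
class MinFreeAllocator:
    """Model of the `_min_free` invariant: every integer in
    1 .. min_free - 1 is a node, and `min_free` is not a node."""

    def __init__(self):
        self.nodes = {1}  # integer 1 models the constant node
        self.min_free = 2

    def allocate(self):
        """Assign `min_free` to a new node, then search linearly
        for the next free integer (as `_next_free_int` does)."""
        i = self.min_free
        self.nodes.add(i)
        j = i + 1
        while j in self.nodes:
            j += 1
        self.min_free = j
        return i

    def delete(self, i):
        """Free node `i`; lower `min_free` if `i` is below it."""
        self.nodes.remove(i)
        if i < self.min_free:
            self.min_free = i
```

Deleting a node below min_free lowers min_free, so the invariant holds after every operation.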

The method dd.bdd.BDD._next_free_int starts at the integer _free_int that has just been assigned to a node, and searches linearly for the smallest integer in _free_int .. BDD.max_nodes that is not a node.

The upper limit BDD.max_nodes is initialized to sys.maxsize by default, but can be changed as desired by the user. The documentation of sys.maxsize defines it as:

"The largest positive integer supported by the platform’s Py_ssize_t type, and thus the maximum size lists, strings, dicts, and many other containers can have."

Since dd.bdd.BDD stores the map from (integer) nodes to successors (low, high) in the dict attribute BDD._succ, based on the above it should not be possible for _free_int to become larger than sys.maxsize, because the invariant would imply that len(_succ) > sys.maxsize, which is impossible for a dict in CPython (based on the docs quoted above).

New node allocation in dd.mdd.MDD

As a side note, in dd.mdd I have implemented a more time-efficient (trading some space) algorithm for tracking and assigning free nodes in MDD._allocate. Instead of maintaining the minimal free integer, it stores the integers of deleted nodes in the set MDD._free, so no search is needed. When those are exhausted, MDD._max is simply incremented by 1, because no integer > _max has been allocated to any node yet.

Maintaining a set takes more space, but this is not an issue, because integers are added to that set only after being removed from _succ, so moving them does not increase memory usage (those integers were already occupying memory as keys of _succ). This should be faster. I would like to experiment with it in dd.bdd.BDD at some point.
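A sketch of that free-set scheme, for comparison (hypothetical names, not the dd.mdd API):

```python
class FreeSetAllocator:
    """Model of the free-set scheme: deleted integers go into a set,
    so allocation needs no linear search."""

    def __init__(self):
        self.free = set()  # integers of deleted nodes, available for reuse
        self.max = 1       # largest integer ever allocated (1 models the constant)

    def allocate(self):
        if self.free:
            return self.free.pop()  # reuse a deleted integer, O(1)
        self.max += 1  # no integer > max has been used yet
        return self.max

    def delete(self, i):
        self.free.add(i)
```

Allocation here is constant time in both branches, whereas the linear search in the _min_free scheme can scan many occupied integers.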

sean-reed commented 8 years ago

Thanks for the detailed response.

The issue arises when using CPython if sys.maxsize is larger than the maximum positive integer supported by the platform's native C long. For example, on 64-bit Windows the Py_ssize_t type is defined as an 8-byte integer, whilst the native C long is only 4 bytes. I suspect you are using either 64-bit Linux or OS X, where the native C long is 8 bytes, so there would be no overflow.
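This mismatch can be checked on any platform (a small stdlib-only snippet, not part of dd):

```python
import ctypes
import sys

# size of the platform's native C long: 4 bytes on 64-bit Windows,
# 8 bytes on 64-bit Linux and OS X
long_bytes = ctypes.sizeof(ctypes.c_long)
c_long_max = 2 ** (8 * long_bytes - 1) - 1

# on 64-bit Windows, sys.maxsize (2**63 - 1) exceeds c_long_max (2**31 - 1),
# which is the mismatch that made xrange(sys.maxsize) overflow in Python 2
print(long_bytes, c_long_max, sys.maxsize)
```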

As you say, even CUDD cannot cope with anywhere near sys.maxsize nodes, so one possible solution is to set the default for self.max_nodes to something that fits in a 4-byte integer (e.g. 2**31 - 1); another is to avoid using xrange altogether:

def _next_free_int(self, start, debug=False):
    """Return the smallest unused integer larger than `start`."""
    for i in itertools.count(start):
        if i <= self.max_nodes and i not in self._succ:
            if debug:
                for j in xrange(1, start + 1):
                    assert j in self, j
            return i
    raise Exception('full: reached `self.max_nodes` nodes.')
johnyf commented 8 years ago

Indeed, accurate guess: I am using 64-bit Linux and 64-bit OS X. I see now that the problem arises from a mismatch between Python's internally declared limitations on sizes and the system's limitations. Also, it does not arise because the number of nodes comes anywhere near the limits, but simply because the upper bound given to xrange happens to be larger than the operating system's limit. (One more rule where Windows is the exception...)

Even though I am willing to use a simple incrementation pattern, it is:

  1. Less readable.
  2. More error prone. Remarkably, even though it appears to be a trivial change, the proposed patch has an error: if debug=True, then the remaining xrange will fail, for sufficiently large start (which won't happen in practice, though theoretically it remains possible). This is why rule 2 here is important.
  3. Going backwards: Python 3 does not have these issues, and Python 2 won't be used much in 3 years from now.

Also, this difference between C long and Py_ssize_t in the xrange implementation is CPython issue 26428, where it was rejected.

It turns out that my original implementation in Python 2 was correct, because it used sys.maxint, which corresponds to C long and to how xrange is limited in Python 2. It was modified in 956e15def27ad374263f9c92ef7097d98aa8202b.

islice could also be used, as shown here, but it is less readable and more complex (PEP 20).

Reverted in 0f98425fd91a67b8f5891a997a0e748b83d7565e to use sys.maxint in Python 2 only, because that corresponds to how xrange is implemented. So, the option "something that would fit in a 4 byte integer" was selected.

sean-reed commented 8 years ago

Excellent, thanks!