pymc-devs / pytensor

PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.
https://pytensor.readthedocs.io

BUG: pytensor.tensor.random.utils.params_broadcast_shapes does not raise errors when shapes do not broadcast #152

Open lucianopaz opened 1 year ago

lucianopaz commented 1 year ago

Describe the issue:

pytensor.tensor.random.utils.params_broadcast_shapes claims to perform numpy broadcasting on shape tuples. This mostly works, except when the shapes are not broadcastable with each other. In those cases, params_broadcast_shapes will happily return the elementwise maximum of the input shape tuples instead of raising an error.

This causes several problems down the line.

  1. pytensor.tensor.random.utils.params_broadcast_shapes is used by default in RandomVariable.infer_shape. This can produce incorrect shape inferences, which then raise exceptions at runtime.
  2. If the output shape returned by pytensor.tensor.random.utils.params_broadcast_shapes is passed to broadcast_to, the resulting tensors look as if they would work fine, but they can lead to segfaults or kernel crashes at runtime.
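To make the contrast concrete, here is a small sketch (plain NumPy, concrete shape tuples only): np.broadcast_shapes raises on incompatible shapes, whereas a naive elementwise maximum — which is effectively what the buggy helper returns — silently produces (3, 4):

```python
import numpy as np

# NumPy's reference behavior: incompatible shapes raise immediately.
try:
    np.broadcast_shapes((3, 2), (1, 4))
except ValueError as e:
    print("raises:", e)

# A naive elementwise maximum, which is effectively what the buggy
# helper returns, silently yields (3, 4) instead of raising.
naive = tuple(max(d1, d2) for d1, d2 in zip((3, 2), (1, 4)))
print(naive)  # (3, 4)
```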

Reproducible code example:

import pytensor
from pytensor import tensor as pt
import numpy as np

a = pt.as_tensor_variable(np.zeros((3, 2)), name="a")
b = pt.as_tensor_variable(np.ones((1, 4)), name="b")
x = pt.random.normal(name="x", loc=a, scale=b)
x.shape.eval()  # Prints array([3, 4])
x.eval().shape  # ValueError: shape mismatch: objects cannot be broadcast to a single shape.  Mismatch is between arg 0 with shape (3, 2) and arg 1 with shape (1, 4).

from pytensor.tensor.random.utils import params_broadcast_shapes
a_shape, b_shape = params_broadcast_shapes([a.shape, b.shape], ndims_params=[0, 0])
y = pt.random.normal(
    name="y",
    loc=pt.broadcast_to(a, a_shape),
    scale=pt.broadcast_to(b, b_shape),
)
y.shape.eval()  # prints array([3, 4])
y.eval().shape  # Produces a segmentation fault (core dumped)

Error message:

<details>
<summary>x.eval().shape leads to this error</summary>
ValueError: shape mismatch: objects cannot be broadcast to a single shape.  Mismatch is between arg 0 with shape (3, 2) and arg 1 with shape (1, 4).
Apply node that caused the error: normal_rv{0, (0, 0), floatX, False}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F0B6A29C740>), TensorConstant{[]}, TensorConstant{11}, a{(3, 2) of 0.0}, b{(1, 4) of 1.0})
Toposort index: 0
Inputs types: [RandomGeneratorType, TensorType(int64, (0,)), TensorType(int64, ()), TensorType(float64, (3, 2)), TensorType(float64, (1, 4))]
Inputs shapes: ['No shapes', (0,), (), (3, 2), (1, 4)]
Inputs strides: ['No strides', (0,), (), (16, 8), (32, 8)]
Inputs values: [Generator(PCG64) at 0x7F0B6A29C740, array([], dtype=int64), array(11), 'not shown', array([[1., 1., 1., 1.]])]
Outputs clients: [[], ['output']]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/terminal/interactiveshell.py", line 678, in interact
    self.run_cell(code, store_history=True)
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2940, in run_cell
    result = self._run_cell(
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2995, in _run_cell
    return runner(coro)
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3194, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3373, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/home/lpaz/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-16-840c5bd8ff9e>", line 1, in <module>
    x = pt.random.normal(name="x", loc=a, scale=b)

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
</details>

<details>
<summary>y.eval().shape leads to this error</summary>
Segmentation fault (core dumped)
</details>

PyTensor version information:

Installed via pip; PyTensor version is 2.8.11.

Context for the issue:

No response

ricardoV94 commented 1 year ago

This looks more like a bug in broadcast_to, which should do some input validation in the perform method.

import pytensor
from pytensor import tensor as pt
import numpy as np

a = pt.as_tensor_variable(np.zeros((3, 2)), name="a")
a_bcast = pt.broadcast_to(a, (3, 4))
a_bcast.eval()  # Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

NumPy does the right thing:

a = np.zeros((3, 2))
np.broadcast_to(a, (3, 4))
# ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,2)  and requested shape (3,4)
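One possible shape of that input validation (a sketch only, not the actual fix; check_broadcastable is a hypothetical helper) is to apply NumPy's broadcasting rule directly: each trailing dimension of the input must equal the corresponding target dimension or be 1.

```python
def check_broadcastable(in_shape, target_shape):
    # Hypothetical validation: NumPy's rule says each trailing dimension
    # of `in_shape` must equal the corresponding `target_shape` dimension
    # or be 1; extra leading target dimensions are always allowed.
    if len(in_shape) > len(target_shape):
        raise ValueError(f"cannot broadcast {in_shape} to {target_shape}")
    for in_dim, out_dim in zip(reversed(in_shape), reversed(target_shape)):
        if in_dim != 1 and in_dim != out_dim:
            raise ValueError(f"cannot broadcast {in_shape} to {target_shape}")

check_broadcastable((3, 2), (3, 2))  # OK
check_broadcastable((1, 4), (3, 4))  # OK
try:
    check_broadcastable((3, 2), (3, 4))
except ValueError as e:
    print(e)  # cannot broadcast (3, 2) to (3, 4)
```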
lucianopaz commented 1 year ago

#175 fixed broadcast_to, but pytensor.tensor.random.utils.params_broadcast_shapes still returns invalid broadcast shapes.
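For concrete (non-symbolic) shape tuples, a fixed helper could simply delegate to np.broadcast_shapes, which raises on incompatible inputs. This is only a sketch: the real params_broadcast_shapes must also handle symbolic shapes and per-parameter core dimensions (ndims_params), which broadcast_shapes_checked below ignores.

```python
import numpy as np

def broadcast_shapes_checked(shapes):
    # Sketch only: delegate to np.broadcast_shapes so that incompatible
    # shapes raise a ValueError instead of being silently max-ed together.
    common = np.broadcast_shapes(*shapes)
    return [common for _ in shapes]

print(broadcast_shapes_checked([(3, 2), (1, 2)]))  # [(3, 2), (3, 2)]
```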