pyro-ppl / brmp

Bayesian Regression Models in Pyro
Apache License 2.0

Augment exception message when running generated code #47

Closed · neerajprad closed 4 years ago

neerajprad commented 4 years ago

Fixes #38 in a more general way.

This augments the stack trace with extra information whenever the generated model code throws an exception. A `traceback_generated` context manager is provided, which can be used as follows:

```python
with traceback_generated(model.code):
    model.fn(**data)
```

If the error originates from running the model code, this augments the stack trace with the model code, highlighting the line that threw the error; otherwise it has no effect. I have only used this in the one part of the code that was throwing an exception with Pyro's dev branch, but we can use it more widely if it turns out to be useful. The AssertionError turned out to be due to these lines, which I recently fixed in dev 😆, so it wasn't anything alarming:

```python
# Pyro doesn't insert a chain dim when num_chains==1.
if num_chains == 1:
    samples = {k: arr.unsqueeze(0) for k, arr in samples.items()}
```
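Concretely (with dummy tensors, just to illustrate the shapes involved, not brmp's actual sample dict):

```python
import torch

# With num_chains == 1, each sampled array has shape (num_samples, ...).
samples = {'b_0': torch.zeros(10, 2), 'sigma': torch.zeros(10, 1)}

# The fix inserts the missing leading chain dimension: (1, num_samples, ...).
samples = {k: arr.unsqueeze(0) for k, arr in samples.items()}
assert samples['b_0'].shape == (1, 10, 2)
assert samples['sigma'].shape == (1, 10, 1)
```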

The end of the exception message shows the difference.

stack trace (before)

```
sample: 100%|██████████| 1/1 [00:00, 34.47it/s, step size=1.25e-01, acc. prob=1.000]
tests/test_brm.py:415 (test_parameter_shapes[fitargs0-y ~ 1 + x-non_real_cols0-contrasts0-family0-priors0-expected0])

formula_str = 'y ~ 1 + x', non_real_cols = [], contrasts = {}, family = Normal()
priors = [], expected = [('b_0', 'Cauchy', {}), ('sigma', 'HalfCauchy', {})]
fitargs = {'backend': Backend(name='Pyro', gen=, prior=, nuts=...numpy=, to_numpy=), 'iter': 1, 'warmup': 0}

    @pytest.mark.parametrize('formula_str, non_real_cols, contrasts, family, priors, expected', codegen_cases)
    @pytest.mark.parametrize('fitargs', [
        dict(backend=pyro_backend, iter=1, warmup=0),
        dict(backend=pyro_backend, iter=1, warmup=0, num_chains=2),
        dict(backend=pyro_backend, algo='svi', iter=1, num_samples=1),
        dict(backend=pyro_backend, algo='svi', iter=1, num_samples=1, subsample_size=1),
        # Set environment variable `RUN_SLOW=1` to run against the NumPyro
        # back end.
        pytest.param(
            dict(backend=numpyro_backend, iter=1, warmup=0),
            marks=pytest.mark.skipif(not os.environ.get('RUN_SLOW', ''), reason='slow')),
        pytest.param(
            dict(backend=numpyro_backend, iter=1, warmup=0, num_chains=2),
            marks=pytest.mark.skipif(not os.environ.get('RUN_SLOW', ''), reason='slow')),
    ])
    # TODO: Remove on next Pyro release.
    @pytest.mark.xfail('CI' in os.environ, reason='Failure when num_chains > num_cpu; fixed in Pyro master.')
    def test_parameter_shapes(formula_str, non_real_cols, contrasts, family, priors, expected, fitargs):
        # Make dummy data.
        N = 5
        formula = parse(formula_str)
        cols = expand_columns(formula, non_real_cols)
        df = dummy_df(cols, N)

        # Define model, and generate a single posterior sample.
        metadata = metadata_from_cols(cols)
        desc = makedesc(formula, metadata, family, priors, code_lengths(contrasts))
        data = makedata(formula, df, metadata, contrasts)
>       fit = DefmResult(formula, metadata, contrasts, desc, data).fit(**fitargs)

test_brm.py:444:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../brmp/__init__.py:116: in fit
    return getattr(self.generate(backend), algo)(**kwargs)
../brmp/__init__.py:205: in nuts
    return self._run_algo('nuts', iter, warmup, num_chains, *args, **kwargs)
../brmp/__init__.py:165: in _run_algo
    samples = getattr(self.backend, algo)(self.data, self.model, *args, **kwargs)
../brmp/pyro_backend.py:140: in nuts
    transformed_samples = run_model_on_samples_and_data(model.fn, samples, data)
../brmp/pyro_backend.py:109: in run_model_on_samples_and_data
    return_values = [run(i) for i in range(num_chains * num_samples)]
../brmp/pyro_backend.py:109: in <listcomp>
    return_values = [run(i) for i in range(num_chains * num_samples)]
../brmp/pyro_backend.py:107: in run
    return poutine.condition(modelfn, sample)(**data)
../../pyro/pyro/poutine/messenger.py:8: in _context_wrap
    return fn(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

X = tensor([[ 1.0000,  1.0023],
        [ 1.0000,  1.1254],
        [ 1.0000,  0.9030],
        [ 1.0000, -1.4695],
        [ 1.0000,  1.1159]])
y_obs = tensor([-0.4766, -0.5663, -1.0067, -0.9970, -0.2597]), dfN = 5
subsample = None, mode = 'full'

>   ???
E   AssertionError

<string>:15: AssertionError
```
stack trace (after)

```
sample: 100%|██████████| 1/1 [00:00, 34.71it/s, step size=1.25e-01, acc. prob=1.000]
tests/test_brm.py:415 (test_parameter_shapes[fitargs0-y ~ 1 + x-non_real_cols0-contrasts0-family0-priors0-expected0])
AssertionError

The above exception was the direct cause of the following exception:

formula_str = 'y ~ 1 + x', non_real_cols = [], contrasts = {}, family = Normal()
priors = [], expected = [('b_0', 'Cauchy', {}), ('sigma', 'HalfCauchy', {})]
fitargs = {'backend': Backend(name='Pyro', gen=, prior=, nuts=...numpy=, to_numpy=), 'iter': 1, 'warmup': 0}

    @pytest.mark.parametrize('formula_str, non_real_cols, contrasts, family, priors, expected', codegen_cases)
    @pytest.mark.parametrize('fitargs', [
        dict(backend=pyro_backend, iter=1, warmup=0),
        dict(backend=pyro_backend, iter=1, warmup=0, num_chains=2),
        dict(backend=pyro_backend, algo='svi', iter=1, num_samples=1),
        dict(backend=pyro_backend, algo='svi', iter=1, num_samples=1, subsample_size=1),
        # Set environment variable `RUN_SLOW=1` to run against the NumPyro
        # back end.
        pytest.param(
            dict(backend=numpyro_backend, iter=1, warmup=0),
            marks=pytest.mark.skipif(not os.environ.get('RUN_SLOW', ''), reason='slow')),
        pytest.param(
            dict(backend=numpyro_backend, iter=1, warmup=0, num_chains=2),
            marks=pytest.mark.skipif(not os.environ.get('RUN_SLOW', ''), reason='slow')),
    ])
    # TODO: Remove on next Pyro release.
    @pytest.mark.xfail('CI' in os.environ, reason='Failure when num_chains > num_cpu; fixed in Pyro master.')
    def test_parameter_shapes(formula_str, non_real_cols, contrasts, family, priors, expected, fitargs):
        # Make dummy data.
        N = 5
        formula = parse(formula_str)
        cols = expand_columns(formula, non_real_cols)
        df = dummy_df(cols, N)

        # Define model, and generate a single posterior sample.
        metadata = metadata_from_cols(cols)
        desc = makedesc(formula, metadata, family, priors, code_lengths(contrasts))
        data = makedata(formula, df, metadata, contrasts)
>       fit = DefmResult(formula, metadata, contrasts, desc, data).fit(**fitargs)

test_brm.py:444:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../brmp/__init__.py:116: in fit
    return getattr(self.generate(backend), algo)(**kwargs)
../brmp/__init__.py:205: in nuts
    return self._run_algo('nuts', iter, warmup, num_chains, *args, **kwargs)
../brmp/__init__.py:165: in _run_algo
    samples = getattr(self.backend, algo)(self.data, self.model, *args, **kwargs)
../brmp/pyro_backend.py:140: in nuts
    transformed_samples = run_model_on_samples_and_data(model.fn, samples, data)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <traceback_generated object at ...>, exc_type = <class 'AssertionError'>, exc_val = AssertionError()
exc_tb = <traceback object at ...>

    def __exit__(self, exc_type, exc_val, exc_tb):
        tb_info = traceback.extract_tb(exc_tb)
        filename, line, fn, _ = tb_info[-1]
        line = line - 1
        # only augment if exception is from generated code.
        if filename == '<string>':
            exc_lines = self.code.split('\n')
            exc_lines = '\n'.join(exc_lines[:line] + [exc_lines[line] + '\t<<< ==== ERROR ===='] + exc_lines[line + 1:])
>           raise ModelSpecificationError(f'Exception in model code: \n\n {exc_lines}') from exc_type
E           brmp.utils.ModelSpecificationError: Exception in model code:
E
E            def model(X, y_obs=None, dfN=None, subsample=None, mode="full"):
E                assert mode == "full" or mode == "prior_and_mu" or mode == "prior_only"
E                assert (subsample is None) == (dfN is None)
E                assert type(X) == torch.Tensor
E                N = X.shape[0]
E                if dfN is None:
E                    dfN = N
E                else:
E                    assert len(subsample) == N
E                M = 2
E                assert X.shape == (N, M)
E
E                b_0 = pyro.sample("b_0", dist.Cauchy(torch.tensor(0.0).expand(2), torch.tensor(1.0).expand(2)).to_event(1))
E                b = torch.cat([b_0])
E                assert b.shape == (M,)    <<< ==== ERROR ====
E
E                if mode == "prior_only":
E                    mu = None
E                else:
E                    mu = torch.mv(X, b)
E
E                sigma = pyro.sample("sigma", dist.HalfCauchy(torch.tensor(3.0).expand(1)).to_event(1))
E                if mode == "full":
E                    with pyro.plate("obs", dfN, subsample=subsample):
E                        y = pyro.sample("y", dist.Normal(mu, sigma.expand(N)), obs=y_obs)
E
E                return {'mu': mu, 'b': b}

../brmp/utils.py:50: ModelSpecificationError
```
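Pulling the `__exit__` out of the trace above, here is a self-contained sketch of the context manager (the `__enter__` and the no-exception early return are scaffolding filled in as assumptions; in brmp, `ModelSpecificationError` lives in `brmp.utils`):

```python
import traceback


class ModelSpecificationError(Exception):
    pass


class traceback_generated:
    def __init__(self, code):
        self.code = code  # source string of the generated model

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            return False  # no exception raised in the block (assumed guard)
        tb_info = traceback.extract_tb(exc_tb)
        filename, line, fn, _ = tb_info[-1]
        line = line - 1  # extract_tb line numbers are 1-based
        # Only augment if the exception is from the generated code;
        # `exec` compiles a source string under the filename '<string>'.
        if filename == '<string>':
            exc_lines = self.code.split('\n')
            exc_lines = '\n'.join(exc_lines[:line] + [exc_lines[line] + '\t<<< ==== ERROR ===='] + exc_lines[line + 1:])
            raise ModelSpecificationError(f'Exception in model code: \n\n {exc_lines}') from exc_type
        return False  # unrelated exception: propagate unchanged
```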
neerajprad commented 4 years ago

> So that this is always available, could we wrap this around the function immediately after we exec the generated code? If there's a reason why doing so isn't a good idea, perhaps we can still put it there, but do it conditionally?

This is done by default now; if we see any issues with it, we can make it conditional. I don't think it should have side effects, but I'll keep an eye out.
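For concreteness, a hypothetical sketch of what wrapping at generation time could look like (the `env` and `'model'` names are illustrative, not brmp's actual codegen API; `traceback_generated` is the context manager from this PR):

```python
import functools


def gen_model_fn(code):
    # Exec the generated source; exec compiles a string under the
    # filename '<string>', which is what traceback_generated checks for.
    env = {}
    exec(code, env)
    raw_fn = env['model']  # assumes the generated function is named `model`

    @functools.wraps(raw_fn)
    def fn(*args, **kwargs):
        # Every call goes through the context manager, so augmented
        # tracebacks are always available without callers opting in.
        with traceback_generated(code):
            return raw_fn(*args, **kwargs)

    return fn
```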

null-a commented 4 years ago

It seems like this will be super useful, thanks @neerajprad.