pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.67k stars 2k forks source link

MissingInputError: Input 0 of the graph #7187

Closed yash-voyager-want closed 5 months ago

yash-voyager-want commented 7 months ago

Describe the issue:

I am trying to setup this parameter estimation problem similar to the one setup here. [https://github.com/xin2zhang/MCDisp/blob/master/example/Dispersion.ipynb] It seems that it is running for them but not for me. It seems to be related to pm.sample() I tried to use different versions of python from 3.7 to 3.10, none seem to solve the issue

Reproduceable code example:

import numpy as np
from pysurf96 import surf96

thk = np.full((10),fill_value=0.1)
thk[-1]=10; # set the thickness of last layer to be 0 to represent half space
vs = np.zeros_like(thk)
vs = np.array([0.4,0.5,0.6,0.6,0.6,0.8,0.85,0.9,1.0,1.0])
vp = 1.16*vs + 1.36
rho = 1.74*vp**0.25
if(vs[0]<1e-4):
    vp[0] = 1.5
    rho[0] = 1.0

freqs = np.linspace(1.33, 10, 20)
periods = 1/freqs

phase = surf96(
    thk,
    vp,
    vs,
    rho,
    periods,
    wave="rayleigh",
    mode=1,
    velocity="phase",
    flat_earth=True)

%matplotlib notebook
import pymc3 as pm
import theano
import theano.tensor as tt

from time import time
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable

class loglike(tt.Op):
    itypes = [tt.dvector]
    otypes = [tt.dscalar]

    def __init__(self, model, periods, data, thick, wave="rayleigh", velocity="phase", mode=1, flat_earth=True, waterlayer=0, sigma=0.003):
        self.model = model
        self.periods = periods
        self.data = data
        self.thick = thick
        self.sigma = sigma
        self.wave = wave # can have a water layer at the top
        self.mode = mode
        self.velocity = velocity
        self.flat_earth = flat_earth
        self.waterlayer = waterlayer

    def perform(self, node, inputs, outputs):
        theta, = inputs
        vs = np.copy(theta)
        vp = 1.16*theta + 1.36
        rho = 1.74*vp**0.25
        if(self.waterlayer==1):
            vs=np.insert(vs,0,0.0)
            vp=np.insert(vp,0,1.5)
            rho=np.insert(rho,0,1.0)

        # set a prior information: the top layer has minimum shear-velocity
        # This ensures the computed phase velocities are phase velocities of Rayleigh or Love waves         
        if(self.waterlayer==0 and any(vs[1:]<vs[0])
          or self.waterlayer==1 and any(vs[2:]<vs[1])):
            lglike = -np.inf
        else:
            phase = self.model(self.thick,vp,vs,rho,self.periods,self.wave,self.mode,self.velocity,self.flat_earth)
            lglike = -(0.5/self.sigma**2)*np.sum((self.data-phase)**2)
            if(all(phase==0)): # failed to find solutions, set lglike to -infinite
                lglike=-np.inf

        outputs[0][0] = np.array(lglike)

data = phase + 0.02*phase*np.random.normal(size=phase.shape) # add 2 percent noise
logl = loglike(surf96, periods, data, thk)
with pm.Model() as model:
    v1 = pm.Uniform('v1',lower=0.33,upper=0.63,shape=(1,))
    v2 = pm.Uniform('v2',lower=0.43,upper=0.79,shape=(1,))
    v3 = pm.Uniform('v3',lower=0.48,upper=0.88,shape=(1,))
    v4 = pm.Uniform('v4',lower=0.51,upper=0.95,shape=(1,))
    v5 = pm.Uniform('v5',lower=0.55,upper=1.05,shape=(1,))
    v6 = pm.Uniform('v6',lower=0.6,upper=1.2,shape=(1,))
    v7 = pm.Uniform('v7',lower=0.58,upper=1.28,shape=(1,))
    v8 = pm.Uniform('v8',lower=0.56,upper=1.36,shape=(1,))
    v9 = pm.Uniform('v9',lower=0.56,upper=1.36,shape=(1,))
    v10 = pm.Uniform('v10',lower=0.6,upper=1.4,shape=(1,))
    theta = pm.math.concatenate([v1,v2,v3,v4,v5,v6,v7,v8,v9,v10],axis=0)
    print(theta.shape)
    pm.DensityDist('likelihood',lambda v: logl(v), observed={'v': theta})

ndraws = 5000; nburn = 1000
with model:
    trace = pm.sample(ndraws, step=pm.Metropolis(tune_inverval=100), 
                      tune=nburn, discard_tuned_samples=True, cores=3, chains=3)

Error message:

/home/yashwant/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/deprecat/classic.py:215: FutureWarning: In v4.0, pm.sample will return an `arviz.InferenceData` object instead of a `MultiTrace` by default. You can pass return_inferencedata=True or return_inferencedata=False to be safe and silence this warning.
  return wrapped_(*args_, **kwargs_)
Multiprocess sampling (3 chains in 3 jobs)
CompoundStep
>Metropolis: [v10]
>Metropolis: [v9]
>Metropolis: [v8]
>Metropolis: [v7]
>Metropolis: [v6]
>Metropolis: [v5]
>Metropolis: [v4]
>Metropolis: [v3]
>Metropolis: [v2]
>Metropolis: [v1]

100.00% [18000/18000 00:42<00:00 Sampling 3 chains, 0 divergences]

/home/yashwant/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/pymc3/step_methods/metropolis.py:226: RuntimeWarning: overflow encountered in exp
  "accept": np.exp(accept),
/home/yashwant/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/pymc3/step_methods/metropolis.py:226: RuntimeWarning: overflow encountered in exp
  "accept": np.exp(accept),
/home/yashwant/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/pymc3/step_methods/metropolis.py:226: RuntimeWarning: overflow encountered in exp
  "accept": np.exp(accept),
/tmp/ipykernel_85642/3230998942.py:21: RuntimeWarning: invalid value encountered in power
  rho = 1.74*vp**0.25
/tmp/ipykernel_85642/3230998942.py:21: RuntimeWarning: invalid value encountered in power
  rho = 1.74*vp**0.25
Sampling 3 chains for 1_000 tune and 5_000 draw iterations (3_000 + 15_000 draws total) took 43 seconds.

---------------------------------------------------------------------------
MissingInputError                         Traceback (most recent call last)
Cell In[27], line 3
      1 ndraws = 5000; nburn = 1000
      2 with model:
----> 3     trace = pm.sample(ndraws, step=pm.Metropolis(tune_inverval=100), 
      4                       tune=nburn, discard_tuned_samples=True, cores=3, chains=3)

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/deprecat/classic.py:215, in deprecat.<locals>.wrapper_function(wrapped_, instance_, args_, kwargs_)
    213         else:
    214             warnings.warn(message, category=category, stacklevel=_routine_stacklevel)
--> 215 return wrapped_(*args_, **kwargs_)

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/pymc3/sampling.py:655, in sample(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, start, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)
    653     if idata_kwargs:
    654         ikwargs.update(idata_kwargs)
--> 655     idata = arviz.from_pymc3(trace, **ikwargs)
    657 if compute_convergence_checks:
    658     if draws - tune < 100:

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/arviz/data/io_pymc3_3x.py:580, in from_pymc3(trace, prior, posterior_predictive, log_likelihood, coords, dims, model, save_warmup, density_dist_obs)
    528 def from_pymc3(
    529     trace=None,
    530     *,
   (...)
    538     density_dist_obs=True,
    539 ):
    540     """Convert pymc3 data into an InferenceData object.
    541 
    542     All three of them are optional arguments, but at least one of ``trace``,
   (...)
    578     InferenceData
    579     """
--> 580     return PyMC3Converter(
    581         trace=trace,
    582         prior=prior,
    583         posterior_predictive=posterior_predictive,
    584         log_likelihood=log_likelihood,
    585         coords=coords,
    586         dims=dims,
    587         model=model,
    588         save_warmup=save_warmup,
    589         density_dist_obs=density_dist_obs,
    590     ).to_inference_data()

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/arviz/data/io_pymc3_3x.py:181, in PyMC3Converter.__init__(self, trace, prior, posterior_predictive, log_likelihood, predictions, coords, dims, model, save_warmup, density_dist_obs)
    178     self.dims = {**model_dims, **self.dims}
    180 self.density_dist_obs = density_dist_obs
--> 181 self.observations, self.multi_observations = self.find_observations()

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/arviz/data/io_pymc3_3x.py:194, in PyMC3Converter.find_observations(self)
    192     elif hasattr(obs, "data") and self.density_dist_obs:
    193         for key, val in obs.data.items():
--> 194             multi_observations[key] = val.eval() if hasattr(val, "eval") else val
    195 return observations, multi_observations

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/graph/basic.py:554, in Variable.eval(self, inputs_to_values)
    552 inputs = tuple(sorted(inputs_to_values.keys(), key=id))
    553 if inputs not in self._fn_cache:
--> 554     self._fn_cache[inputs] = theano.function(inputs, self)
    555 args = [inputs_to_values[param] for param in inputs]
    557 rval = self._fn_cache[inputs](*args)

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/compile/function/__init__.py:337, in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
    331     fn = orig_function(
    332         inputs, outputs, mode=mode, accept_inplace=accept_inplace, name=name
    333     )
    334 else:
    335     # note: pfunc will also call orig_function -- orig_function is
    336     #      a choke point that all compilation must pass through
--> 337     fn = pfunc(
    338         params=inputs,
    339         outputs=outputs,
    340         mode=mode,
    341         updates=updates,
    342         givens=givens,
    343         no_default_updates=no_default_updates,
    344         accept_inplace=accept_inplace,
    345         name=name,
    346         rebuild_strict=rebuild_strict,
    347         allow_input_downcast=allow_input_downcast,
    348         on_unused_input=on_unused_input,
    349         profile=profile,
    350         output_keys=output_keys,
    351     )
    352 return fn

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/compile/function/pfunc.py:524, in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
    519         si = In(
    520             variable=sv, value=sv.container, mutable=False, borrow=True, shared=True
    521         )
    522     inputs.append(si)
--> 524 return orig_function(
    525     inputs,
    526     cloned_outputs,
    527     mode,
    528     accept_inplace=accept_inplace,
    529     name=name,
    530     profile=profile,
    531     on_unused_input=on_unused_input,
    532     output_keys=output_keys,
    533 )

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/compile/function/types.py:1970, in orig_function(inputs, outputs, mode, accept_inplace, name, profile, on_unused_input, output_keys)
   1968 try:
   1969     Maker = getattr(mode, "function_maker", FunctionMaker)
-> 1970     m = Maker(
   1971         inputs,
   1972         outputs,
   1973         mode,
   1974         accept_inplace=accept_inplace,
   1975         profile=profile,
   1976         on_unused_input=on_unused_input,
   1977         output_keys=output_keys,
   1978         name=name,
   1979     )
   1980     with config.change_flags(compute_test_value="off"):
   1981         fn = m.create(defaults)

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/compile/function/types.py:1584, in FunctionMaker.__init__(self, inputs, outputs, mode, accept_inplace, function_builder, profile, on_unused_input, fgraph, output_keys, name)
   1581     need_opt = True
   1582     # make the fgraph (copies the graph, creates NEW INPUT AND
   1583     # OUTPUT VARIABLES)
-> 1584     fgraph, additional_outputs = std_fgraph(inputs, outputs, accept_inplace)
   1585     fgraph.profile = profile
   1586 else:
   1587     # fgraph is already an optimized one

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/compile/function/types.py:188, in std_fgraph(input_specs, output_specs, accept_inplace)
    184         out_idx += 1
    186 orig_outputs = [spec.variable for spec in output_specs] + updates
--> 188 fgraph = FunctionGraph(orig_inputs, orig_outputs, update_mapping=update_mapping)
    190 for node in fgraph.apply_nodes:
    191     if getattr(node.op, "destroy_map", None):

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/graph/fg.py:162, in FunctionGraph.__init__(self, inputs, outputs, features, clone, update_mapping)
    159     self.add_input(in_var, check=False)
    161 for output in outputs:
--> 162     self.import_var(output, reason="init")
    163 for i, output in enumerate(outputs):
    164     self.clients[output].append(("output", i))

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/graph/fg.py:330, in FunctionGraph.import_var(self, var, reason)
    328 # Imports the owners of the variables
    329 if var.owner and var.owner not in self.apply_nodes:
--> 330     self.import_node(var.owner, reason=reason)
    331 elif (
    332     var.owner is None
    333     and not isinstance(var, Constant)
    334     and var not in self.inputs
    335 ):
    336     from theano.graph.null_type import NullType

File ~/Work/Python_lab/anaconda3/envs/pysurf96/lib/python3.10/site-packages/theano/graph/fg.py:383, in FunctionGraph.import_node(self, apply_node, check, reason)
    370             if (
    371                 var.owner is None
    372                 and not isinstance(var, Constant)
    373                 and var not in self.inputs
    374             ):
    375                 # Standard error message
    376                 error_msg = (
    377                     f"Input {int(node.inputs.index(var))} of the graph (indices start "
    378                     f"from 0), used to compute {node}, was not "
   (...)
    381                     "for more information on this error."
    382                 )
--> 383                 raise MissingInputError(error_msg, variable=var)
    385 for node in new_nodes:
    386     assert node not in self.apply_nodes

MissingInputError: Input 0 of the graph (indices start from 0), used to compute sigmoid(v9_interval__), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.

PyMC version information:

pymc3 3.11.5 python 3.10

Context for the issue:

No response

welcome[bot] commented 7 months ago

Welcome Banner] :tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

ricardoV94 commented 5 months ago

We suggest you try to update PyMC. v3 is no longer maintained.

The signature of DensityDist (now CustomDist) has changed quite a lot since v3. Check the docs for code examples: https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.CustomDist.html

If you have more troubles making it work, please open a topic on our discourse: https://discourse.pymc.io/