Open nipunbatra opened 2 years ago
Hi @nipunbatra, +1 for having an optional `render_params` argument to `render_model`, and possibly printing their constraints if `render_distributions == True`. @fehiepsi might also like this feature in NumPyro.

Regarding returning a dictionary: since `render_model()` already has a well-defined return value and thus should preserve that return value for backwards compatibility, you could instead get structural information from the internals of `render_model()`, namely using `get_model_relations()` and `generate_graph_specification()`:
https://github.com/pyro-ppl/pyro/blob/f4fafc5c7fa0dc5a377ceb06ec59a234bf3ac465/pyro/infer/inspect.py#L494-L495
I'd recommend also checking out the other helper functions in `pyro.infer.inspect`, including `get_dependencies()`.
I think another thing to consider while addressing this issue could be LaTeX support in renders. I believe Graphviz doesn't support LaTeX, but Daft-PGM does.

An older issue, https://github.com/pyro-ppl/pyro/issues/2980, mentioned the possibility of using Graphviz for layout and then plotting with Daft-PGM. That would increase overhead, so perhaps it could be left as an example for advanced users who run:

```python
relations = get_model_relations(model, model_args, model_kwargs)
graph_spec = generate_graph_specification(relations)
# ... hand graph_spec off to Daft-PGM plotting code ...
```
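As a rough sketch of what that advanced-user recipe could look like, the helper below emits Daft-PGM plotting code from a graph spec. The `node_data`/`edge_list` shapes are assumed from `generate_graph_specification()`, and the naive horizontal layout is only a placeholder for the coordinates Graphviz could supply:

```python
def graph_spec_to_daft_code(graph_spec):
    """Emit Daft-PGM plotting code for a graph spec.

    Assumed structure (based on generate_graph_specification()):
    'node_data' maps node name -> {'is_observed': bool, ...},
    'edge_list' is a list of (parent, child) pairs.
    """
    lines = ["import daft", "pgm = daft.PGM()"]
    for i, (name, data) in enumerate(sorted(graph_spec["node_data"].items())):
        observed = data.get("is_observed", False)
        # Naive layout: nodes on a horizontal line. A Graphviz layout pass
        # could supply real (x, y) coordinates instead.
        lines.append(
            f'pgm.add_node("{name}", r"${name}$", {float(i)}, 0.0, observed={observed})'
        )
    for parent, child in graph_spec["edge_list"]:
        lines.append(f'pgm.add_edge("{parent}", "{child}")')
    lines.append("pgm.render()")
    return "\n".join(lines)

spec = {
    "node_data": {"mu": {"is_observed": False}, "obs": {"is_observed": True}},
    "edge_list": [("mu", "obs")],
}
print(graph_spec_to_daft_code(spec))
```

Generating source code (rather than calling Daft directly) keeps the helper dependency-free; the user pastes or `exec`s the result wherever Daft and matplotlib are available.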
Hi @fritzo and team, my name is Karm. I am working with Prof. @nipunbatra and lab colleague @patel-zeel. I made the desired changes in the code and tested them on the following examples.
```python
import torch
import pyro
import pyro.distributions as dist
from pyro.distributions import constraints
from pyro.infer.inspect import get_model_relations, render_model

def model_mle_1(data):
    mu = pyro.param('mu', torch.tensor(0.), constraint=constraints.unit_interval)
    sd = pyro.param('sd', torch.tensor(1.), constraint=constraints.greater_than_eq(0))
    with pyro.plate('plate_data', len(data)):
        pyro.sample('obs', dist.Normal(mu, sd), obs=data)

data = torch.tensor([1., 2., 3.])
get_model_relations(model_mle_1, model_args=(data,))
```

```
{'sample_sample': {'obs': []},
 'sample_param': {'obs': ['sd', 'mu']},
 'sample_dist': {'obs': 'Normal'},
 'param_constraint': {'mu': Interval(lower_bound=0.0, upper_bound=1.0),
                      'sd': GreaterThanEq(lower_bound=0)},
 'plate_sample': {'plate_data': ['obs']},
 'observed': ['obs']}
```

```python
render_model(model_mle_1, model_args=(data,), render_distributions=True)
render_model(model_mle_1, model_args=(data,), render_distributions=True, render_params=True)
```
```python
def model_map_1(data):
    k1 = pyro.param('k1', torch.tensor(1.))
    mu = pyro.sample('mu', dist.Normal(0, k1))
    sd = pyro.sample('sd', dist.LogNormal(mu, k1))
    with pyro.plate('plate_data', len(data)):
        pyro.sample('obs', dist.Normal(mu, sd), obs=data)

data = torch.tensor([1., 2., 3.])
get_model_relations(model_map_1, model_args=(data,))
```

```
{'sample_sample': {'mu': [], 'sd': ['mu'], 'obs': ['sd', 'mu']},
 'sample_param': {'mu': ['k1'], 'sd': ['k1'], 'obs': []},
 'sample_dist': {'mu': 'Normal', 'sd': 'LogNormal', 'obs': 'Normal'},
 'param_constraint': {'k1': Real()},
 'plate_sample': {'plate_data': ['obs']},
 'observed': ['obs']}
```

```python
render_model(model_map_1, model_args=(data,), render_distributions=True)
render_model(model_map_1, model_args=(data,), render_distributions=True, render_params=True)
```
```python
def model_map_2(data):
    t = pyro.param('t', torch.tensor(1.), constraint=constraints.integer)
    a = pyro.sample('a', dist.Bernoulli(t))
    b = pyro.param('b', torch.tensor(2.))
    with pyro.plate('plate_data', len(data)):
        pyro.sample('obs', dist.Beta(a, b), obs=data)

data = torch.tensor([1., 2., 3.])
get_model_relations(model_map_2, model_args=(data,))
```
```python
render_model(model_map_2, model_args=(data,), render_distributions=True)
render_model(model_map_2, model_args=(data,), render_distributions=True, render_params=True)
```
Broadly, I made the following changes in `pyro.infer.inspect`:
I added a key named `sample_param` to the dictionary returned by `get_model_relations()`, mapping each sample site to the params it depends on.
In `get_model_relations()` I inspected the output of `trace.nodes` and found that there is no provenance tracking for params; without provenance tracking we cannot discover which params a sample depends on. The class `TrackProvenance(Messenger)` has a method `_pyro_post_sample()` that assigns provenance to samples, so I added a similar method for params, `_pyro_post_param()`, in the same class. This method is called while getting the trace, `trace = poutine.trace(model).get_trace(*model_args, **model_kwargs)`.
```python
def _pyro_post_param(self, msg):
    if msg["type"] == "param":
        provenance = frozenset({msg["name"]})  # track only direct dependencies
        value = detach_provenance(msg["value"])
        msg["value"] = ProvenanceTensor(value, provenance)
```
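The intuition behind this provenance machinery can be shown with a stand-alone toy (this is not Pyro's actual `ProvenanceTensor`, just an illustration of the idea): each value carries a frozenset of site names, and every operation unions the provenance of its inputs, so any downstream value inherits the names of the params it touched.

```python
class Tracked:
    """Toy value that carries a set of site names it depends on."""

    def __init__(self, value, provenance=frozenset()):
        self.value = value
        self.provenance = provenance

    def __add__(self, other):
        # Operations union the provenance of their inputs, so dependencies
        # propagate through arbitrary downstream computations.
        return Tracked(self.value + other.value,
                       self.provenance | other.provenance)

mu = Tracked(0.0, frozenset({"mu"}))  # like _pyro_post_param tagging 'mu'
sd = Tracked(1.0, frozenset({"sd"}))
loc = mu + sd                         # a downstream value
print(sorted(loc.provenance))         # ['mu', 'sd']
```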
Then, to fill in `sample_param`, I followed a procedure similar to the one used for filling in `sample_sample`.
I added another key named `param_constraint` to store the constraints of params. This result is needed later by `generate_graph_specification()`.
I added an argument `render_params: bool = False` to both `render_model()` and `generate_graph_specification()`. This argument makes showing params in the graph optional.
In `generate_graph_specification()`, the dictionary `node_data` looks like the following for a sample variable:

```python
node_data[rv] = {
    "is_observed": ...,
    "distribution": ...,
}
```
I added an additional key `constraint` to `node_data` for params only. Note that the following changes apply only when `render_params=True`:

```python
node_data[param] = {
    "is_observed": False,
    "distribution": None,
    "constraint": constraint,
}
```
Further, `edge_list` and `plate_groups` are also updated with the params' data.
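For instance (shapes assumed from the `get_model_relations()` outputs shown above), the `sample_param` relation can be turned into extra param-to-sample entries for `edge_list` like this:

```python
# Assumed shape: 'sample_param' maps each sample site to the params it
# depends on, as in the model_mle_1 output above.
relations = {"sample_param": {"obs": ["sd", "mu"]}}

edge_list = []  # existing sample -> sample edges would already be here
for sample, params in relations["sample_param"].items():
    for param in params:
        edge_list.append((param, sample))  # param -> dependent sample

print(edge_list)  # [('sd', 'obs'), ('mu', 'obs')]
```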
In the `render_graph()` method, I kept the shape of param nodes as `plain`, and I added code to show the constraints of params.
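To illustrate what that could produce, here is a small sketch of a `plain`-shaped param node in Graphviz DOT syntax; the exact label format is my assumption, not necessarily what the patch emits:

```python
def param_node_dot(name, constraint):
    # Sketch only: a Graphviz DOT node with shape "plain" and the param's
    # constraint shown on a second line of the label.
    return f'{name} [label="{name}\\n{constraint}", shape=plain]'

print(param_node_dot("mu", "Interval(lower_bound=0.0, upper_bound=1.0)"))
# mu [label="mu\nInterval(lower_bound=0.0, upper_bound=1.0)", shape=plain]
```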
@fritzo, please give your feedback on this. Can I open a PR if the dictionary and graph meet your expectations?
@karm216 this looks great, we'd love a PR contributing this feature! Note there are rigorous tests for `pyro.infer.inspect`, so we'll need to (1) update a bunch of those tests and (2) add some of your examples as new tests.
Hi. From the MLE-MAP tutorial, we have the following models:
MLE model
If we render this, we get something like the following image
MAP model
If we render this, we get something like the following image
Coin toss graphical model images from the popular Maths for ML book
This is from Figure 8.10
We'd expect our MLE model render to look like 8.10 b) and our MAP model to look like 8.10 c)
So, when we have `latent_fairness` as a parameter, it should perhaps just be written as `latent_fairness`, and under the MAP model it should be parameterised by the `Beta` distribution. From the Pyro render of the MLE model, it is not easily visible how the observations are related to `latent_fairness`.

Feature Requests
So, I have two questions/requests:

1. Can `pyro.param`s also show up in renders? The difference in the renders between `pyro.sample` and `pyro.param` would be the associated distribution (and thus hyperparameters) in `pyro.sample`.
2. Can we get back a dictionary describing the graphical model from `render_model`? One can then use that dictionary to create their own graphical models, for example using tikz-bayesnet. The code below reproduces Figure 8.10 from the MML book shown above.
```latex
\documentclass[a4paper]{article}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{tikz}
\usetikzlibrary{bayesnet}
\usepackage{booktabs}
\setlength{\tabcolsep}{12pt}
\begin{document}
\begin{figure}[ht]
  \begin{center}
    \begin{tabular}{@{}cccc@{}}
      \toprule
      $x_N$ explicit & Plate Notation & Hyperparameters on $\mu$ & Factor\\
      \midrule
      & & & \\
      \begin{tikzpicture}
        \node[obs] (x1) {$x_1$};
        \node[const, right=0.5cm of x1] (dots) {$\cdots$};
        \node[obs, right=0.5cm of dots] (xn) {$x_N$};
        \node[latent, above=of dots] (mu) {$\mathbf{\mu}$};
        \edge {mu} {x1,dots,xn} ; %
      \end{tikzpicture} &
      \begin{tikzpicture}
        \node[obs] (xn) {$x_n$};
        \node[latent, above=of xn] (mu) {$\mathbf{\mu}$};
        \plate{}{(xn)}{$n = 1, \cdots, N$};
        \edge {mu} {xn} ; %
      \end{tikzpicture} &
      \begin{tikzpicture}
        \node[obs] (xn) {$x_n$};
        \node[latent, above=of xn] (mu) {$\mathbf{\mu}$};
        \node[const, right=0.5cm of mu] (beta) {$\mathbf{\beta}$};
        \node[const, left=0.5cm of mu] (alpha) {$\mathbf{\alpha}$};
        \plate{}{(xn)}{$n = 1, \cdots, N$};
        \edge {mu} {xn} ; %
        \edge {alpha,beta} {mu} ; %
      \end{tikzpicture} &
      \begin{tikzpicture}
        \node[obs] (xn) {$x_n$};
        \node[latent, above=of xn] (mu) {$\mathbf{\mu}$};
        \factor[above=of xn] {y-f} {left:${Ber}$} {} {} ; %
        \node[const, above=1 of mu, xshift=0.5cm] (beta) {$\mathbf{\beta}$};
        \node[const, above=1 of mu, xshift=-0.5cm] (alpha) {$\mathbf{\alpha}$};
        \factor[above=of mu] {mu-f} {left:${Beta}$} {} {} ; %
        \plate{}{(xn)}{$n = 1, \cdots, N$};
        \edge {mu} {xn} ; %
        \edge {alpha,beta} {mu-f} ; %
        \edge {mu-f}{mu} ; %
      \end{tikzpicture}
    \end{tabular}
  \end{center}
  \caption{Graphical models for a repeated Bernoulli experiment.}
\end{figure}
\end{document}
```