ermongroup / cs228-notes

Course notes for CS228: Probabilistic Graphical Models.
MIT License

Suggested edit to message passing notation in junction tree notes #198

Open apappu97 opened 4 years ago

apappu97 commented 4 years ago

I believe that, in line with the notation earlier in the notes and the discussion of message passing, the sentence "The factor $$\tau_j(x_j)$$ can be thought of as a message that ..." should instead read "The factor $$\tau_{jk}(x_k)$$ ...", since we've summed over the values of variable $$x_j$$, leaving a new factor that is a function of $$x_k$$, i.e. the message from $$x_j$$ to $$x_k$$.

chrisyeh96 commented 4 years ago

"The factor $$\tau_j(x_j)$$" is actually correct as written. $$\tau_j(x_j)$$ indeed summarizes all of the information from the subtree rooted at $$x_j$$. I have made this notation more consistent by rewriting $$\tau_{jk}(x_k)$$ as $$\tau_k(x_k)$$.

apappu97 commented 4 years ago

Oh okay, thanks for clarifying. And thanks for all of the effort on these notes, they are wonderful!

To follow up: isn't the subscript redundant then? As a reader who has cross-referenced other message passing notes, I find the double-index subscript much clearer, as it makes explicit the source and destination nodes in the message passing algorithm.

Additionally, the single-index subscript notation seems inconsistent with the explicit $$m_{i \to j}$$ notation used later on once the $$\tau$$ notation is replaced by messages, as foreshadowed in Line 21 (copying from your edit):

"The factor $$\tau_j(x_j)$$ can be thought of as a message that $$x_j$$ sends to $$x_k$$ that summarizes all of the information from the subtree rooted at $$x_j$$." ---> this is a bit unclear as it doesn't have explicit $$x_k$$ dependence, and showing the dependence on the 'sender' node $$x_j$$ isn't possible without a double subscript.

Also on line 21: "At each step, we will eliminate $$x_j$$; this will involve computing the factor $$\tau_k(x_k) = \sum_{x_j} \phi(x_k, x_j) \tau_j(x_j)$$, where $$x_k$$ is the parent of $$x_j$$ in the tree." ---> If $$\tau_k(x_k)$$ indeed is all of the information from the subtree rooted at $$x_k$$, the notation as used in this line precludes there existing multiple children of node $$x_k$$. I think this would also be clearer if the original notation of $$\tau_{jk}$$ was kept to make explicit the child-parent/sender-receiver relationship.
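To make the multiple-children point concrete, here is a minimal sketch rather than anything taken from the notes. It assumes a hypothetical toy tree with a root `k` and two leaf children `a` and `b`, all binary, with made-up pairwise potentials. With a double index, each child sends its own message $$\tau_{jk}(x_k)$$, and the information at the root is the product of the incoming messages; a single $$\tau_k(x_k)$$ defined from one child would have no way to distinguish the two.

```python
from itertools import product

# Hypothetical toy tree (an assumption for illustration): root "k" with two
# leaf children "a" and "b"; every variable is binary.
STATES = (0, 1)
children = {"k": ["a", "b"], "a": [], "b": []}

# Made-up pairwise potentials, indexed as phi[(child, parent)][x_child][x_parent].
phi = {
    ("a", "k"): [[1.0, 2.0], [3.0, 1.0]],
    ("b", "k"): [[2.0, 1.0], [1.0, 4.0]],
}


def message(j, k):
    """Return tau_{jk}(x_k) as a list indexed by x_k.

    tau_{jk}(x_k) = sum_{x_j} phi(x_j, x_k) * prod_{i in children(j)} tau_{ij}(x_j)
    """
    incoming = [message(i, j) for i in children[j]]
    out = []
    for xk in STATES:
        total = 0.0
        for xj in STATES:
            term = phi[(j, k)][xj][xk]
            for m in incoming:
                term *= m[xj]
            total += term
        out.append(total)
    return out


# One message per (sender, receiver) pair: the double index keeps them apart.
tau = {(j, "k"): message(j, "k") for j in children["k"]}

# Unnormalized information at the root: product of all incoming messages.
root_belief = [tau[("a", "k")][xk] * tau[("b", "k")][xk] for xk in STATES]


def brute_force(xk):
    """Sum the product of both potentials over all child assignments."""
    return sum(
        phi[("a", "k")][xa][xk] * phi[("b", "k")][xb][xk]
        for xa, xb in product(STATES, repeat=2)
    )
```

Here `tau[("a", "k")]` and `tau[("b", "k")]` are different functions of $$x_k$$, which is exactly what a single-index $$\tau_k(x_k)$$ cannot express without an extra product over children.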

If you agree, I'm happy to make changes. Thanks for looking!

chrisyeh96 commented 4 years ago

Thanks for taking the time to read these notes so carefully! Your point about potentially having multiple children is something I overlooked, so I'm going to re-open this pull request. I will come back and think through this more thoroughly. I think there is significant room to improve this section of the notes. Just a small example: the notes on VE never mention "forming cliques" of a certain size and instead talk about the size of factor scopes. I'll take a deeper pass through this when I have time. Feel free to continue making changes as you see fit and I'll see what makes sense to incorporate!