probml / pml2-book

Probabilistic Machine Learning: Advanced Topics
MIT License

Conditional probability model notation p(y|f(x)) #229

Closed (maremita closed 1 year ago)

maremita commented 1 year ago

Hello,

In Section 14.1 ("Introduction") of the chapter "Predictive models: an overview" (2023-01-19 version), the text says:

"... learning to predict outputs $y$ from inputs $x$ using some function $f$ that is estimated from a labeled training set.... We can model our uncertainty about the correct output for a given input using a conditional probability model of the form $p(y | f (x))$."

If the output of the function $f(x)$ is $y$, then $p(y | f(x)) = p(y | y)$.

Moreover, in Section 14.1.1, the model is denoted by $p(y | x)$, which is the correct notation for the conditional probability known as the posterior probability. However, later in this section, the model is defined again as $p(y | x) = p(y | f(x; \theta))$, where $\theta$ is the set of parameters of the model.

The conditional probability (the model/the posterior) should be denoted as $p(y = f(x) | x)$. Also, if $f$ is parameterized by $\theta$, we can denote the model/posterior as $p(y = f(x; \theta) | x) = p(y | x; \theta)$.

Please let me know if I overlooked something between the lines.

Thank you.

Kind regards, Amine.

murphyk commented 1 year ago

I'm a bit sloppy when it comes to p(y|x) vs f(x) notation. In the classification case, f(x) often returns the logit vector, rather than y itself, in which case p(y|x) = Cat(y | softmax(f(x))). Or maybe f(x) returns the class probabilities, in which case p(y|x) = Cat(y | f(x)). Hope this clarifies things.
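
To make the logit case concrete, here is a minimal sketch in plain Python. It is not code from the book: the linear form of `f`, the names `weights` and `biases`, and the example numbers are all assumptions chosen for illustration. The point is only that `f(x)` returns the parameters of the distribution over y (here, logits), not y itself, so p(y | f(x)) is not p(y | y).

```python
import math


def softmax(logits):
    """Map a list of real-valued logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def f(x, weights, biases):
    """Hypothetical predictor: a linear map returning one logit per class.

    This stands in for any f(x; theta); theta = (weights, biases) here.
    """
    return [
        sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]


def cat_pmf(y, probs):
    """Cat(y | probs): probability of class index y under the given probabilities."""
    return probs[y]


def predictive_pmf(y, x, weights, biases):
    """p(y | x) = Cat(y | softmax(f(x))) for the logit-returning case."""
    return cat_pmf(y, softmax(f(x, weights, biases)))


# Illustrative (made-up) input and parameters for a 3-class problem.
x = [1.0, -2.0]
weights = [[0.5, 0.1], [-0.3, 0.8], [0.0, 0.0]]
biases = [0.0, 0.1, -0.1]

logits = f(x, weights, biases)  # output of f: a vector of logits, not a label
probs = [predictive_pmf(y, x, weights, biases) for y in range(3)]
print("logits:", logits)
print("p(y | x) for y = 0, 1, 2:", probs)
```

If `f` returned class probabilities directly, the `softmax` call would simply be dropped, giving p(y|x) = Cat(y | f(x)).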