Closed maremita closed 1 year ago
I'm a bit sloppy when it comes to p(y|x) vs f(x) notation. In the classification case, f(x) often returns the logit vector rather than y itself, in which case p(y|x) = Cat(y | softmax(f(x))). Or maybe f(x) returns the class probabilities, in which case p(y|x) = Cat(y | f(x)). Hope this clarifies things.
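A minimal NumPy sketch of the distinction the reply draws, assuming the first convention (f(x) returns logits, not y; the names `W`, `b`, and the toy dimensions are placeholders standing in for the model parameters θ):

```python
import numpy as np

def f(x, W, b):
    # f(x) returns a logit vector (one unnormalized score per class),
    # not a label y: the model is p(y|x) = Cat(y | softmax(f(x))).
    return W @ x + b

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical toy setup: 3 classes, 2-dimensional input.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 2)), rng.normal(size=3)
x = np.array([1.0, -0.5])

logits = f(x, W, b)      # f(x; theta): real-valued scores, not a label
probs = softmax(logits)  # p(y | x) for each class y; sums to 1
```

Under the second convention, f(x) would return `probs` directly and p(y|x) = Cat(y | f(x)); either way, f(x) parameterizes the distribution over y rather than being y itself.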
Hello,
In section 14.1 Introduction of chapter Predictive models: an overview (2023-01-19 version), it is said:
"... learning to predict outputs $y$ from inputs $x$ using some function $f$ that is estimated from a labeled training set.... We can model our uncertainty about the correct output for a given input using a conditional probability model of the form $p(y | f (x))$."
If the output of the function $f(x)$ is $y$, then $p(y | f (x)) = p(y | y)$.
Moreover, in section 14.1.1, the model is denoted $p(y | x)$, which is the standard notation for the conditional probability known as the posterior probability. However, later in the same section, the model is defined again as $p(y | x) = p(y | f (x; \theta))$, where $\theta$ is the set of parameters of the model.
The conditional probability (the model/the posterior) should instead be denoted $p(y=f (x) | x)$. Also, if $f$ is parameterized by $\theta$, we can write the model/posterior as $p(y=f (x; \theta) | x) = p(y | x; \theta)$.
Please let me know if I overlooked something between the lines.
Thank you.
Kind regards, Amine.