algorithmsbooks / optimization

Errata for Algorithms for Optimization book

pg. 287 #83

Closed alextzik closed 2 years ago

alextzik commented 2 years ago

The distribution we are determining using MLE is not a conditional but rather a joint over y, with parameters X. It should thus be p(y:X, theta) and not p(y|X, theta), right?

tawheeler commented 2 years ago

I think they end up being the same because P(y,X | theta) = P(y | X, theta) P(X | theta) and P(X | theta) is uniform.

Here is another snippet from Gaussian Processes for Machine Learning: [screenshot: Selection_300]

We don't control X in any way - we just want every X that we do have to be paired with a predictable y.
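The factorization P(y, X | theta) = P(y | X, theta) P(X | theta) can be checked numerically. The sketch below is only an illustration, not code from the book: it assumes a hypothetical one-parameter model y = theta * x + Gaussian noise, with x drawn uniformly from [0, 2], and all function names are made up for this example. Because the P(X | theta) term does not depend on theta, it only adds a constant to the log-likelihood, so the joint and the conditional likelihood should pick the same theta.

```python
import math
import random


def gaussian_logpdf(y, mean, sigma=1.0):
    """Log density of a normal distribution with the given mean and sigma."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (y - mean) ** 2 / (2 * sigma**2)


def conditional_loglik(theta, xs, ys):
    """log P(y | X, theta) for the assumed model y = theta * x + noise."""
    return sum(gaussian_logpdf(y, theta * x) for x, y in zip(xs, ys))


def joint_loglik(theta, xs, ys, x_low=0.0, x_high=2.0):
    """log P(y, X | theta) = log P(y | X, theta) + log P(X | theta).

    Assumption: each x is uniform on [x_low, x_high], so log P(X | theta)
    is a constant that does not involve theta.
    """
    log_px = -math.log(x_high - x_low)
    return conditional_loglik(theta, xs, ys) + len(xs) * log_px


# Synthetic data for the illustration (seeded for repeatability).
random.seed(0)
true_theta = 1.5
xs = [random.uniform(0.0, 2.0) for _ in range(50)]
ys = [true_theta * x + random.gauss(0.0, 1.0) for x in xs]

# Maximize both objectives over the same grid of candidate theta values.
thetas = [i / 100 for i in range(0, 301)]
best_cond = max(thetas, key=lambda t: conditional_loglik(t, xs, ys))
best_joint = max(thetas, key=lambda t: joint_loglik(t, xs, ys))

# The gap between the two objectives is the same for every theta.
gaps = [joint_loglik(t, xs, ys) - conditional_loglik(t, xs, ys) for t in thetas]

print("argmax of conditional likelihood:", best_cond)
print("argmax of joint likelihood:      ", best_joint)
```

The same conclusion holds for any P(X | theta) that does not depend on theta, uniform or not; the uniform density here is just the simplest choice for the sketch.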