ddbourgin / numpy-ml

Machine learning, in numpy
https://numpy-ml.readthedocs.io/
GNU General Public License v3.0
15.31k stars 3.71k forks source link

[Question]How to understand the implement detail of BayesianLinearRegression ? latex updated #54

Closed eromoe closed 4 years ago

eromoe commented 4 years ago

I am learning the implement of BayesianLinearRegression through numpy-ml project

I copy the code here

class BayesianLinearRegressionKnownVariance:
    def __init__(self, b_mean=0, b_sigma=1, b_V=None, fit_intercept=True):
        r"""
        Bayesian linear regression model with known error variance and
        conjugate Gaussian prior on model parameters.

        Notes
        -----
        Uses a conjugate Gaussian prior on the model coefficients. The
        posterior over model parameters is

        .. math::

            b \mid b_{mean}, \sigma^2, b_V \sim \mathcal{N}(b_{mean}, \sigma^2 b_V)

        Ridge regression is a special case of this model where :math:`b_{mean}`
        = 0, :math:`\sigma` = 1 and `b_V` = I (ie., the prior on `b` is a
        zero-mean, unit covariance Gaussian).

        Parameters
        ----------
        b_mean : :py:class:`ndarray <numpy.ndarray>` of shape `(M,)` or float
            The mean of the Gaussian prior on `b`. If a float, assume `b_mean` is
            ``np.ones(M) * b_mean``. Default is 0.
        b_sigma : float
            A scaling term for covariance of the Gaussian prior on `b`. Default
            is 1.
        b_V : :py:class:`ndarray <numpy.ndarray>` of shape `(N,N)` or `(N,)` or None
            A symmetric positive definite matrix that when multiplied
            element-wise by `b_sigma^2` gives the covariance matrix for the
            Gaussian prior on `b`. If a list, assume ``b_V = diag(b_V)``. If None,
            assume `b_V` is the identity matrix. Default is None.
        fit_intercept : bool
            Whether to fit an intercept term in addition to the coefficients in
            b. If True, the estimates for b will have `M + 1` dimensions, where
            the first dimension corresponds to the intercept. Default is True.
        """
        # this is a placeholder until we know the dimensions of X
        b_V = 1.0 if b_V is None else b_V

        if isinstance(b_V, list):
            b_V = np.array(b_V)

        if isinstance(b_V, np.ndarray):
            if b_V.ndim == 1:
                b_V = np.diag(b_V)
            elif b_V.ndim == 2:
                fstr = "b_V must be symmetric positive definite"
                assert is_symmetric_positive_definite(b_V), fstr

        self.posterior = {}
        self.posterior_predictive = {}

        self.b_V = b_V
        self.b_mean = b_mean
        self.b_sigma = b_sigma
        self.fit_intercept = fit_intercept

    def fit(self, X, y):
        """
        Compute the posterior over model parameters using the data in `X` and
        `y`.

        Parameters
        ----------
        X : :py:class:`ndarray <numpy.ndarray>` of shape `(N, M)`
            A dataset consisting of `N` examples, each of dimension `M`.
        y : :py:class:`ndarray <numpy.ndarray>` of shape `(N, K)`
            The targets for each of the `N` examples in `X`, where each target
            has dimension `K`.
        """
        # convert X to a design matrix if we're fitting an intercept
        if self.fit_intercept:
            X = np.c_[np.ones(X.shape[0]), X]

        N, M = X.shape
        self.X, self.y = X, y

        if is_number(self.b_V):
            self.b_V *= np.eye(M)

        if is_number(self.b_mean):
            self.b_mean *= np.ones(M)

        b_V = self.b_V
        b_mean = self.b_mean
        b_sigma = self.b_sigma

        b_V_inv = np.linalg.inv(b_V)
        L = np.linalg.inv(b_V_inv + X.T @ X)
        R = b_V_inv @ b_mean + X.T @ y

        # (b_v^{-1} + X^{\top}X)^{-1} @ (b_v^{-1}@b_mean + X^{\top}y)

        mu = L @ R
        cov = L * b_sigma ** 2

        # posterior distribution over b conditioned on b_sigma
        self.posterior["b"] = {"dist": "Gaussian", "mu": mu, "cov": cov}

latex doesn't display well in github , so I pasted a picture image


Also can look to this https://stats.stackexchange.com/questions/477618/understand-the-implement-detail-of-bayesianlinearregression-in-python

tanujdhiman commented 4 years ago

Read this article by this you can understand all the details of implement detail of BayesianLinearRegression: https://towardsdatascience.com/bayesian-linear-regression-in-python-using-machine-learning-to-predict-student-grades-part-2-b72059a8ac7e

ddbourgin commented 4 years ago

As a general rule, I'd like to reserve Github issues for identifying errors in the codebase / making feature requests. I'm not able to provide a detailed explanation of the logic here, but you may find the following page helpful: https://www.statlect.com/fundamentals-of-statistics/Bayesian-regression