JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.

Can't sample from `MultivariateNormal` when covariance matrix is supposedly not positive definite, but NumPy can #1596

Open ForceBru opened 2 years ago

ForceBru commented 2 years ago

TL;DR: given the same mean vector and covariance matrix, NumPy samples from the corresponding multivariate normal just fine, but Distributions.MultivariateNormal fails with PosDefException: matrix is not positive definite; Cholesky factorization failed.

Julia code

# gp_sample.jl
begin
    import Pkg
    Pkg.activate(temp=true)
    Pkg.add(
        Pkg.PackageSpec(name="Distributions", version="0.25.65"),
        io=devnull
    )
    Pkg.status()
end

using Distributions
using LinearAlgebra: norm
using DelimitedFiles: writedlm

const AV = AbstractVector{T} where T;
const AM = AbstractMatrix{T} where T;

# ===== Define kernel to compute covariance =====

"RBF kernel between two vectors"
kernel_rbf(x::AV{<:Real}, y::AV{<:Real})::Real =
    exp(-1/2 * norm(x - y)^2)

"RBF between each pair of vectors"
function kernel_rbf(xs::AM{<:Real}, ys::AM{<:Real})
    A, N = size(xs)
    B, M = size(ys)
    @assert A == B

    ret = zeros(N, M)
    for i in axes(xs, 2), j in axes(ys, 2)
        ret[i, j] = kernel_rbf(xs[:, i], ys[:, j])
    end
    ret
end

# ===== Try to sample =====

# 100 evenly spaced 1-D points as a 1×100 matrix (the GP sample will be 100-dimensional)
X = collect(range(0, 10, 100))[:, :]'
gp_mean = zeros(size(X, 2)) # mean function
gp_covar = kernel_rbf(X, X) # covariance
writedlm("covariance_jl.csv", gp_covar, ',')

gp_distribution = MultivariateNormal(gp_mean, gp_covar)

@info rand(gp_distribution, 1)

Julia error

Apparently, the covariance matrix gp_covar is not positive definite:

ERROR: LoadError: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
 [1] checkpositivedefinite
   @ ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/factorization.jl:18 [inlined]
 [2] cholesky!(A::LinearAlgebra.Hermitian{Float64, Matrix{Float64}}, ::LinearAlgebra.NoPivot; check::Bool)
   @ LinearAlgebra ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:270
 [3] cholesky!(A::Matrix{Float64}, ::LinearAlgebra.NoPivot; check::Bool)
   @ LinearAlgebra ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:302
 [4] #cholesky#164
   @ ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:402 [inlined]
 [5] cholesky (repeats 2 times)
   @ ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:402 [inlined]
 [6] PDMat
   @ ~/.julia/packages/PDMats/ZW0lN/src/pdmat.jl:19 [inlined]
 [7] MvNormal(μ::Vector{Float64}, Σ::Matrix{Float64})
   @ Distributions ~/.julia/packages/Distributions/HAuAd/src/multivariate/mvnormal.jl:201
 [8] top-level scope
   @ gp_sample.jl:45
in expression starting at gp_sample.jl:45
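(As an aside, the same check can be reproduced directly: LinearAlgebra's isposdef attempts a Cholesky factorization, which is what PDMats does, per the stack trace above, when MvNormal wraps the matrix.)

using LinearAlgebra

# isposdef tries a Cholesky factorization under the hood, the same
# operation PDMats runs when constructing the MvNormal above.
isposdef(gp_covar)  # false here, matching the PosDefException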

Equivalent Python code

# gp_sample.py
import numpy as np

def kernel_rbf(x, y):
    return np.exp(-1/2 * np.linalg.norm(x - y)**2)

def kernel_rbf_mat(xs, ys):
    assert xs.shape[0] == ys.shape[0]
    ret = np.zeros((xs.shape[1], ys.shape[1]))

    # Same loop as in Julia
    for i in range(xs.shape[1]):
        for j in range(ys.shape[1]):
            ret[i, j] = kernel_rbf(xs[:, i], ys[:, j])

    return ret

X = np.linspace(0, 10, 100).reshape((1, -1))
gp_mean = np.zeros(X.shape[1])
gp_covar = kernel_rbf_mat(X, X)
np.savetxt("covariance_py.csv", gp_covar, delimiter=',')

sample = np.random.multivariate_normal(
    gp_mean, gp_covar, size=2
)

print(sample)

Problem

How to reproduce:

  1. Run Julia code: julia-1.8 gp_sample.jl. The covariance matrix will be saved to covariance_jl.csv.
  2. Run Python code: python3 gp_sample.py. The covariance matrix will be saved to covariance_py.csv.
  3. Check that the two covariance matrices are the same:
    >>> import numpy as np
    >>> cov_jl = np.loadtxt("covariance_jl.csv", delimiter=',')
    >>> cov_py = np.loadtxt("covariance_py.csv", delimiter=',')
    >>> np.allclose(cov_jl, cov_py)
    True

However, given the same parameters, Python can sample from the multivariate normal no problem, but Distributions.jl can't.

Also, NumPy apparently doesn't care that the covariance matrix gp_covar isn't positive definite (is it actually not positive definite? I'm not sure anymore) and samples anyway, while Distributions.jl treats it as a hard error.

This code is supposed to sample from a Gaussian process (here's the tutorial I'm following: https://peterroelants.github.io/posts/gaussian-process-tutorial/#Sampling-from-prior), so it... should work?
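(For what it's worth, the usual GP workaround of adding a tiny "jitter" term to the diagonal does make the construction succeed; the magnitude 1e-10 below is my own arbitrary choice, not something Distributions.jl prescribes:)

using Distributions
using LinearAlgebra: I

# Jitter: adding a small multiple of the identity pushes the
# near-zero eigenvalues strictly above zero, so the Cholesky
# factorization succeeds. 1e-10 is an arbitrary tuning choice.
gp_distribution = MultivariateNormal(gp_mean, gp_covar + 1e-10 * I)
rand(gp_distribution, 1)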

Simpler MWE

Actually, the kernel stuff isn't really needed to reproduce the bug. The covariance matrix can simply be loaded from covariance_py.csv:

julia> using Distributions, DelimitedFiles
julia> MultivariateNormal(zeros(100), readdlm("covariance_py.csv", ','))
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.
Stacktrace:
 [1]–[7] identical to the stack trace above (checkpositivedefinite → cholesky → PDMat → MvNormal)
 [8] top-level scope
   @ REPL[6]:1


ForceBru commented 2 years ago

I brute-force tested whether the matrix is positive definite using Sylvester's criterion:

using LinearAlgebra

gp_covar = kernel_rbf(X, X)
# Sylvester's criterion: a symmetric matrix is positive definite iff
# all leading principal minors (determinants of the top-left
# submatrices) are strictly positive.
determinants = @views [
    det(gp_covar[begin:i, begin:i])
    for i in axes(gp_covar, 1)
]

This gives me many zeros and some negative determinants that are really close to zero:

100-element Vector{Float64}:
  1.0
  0.010151166063869232
  2.0814591040367652e-6
  1.2865923518518836e-11
  3.1803252426530575e-18
  3.909951852971926e-26
  2.8544529806688935e-35
  1.4394891501197602e-45
  6.387850389715162e-57
 -1.5246070532959875e-68
  3.315279979574806e-80
  2.6577421180681176e-93
 -2.437546507932742e-104
  ⋮
  0.0
 -0.0
 -0.0
 -0.0
 -0.0
  0.0
 -0.0
 -0.0
  0.0
  0.0
 -0.0
  0.0

So yes, the covariance matrix is not positive definite. However, NumPy's multivariate_normal works fine anyway. In fact, its documentation says (emphasis mine):

Covariance matrix of the distribution. It must be symmetric and positive-semidefinite for proper sampling.

Wikipedia also says in the "Equivalent definitions" section:

A random vector ... has a multivariate normal distribution if ...:

There is a k-vector μ and a symmetric, positive semidefinite k×k matrix Σ...

So it looks like the covariance matrix only needs to be positive semidefinite, not strictly positive definite.

I'd say my covariance matrix is in fact positive semidefinite: numbers like -1.52e-68, -2.4375e-104, and -0.0 are essentially zero.
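Checking the eigenvalues directly supports this (a quick sketch):

using LinearAlgebra

# Eigenvalues of the symmetrized covariance: any negative values
# should be negative only at floating-point roundoff scale.
λ = eigvals(Symmetric(gp_covar))
extrema(λ)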

devmotion commented 2 years ago

https://github.com/JuliaStats/Distributions.jl/issues/1602#issuecomment-1209901969 explains how one can use positive semi-definite matrices to some extent (e.g., for rand).
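(For completeness, a sketch of a tolerance-based way to draw samples from a merely positive semidefinite covariance, roughly mirroring NumPy's SVD-based sampler; this is my own sketch, not necessarily the approach from the linked comment:)

using LinearAlgebra

# If Σ = V Λ V', then L = V Λ^(1/2) satisfies L L' = Σ, so
# μ + L z with z ~ N(0, I) is distributed as N(μ, Σ).
# Clamping tiny negative eigenvalues to zero accepts PSD matrices.
E = eigen(Symmetric(gp_covar))
L = E.vectors * Diagonal(sqrt.(max.(E.values, 0)))
sample = gp_mean .+ L * randn(length(gp_mean))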