CliMA / EnsembleKalmanProcesses.jl

Implements optimization and approximate uncertainty quantification algorithms: Ensemble Kalman Inversion and other Ensemble Kalman Processes.
https://clima.github.io/EnsembleKalmanProcesses.jl/dev
Apache License 2.0

Covariance Localization using Ledoit Wolf #313

Closed: odunbar closed this issue 1 year ago

odunbar commented 1 year ago

Ledoit and Wolf (2004) have a nice parameter-free covariance shrinkage estimator.

Applied to the ensemble covariance, this could be an additional way to perform localization/sampling error correction in EKP in a principled way with no additional user parameters.
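
For reference, a sketch of the estimator in the notation of the snippet below, with $p$ the output dimension, $J$ the ensemble size, $\Gamma$ the (Bessel-corrected) sample covariance, and $x_j$ the de-meaned samples:

$$
\Gamma^\ast = \gamma \mu I + (1-\gamma)\Gamma, \qquad
\mu = \frac{\mathrm{tr}(\Gamma)}{p}, \quad
\delta = \frac{\|\Gamma - \mu I\|_F^2}{p}, \quad
\beta = \frac{1}{(J-1)^2}\sum_{j=1}^{J} \frac{\|x_j x_j^\top - \Gamma\|_F^2}{p}, \quad
\gamma = \min\left(\frac{\beta}{\delta}, 1\right).
$$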

Implementation

Here is a snippet, which also de-means the samples and applies Bessel's correction. sample_mat is a matrix whose columns are the samples from which the covariance is estimated.

using LinearAlgebra, Statistics

function shrinkage_cov(sample_mat::AA) where {AA <: AbstractMatrix}
    n_out, n_sample = size(sample_mat)

    # de-mean (as we will use the samples directly for the calculation of β)
    sample_mat_zeromean = sample_mat .- mean(sample_mat, dims = 2)

    # Ledoit-Wolf shrinkage towards a scaled identity

    # get the (Bessel-corrected) sample covariance
    Γ = cov(sample_mat_zeromean, dims = 2)

    # estimate the optimal shrinkage intensity
    μ_shrink = 1 / n_out * tr(Γ)                # scaled trace of Γ
    δ_shrink = norm(Γ - μ_shrink * I)^2 / n_out # (scaled) squared Frobenius norm of Γ - μI
    # after de-meaning, the sample covariance carries an n_sample -> n_sample - 1 correction
    β_shrink = sum([norm(c * c' - Γ)^2 / n_out for c in eachcol(sample_mat_zeromean)]) / (n_sample - 1)^2

    γ_shrink = min(β_shrink / δ_shrink, 1) # clipping is typically rare

    # form γμI + (1 - γ)Γ
    Γ .*= (1 - γ_shrink)
    for i in 1:n_out
        Γ[i, i] += γ_shrink * μ_shrink
    end

    @info "Shrinkage scale: $(γ_shrink), (0 = none, 1 = revert to scaled identity)\n shrinkage covariance condition number: $(cond(Γ))"
    return Γ
end
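
A quick usage sketch with synthetic data (the names rng, samples, etc. are illustrative only):

using LinearAlgebra, Random

rng = MersenneTwister(42)
n_out, n_sample = 50, 20              # fewer samples than output dimensions
samples = randn(rng, n_out, n_sample) # columns are the samples

Γ_shrunk = shrinkage_cov(samples)
cond(Γ_shrunk)  # finite, unlike the rank-deficient plain sample covariance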
odunbar commented 1 year ago

Other, more recent shrinkage estimators (e.g. with improved convergence obtained under a Gaussian assumption) could also help?
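
For context, a minimal sketch of one such estimator, the Oracle Approximating Shrinkage (OAS) of Chen et al. (2010), which is derived under a Gaussian assumption. This is not in EKP; oas_cov is a hypothetical helper, and the intensity formula should be checked against the paper before use.

using LinearAlgebra, Statistics

# Hypothetical helper (not part of EKP): OAS shrinkage towards a scaled identity,
# following the Gaussian-derived intensity of Chen et al. (2010).
function oas_cov(sample_mat::AbstractMatrix)
    p, n = size(sample_mat)
    S = cov(sample_mat, dims = 2)    # sample covariance
    μ = tr(S) / p                    # scale of the identity target
    tr_S2 = sum(abs2, S)             # tr(S^2) for symmetric S (squared Frobenius norm)
    num = (1 - 2 / p) * tr_S2 + tr(S)^2
    den = (n + 1 - 2 / p) * (tr_S2 - tr(S)^2 / p)
    ρ = den <= 0 ? 1.0 : min(num / den, 1.0) # shrinkage intensity in [0, 1]
    return (1 - ρ) * S + ρ * μ * I
end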

odunbar commented 1 year ago

Initial testing on branch orad/sec-ledwol: a naive implementation that applies the shrinkage estimator to the stacked samples [u; g] almost always shrinks towards the diagonal of C (i.e. [diag(C^uu), diag(C^GG)]), even for ensemble sizes close to the parameter dimension. In other words, it does not work well for smaller ensembles. A sketch of this naive application is below.
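
For concreteness, a minimal sketch of that naive application (the variable names, sizes, and block extraction are illustrative, not the branch implementation):

using Random

rng = MersenneTwister(1)
n_par, n_obs, n_ens = 10, 30, 12  # illustrative sizes only
u_samples = randn(rng, n_par, n_ens)
g_samples = randn(rng, n_obs, n_ens)

ug = vcat(u_samples, g_samples)   # stacked/augmented samples [u; g]
C = shrinkage_cov(ug)             # shrink the full augmented covariance C

C_uu = C[1:n_par, 1:n_par]           # parameter block
C_uG = C[1:n_par, (n_par + 1):end]   # cross-covariance block (the C^uG discussed below)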

I think that, because of the augmented state in EKP, the cross-covariance C^uG is the important candidate for localization/correction. Shrinkage, however, corrects the full covariance matrix C. It also does not make sense to shrink C^uG towards any kind of diagonal, as there is no reason for u_i and G(u)_i to be related.