biona001 / Knockoffs.jl

Variable Selection with Knockoffs
MIT License
6 stars 0 forks source link

SDP solver a lot slower than Matlab/R equivalent #18

Open biona001 opened 2 years ago

biona001 commented 2 years ago

Currently the SDP solver in Knockoffs.jl is ~2 times slower than knockoff package in R and ~4 times slower than the knockoff package in Matlab. Speed difference is entirely due to underlying solver (Hypatia SDP solver+ Convex.jl in Julia vs R's Rdsdp vs Matlab's SDPT3 solver)

How do we could get more performance?

MWE

using Revise
using Knockoffs
using Test
using LinearAlgebra
using Random
using StatsBase
using Statistics
using Distributions
using ToeplitzMatrices
using RCall
using PositiveFactorizations
using UnicodePlots
using MATLAB
using SCS
using JuMP
using Convex

# simulate data
Random.seed!(2022)
n = 100
p = 400
ρ = 0.4
Sigma = Matrix(SymmetricToeplitz(ρ.^(0:(p-1))))
L = cholesky(Sigma).L
X = randn(n, p) * L # var(X) = L var(N(0, 1)) L' = var(Σ)
true_mu = zeros(p)

Knockoffs.jl

@time knockoff = modelX_gaussian_knockoffs(X, :sdp, true_mu, Sigma) # precompile
s = knockoff.s
  2.686531 seconds (507.05 k allocations: 399.686 MiB, 2.05% gc time)

Compare with Matteo's knockoffs in R

@rput Sigma X true_mu
@time begin
    R"""
    library(knockoff)
    r_s = create.solve_sdp(Sigma)
    X_ko = create.gaussian(X, true_mu, Sigma, method = "sdp", diag_s=r_s)
    """
end
@rget r_s X_ko
  1.208416 seconds (54 allocations: 1.328 KiB)

Compare with Matteo's knockoffs in Matlab

# compare with Matteo's knockoffs in MATLAB
@mput Sigma X true_mu
@time begin
    mat"""
    addpath("/Users/biona001/Documents/MATLAB/knockoffs_matlab")
    matlab_s = knockoffs.create.solveSDP(Sigma)
    """
end
@mget matlab_s
  0.715965 seconds (7 allocations: 192 bytes)

At least the answer agrees:

[s r_s matlab_s]
200×3 Matrix{Float64}:
 1.0       1.0       1.0
 0.971429  0.971427  0.971428
 0.811429  0.811431  0.811429
 0.875429  0.875427  0.875428
 0.849829  0.849829  0.849829
 0.860069  0.860068  0.860068
 0.855973  0.855973  0.855973
 0.857611  0.857611  0.857611
 0.856956  0.856956  0.856956
 0.857218  0.857218  0.857218
 0.857113  0.857113  0.857113
 0.857155  0.857155  0.857155
 0.857138  0.857138  0.857138
 ⋮                   
 0.857155  0.857155  0.857155
 0.857113  0.857113  0.857113
 0.857218  0.857218  0.857218
 0.856956  0.856956  0.856956
 0.857611  0.857611  0.857611
 0.855973  0.855973  0.855973
 0.860069  0.860068  0.860068
 0.849829  0.849829  0.849829
 0.875429  0.875427  0.875428
 0.811429  0.811431  0.811429
 0.971429  0.971427  0.971428
 1.0       1.0       1.0
biona001 commented 2 years ago

Related post on discourse a while ago: https://discourse.julialang.org/t/solving-sdp-problem-is-100x-slower-than-matlab-cvx/79704