estimate_cov for subset and subset infinity sometimes returns complex number

AnderGray commented 8 months ago

Performing convergence study for MC, subset, and subset infinity.

Sometimes (randomly) estimate_cov is trying to take the sqrt of a complex number. Script for the study

using UncertaintyQuantification, DelimitedFiles, Formatting
using Distributions, Plots

###
#   Define input random variables
###
X1 = RandomVariable(Normal(0, 1), :X1)
X2 = RandomVariable(Normal(0, 1), :X2)

inputs = [X1, X2]

###
#   Define function, and model
###
function y(df)
    return df.X1 .+ df.X2
end

m = Model(y, :y)

###
#   Define limitstate. Failure is limitstate <= 0
###
function limitstate(df)
    return 4 .- reduce(vcat, df.y)
end

###
#   True failure for this problem
###
True_failure_prob = 1 - cdf(Normal(), 4/sqrt(2))

N_samples = 10 .^range(2, 5, 50)

pf_1 = zeros(length(N_samples))
pf_2 = zeros(length(N_samples))
pf_3 = zeros(length(N_samples))

cov_1 = zeros(length(N_samples))
cov_2 = zeros(length(N_samples))
cov_3 = zeros(length(N_samples))

for (i, N) in enumerate(N_samples)
    println(i)
    ###
    #   Try different simulation methods
    ###
    simulation_method_1 = MonteCarlo(Integer(round(N * 3)))
    simulation_method_2 = SubSetSimulation(Integer(round(N)), 0.1, 10, Normal())
    simulation_method_3 = SubSetInfinity(Integer(round(N)), 0.1, 10, 0.5)

    ###
    #   Simulate
    ###
    pf_1[i], cov_1[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_1)
    pf_2[i], cov_2[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_2)
    pf_3[i], cov_3[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_3)

end

plot(N_samples, pf_1)
plot!(N_samples, pf_2)
plot!(N_samples, pf_3)
plot!(N_samples, ones(length(N_samples)) * True_failure_prob,  xaxis=:log10)

plot(N_samples, cov_1)
plot!(N_samples, cov_2)
plot!(N_samples, cov_3, xaxis=:log10)

ERROR: LoadError: DomainError with -0.0004979094436062081:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(f::Symbol, x::Float64)
   @ Base.Math ./math.jl:33
 [2] sqrt
   @ ./math.jl:677 [inlined]
 [3] estimate_cov(Iᵢ::BitMatrix, pf::Float64, n::Int64)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:293
 [4] probability_of_failure(models::Model, performancefunction::typeof(limitstate), inputs::Vector{RandomVariable}, sim::SubSetInfinity)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:130
 [5] top-level scope
   @ ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/demo/Damask/example_convergance/example_convergance.jl:58
 [6] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [7] top-level scope
   @ REPL[19]:1
in expression starting at

FriesischScott commented 8 months ago

I'll investigate.

FriesischScott commented 8 months ago

Next time please just paste the code here instead of uploading a zip file.

FriesischScott commented 7 months ago

Turns out the issue here is using a sample size where N*target leads to less samples in the levels >=2. For example using 205 and 0.1 leads to 189 samples in the subsets because of the way the number of chains and samples per chain are calculated

number_of_seeds = Int64(max(1, ceil(sim.n * sim.target)))
samples_per_seed = Int64(floor(sim.n / number_of_seeds))

here 21 seeds and 9 chains = 189 != 205.

However the sample size passed to the cov estimation is always n.

This is easily fixed by computing n for the estimation from the available samples.

FriesischScott / UncertaintyQuantification.jl

estimate_cov for subset and subset infinity sometimes returns complex number #138