Circuitscape / Omniscape.jl

Functions to compute omnidirectional landscape connectivity using circuit theory and the Omniscape algorithm.
https://docs.circuitscape.org/Omniscape.jl/stable/
MIT License

Omniscape crashes at end of run #108

Closed (glaroc closed this issue 2 years ago)

glaroc commented 3 years ago

Hi, I'm running Omniscape on a batch of files and, for a very small number of them, I get the error below. Is this a familiar error? I looked at the resistance surface that leads to this error and I can't see anything abnormal. Other files created in exactly the same way run fine. Any clues?


`Progress:  99%|█████████████████████████████████████████████████▋|  ETA: 0:00:26
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ ./task.jl:322 [inlined]
 [2] threading_run(func::Function)
   @ Base.Threads ./threadingconstructs.jl:34
 [3] macro expansion
   @ ./threadingconstructs.jl:93 [inlined]
 [4] run_omniscape(cfg::Dict{String, String}, resistance::Matrix{Union{Missing, Float64}}; reclass_table::Matrix{Union{Missing, Float64}}, source_strength::Matrix{Union{Missing, Float64}}, condition1::Matrix{Union{Missing, Float64}}, condition2::Matrix{Union{Missing, Float64}}, condition1_future::Matrix{Union{Missing, Float64}}, condition2_future::Matrix{Union{Missing, Float64}}, wkt::String, geotransform::Vector{Float64}, write_outputs::Bool)
   @ Omniscape ~/.julia/packages/Omniscape/0AlRt/src/main.jl:268
 [5] run_omniscape(path::String)
   @ Omniscape ~/.julia/packages/Omniscape/0AlRt/src/main.jl:560
 [6] top-level scope
   @ REPL[2]:1

    nested task error: AssertionError: norm(G * v - curr) < if eltype(curr) == Float64
            TOL_DOUBLE
        else
            TOL_SINGLE
        end
    Stacktrace:
     [1] solve_linear_system(G::SparseArrays.SparseMatrixCSC{Float64, Int64}, curr::Vector{Float64}, M::AlgebraicMultigrid.Preconditioner{AlgebraicMultigrid.MultiLevel{AlgebraicMultigrid.Pinv{Float64}, AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, AlgebraicMultigrid.GaussSeidel{AlgebraicMultigrid.SymmetricSweep}, SparseArrays.SparseMatrixCSC{Float64, Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}, LinearAlgebra.Adjoint{Float64, SparseArrays.SparseMatrixCSC{Float64, Int64}}, AlgebraicMultigrid.MultiLevelWorkspace{Vector{Float64}, 1}}})
       @ Circuitscape ~/.julia/packages/Circuitscape/Qr6wW/src/core.jl:616
     [2] macro expansion
       @ ./timing.jl:287 [inlined]
     [3] multiple_solve(s::Circuitscape.AMGSolver, matrix::SparseArrays.SparseMatrixCSC{Float64, Int64}, sources::Vector{Float64}, suppress_info::Bool)
       @ Circuitscape ~/.julia/packages/Circuitscape/Qr6wW/src/raster/advanced.jl:311
     [4] multiple_solver(cfg::Dict{String, String}, solver::Circuitscape.AMGSolver, a::SparseArrays.SparseMatrixCSC{Float64, Int64}, sources::Vector{Float64}, grounds::Vector{Float64}, finitegrounds::Vector{Float64})
       @ Circuitscape ~/.julia/packages/Circuitscape/Qr6wW/src/raster/advanced.jl:291
     [5] calculate_current(conductance::Matrix{Union{Missing, Float64}}, source::Matrix{Union{Missing, Float64}}, ground::Matrix{Float64}, cs_flags::Circuitscape.RasterFlags, cs_cfg::Dict{String, String}, T::DataType)
       @ Omniscape ~/.julia/packages/Omniscape/0AlRt/src/utils.jl:410
     [6] solve_target!(i::Int64, n_targets::Int64, int_arguments::Dict{String, Int64}, targets::Matrix{Float64}, source_strength::Matrix{Union{Missing, Float64}}, resistance::Matrix{Union{Missing, Float64}}, os_flags::Omniscape.OmniscapeFlags, cs_cfg::Dict{String, String}, cs_flags::Circuitscape.RasterFlags, o::Circuitscape.OutputFlags, condition1_present::Matrix{Union{Missing, Float64}}, condition1_future::Matrix{Union{Missing, Float64}}, condition2_present::Matrix{Union{Missing, Float64}}, condition2_future::Matrix{Union{Missing, Float64}}, comparison1::String, comparison2::String, condition1_lower::Float64, condition1_upper::Float64, condition2_lower::Float64, condition2_upper::Float64, correction_array::Matrix{Float64}, cum_currmap::Array{Float64, 3}, fp_cum_currmap::Array{Float64, 3}, precision::DataType)
       @ Omniscape ~/.julia/packages/Omniscape/0AlRt/src/utils.jl:497
     [7] macro expansion
       @ ~/.julia/packages/Omniscape/0AlRt/src/main.jl:273 [inlined]
     [8] (::Omniscape.var"#185#threadsfor_fun#15"{Int64, ProgressMeter.Progress, Circuitscape.RasterFlags, Circuitscape.OutputFlags, Int64, Dict{String, String}, Float64, Float64, Float64, Float64, String, String, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}})(onethread::Bool)
       @ Omniscape ./threadingconstructs.jl:81
     [9] (::Omniscape.var"#185#threadsfor_fun#15"{Int64, ProgressMeter.Progress, Circuitscape.RasterFlags, Circuitscape.OutputFlags, Int64, Dict{String, String}, Float64, Float64, Float64, Float64, String, String, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}})()
       @ Omniscape ./threadingconstructs.jl:48
`

vlandau commented 3 years ago

Hi @glaroc. Updating to the latest versions of Omniscape and Circuitscape should solve this for you. This was due to an issue in Circuitscape that has since been fixed as of 5.8.3.
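
For context on the nested error in the trace above: the failing assertion is a residual check on the solved linear system, `norm(G * v - curr) < tolerance`. Below is a minimal sketch of that kind of check in plain Julia. It assumes a hypothetical tolerance `TOL_SKETCH` (the actual `TOL_DOUBLE` and `TOL_SINGLE` values are defined in Circuitscape and are not shown in this thread), a made-up 3-node circuit, and a direct solve rather than the AMG-preconditioned solver that appears in the trace.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical tolerance for illustration only; not Circuitscape's real value.
const TOL_SKETCH = 1e-9

# Conductance matrix of a 3-node chain (unit conductances) with a unit
# ground conductance on node 1, which makes G nonsingular.
G = sparse([ 2.0 -1.0  0.0;
            -1.0  2.0 -1.0;
             0.0 -1.0  1.0])

# Inject one unit of current at node 3.
curr = [0.0, 0.0, 1.0]

# Direct sparse solve for the node voltages.
v = G \ curr

# Same shape of check as the assertion in the stack trace:
# the solve is accepted only if the residual is below the tolerance.
residual = norm(G * v - curr)
@assert residual < TOL_SKETCH

println("voltages = ", v, ", residual = ", residual)
```

When an iterative solver stops before reaching the requested accuracy, the residual can exceed the tolerance and an assertion of this form fails, which is what the `AssertionError` reports.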

glaroc commented 3 years ago

@vlandau Thank you, I'll try that! Out of curiosity, I'm struggling to install 0.5.3 because of the Pardiso.jl package. Is that a new requirement?

vlandau commented 3 years ago

That is a new requirement of the Circuitscape.jl package, since we added a new solver (but the solver only applies if you compile Julia with MKL anyway). Are you getting a specific error?

glaroc commented 3 years ago

This is on a cluster, so I don't have complete control over the Julia installation.

julia> using Pardiso
ERROR: InitError: could not load library "libmkl_rt"
libmkl_rt.so: cannot open shared object file: No such file or directory

vlandau commented 3 years ago

Hmm, looks like @ranjanan found similar issues when he was implementing the Pardiso solver. This issue might help: https://github.com/JuliaSparse/Pardiso.jl/issues/77

If you're not using the latest version of Julia, updating might help as well.

glaroc commented 3 years ago

I tried deleting my .julia folder and reinstalling everything, and it didn't fix it. I can't really control the Julia version (1.6.1) since this is on an HPC cluster.

vlandau commented 3 years ago

I'm guessing the OS is Linux in that case?

glaroc commented 3 years ago

Yes, on Linux.

vlandau commented 3 years ago

It may be worth opening a new issue in Pardiso.jl, as this is a problem with that package's install. In the meantime, I have reached out to Ranjan to see if he has any insight and will ask him to post here if he does.

glaroc commented 3 years ago

Great, thank you so much! This might also have something to do with the environment variables being set differently on the cluster. I'll keep digging.

ranjanan commented 3 years ago

@glaroc once you reinstall Circuitscape, could you try the minimal reproducer in https://github.com/JuliaSparse/Pardiso.jl/issues/77#issue-939023263? Just so I know it's the same issue. I shall also update the bounds in Circuitscape to be more strict.

ranjanan commented 3 years ago

You can also try that reproducer after updating Circuitscape to v5.8.4 (https://github.com/JuliaRegistries/General/pull/46641#event-5453698427). You shouldn't get that error anymore.
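
To confirm which Circuitscape version the active environment actually resolved before retrying, something like the following sketch may help. It uses `Pkg.dependencies()` from the standard `Pkg` library; the helper name `installed_version` is hypothetical, and the `v"5.8.4"` threshold is simply the release mentioned above.

```julia
using Pkg

# Hypothetical helper: look up a package's resolved version in the
# active environment, or return `nothing` if it is not present.
function installed_version(name::AbstractString)
    for (_, info) in Pkg.dependencies()
        info.name == name && return info.version
    end
    return nothing
end

ver = installed_version("Circuitscape")
if ver === nothing
    println("Circuitscape was not found in the active environment")
elseif ver >= v"5.8.4"
    println("Circuitscape $ver is at or past v5.8.4")
else
    println("Circuitscape $ver predates v5.8.4; Pkg.update(\"Circuitscape\") may help")
end
```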

glaroc commented 3 years ago

@ranjanan With v5.8.4, I still can't load Circuitscape because Pardiso fails to load.

ranjanan commented 3 years ago

Is it the same error as in https://github.com/JuliaSparse/Pardiso.jl/issues/77#issue-939023263?

glaroc commented 3 years ago

https://github.com/Circuitscape/Omniscape.jl/issues/108#issuecomment-941415246

ranjanan commented 3 years ago

Is the environment variable MKLROOT set?

glaroc commented 3 years ago

Yes, it is set.

ranjanan commented 3 years ago

And what is it set to? The cluster's local MKL? If that's the case, then maybe try https://github.com/JuliaSparse/Pardiso.jl#mkl-pardiso? (Unless you already have, in which case, did Pardiso end up pointing there?)

glaroc commented 3 years ago

OK, if I unload the imkl module provided by the cluster before installing Pardiso.jl, everything works!

ranjanan commented 3 years ago

I see! I'm glad to hear that worked. It's still unfortunate that your Circuitscape did not work out of the box with your local MKL. From interacting with your cluster, do you have a suggestion for how I can fix the default behavior?

glaroc commented 3 years ago

I don't know what the issue is. It seems that Pardiso.jl sees the MKLROOT folder at build time, since it decides not to install MKL. However, when the Pardiso library is loaded again later, it does not see the proper MKLROOT folder, as if it were searching for it within the local Julia install.
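
One way to compare what the build step and a later session each see is to run a small check in both contexts. The sketch below uses only the standard `Libdl` library. The helper name `locate_mkl_rt` is hypothetical, and the `lib` and `lib/intel64` subdirectories are an assumed MKL layout that may not match a given cluster.

```julia
using Libdl

# Hypothetical helper: try to resolve libmkl_rt, looking first under
# `mklroot` and then on the default library search path.
# The lib/ and lib/intel64/ subdirectories are an assumed layout.
function locate_mkl_rt(mklroot::AbstractString)
    extra = isempty(mklroot) ? String[] :
        [joinpath(mklroot, "lib"), joinpath(mklroot, "lib", "intel64")]
    # find_library returns "" when no candidate can be loaded.
    return Libdl.find_library("libmkl_rt", extra)
end

mklroot = get(ENV, "MKLROOT", "")
println("MKLROOT = ", isempty(mklroot) ? "(unset)" : mklroot)

found = locate_mkl_rt(mklroot)
println(isempty(found) ? "libmkl_rt could not be loaded from this session" :
                         "libmkl_rt resolved as: $found")
```

Running it once with the cluster's imkl module loaded and once without should show whether `MKLROOT` and the loadable library differ between the two situations.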

vlandau commented 2 years ago

@glaroc would it be okay for me to close this issue?

glaroc commented 2 years ago

@vlandau Yes, absolutely.