Circuitscape / Circuitscape.jl

Algorithms from circuit theory to predict connectivity in heterogeneous landscapes
https://circuitscape.org
MIT License

"KeyError: key ([#],[#]) not found" #260

Closed (slamander closed this issue 1 year ago)

slamander commented 4 years ago

Hello, CS community!

As a continuation of issue #258, I've encountered another CS error that I do not know how to address. The error message is pasted below:

The relevant files (nodes & resistance layers, init file, and included pair list) are here.

```
Error: Error happens in Julia.
On worker 3:
KeyError: key (127, 121) not found
getindex at .\dict.jl:467
f at C:\Users\jbaecher\.julia\packages\Circuitscape\XdJRQ\src\core.jl:159
#89 at C:\Users\jbaecher\.julia\packages\Circuitscape\XdJRQ\src\core.jl:223
#106 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:294
run_work_thunk at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:79
macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:294 [inlined]
#105 at .\task.jl:356
Stacktrace:
 [1] (::Base.var"#770#772")(::Task) at .\asyncmap.jl:178
 [2] foreach(::Base.var"#770#772", ::Array{Any,1}) at .\abstractarray.jl:2009
 [3] maptwice(::Function, ::Channel{Any}, ::Array{Any,1}, ::UnitRange{Int64}) at .\asyncmap.jl:178
 [4] wrap_n_exec_twice(::Channel{Any}, ::Array{Any,1}, ::Distributed.var"#206#209"{Distributed.Worker
```

Thanks in advance for any help or explanation.

Best,

-Alex.

vlandau commented 4 years ago

I'm able to reproduce this.

Summary of observations:

- The problem seems to involve parallel processing combined with include_pairs.
- The job did run properly in serial, but I needed to restart Julia to get it to work.
- For some reason, if you run in parallel and hit this error, running in serial also fails until you restart Julia. Very strange (and concerning) behavior.

@slamander for now, because your resistance grid is so small, I would suggest running in serial (it is extremely fast) by setting parallelize = false in your .ini. Once you update the .ini file, restart Julia and reload Circuitscape. I would also recommend setting solver = cholmod for problems of this size (< 2 million pixels), as it is much faster than cg+amg (about 10x faster; it took around 2 seconds to run with CHOLMOD). A sketch of both settings is below.
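For reference, here is a minimal sketch of those two settings. I'm assuming they sit under the standard [Calculation options] section of a Circuitscape .ini; check your existing file and edit the keys where they already appear rather than pasting this in wholesale:

```ini
[Calculation options]
parallelize = False
solver = cholmod
```

After saving the change, restart Julia, re-load Circuitscape, and rerun the job.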

@ranjanan is probably the person to look at this (I think he's dealt with similar issues in the past); I'm just not familiar with the portions of the code that handle parallel processing or include_pairs. I know @ranjanan is very busy these days, so he may not have time to look at this for a while.

slamander commented 4 years ago

Thanks again, @vlandau.

These are all very helpful notes, and, as you mentioned, since this is a very small job (although it is one of several hundred grids I'm applying this to), I'm more than satisfied with this solution.

Stay well out there,

-Alex.

vlandau commented 4 years ago

I'm leaving this open just so we can properly debug, but glad it's workable for you for now at least!

ViralBShah commented 4 years ago

@vlandau We may want to automatically pick cholmod for small problem sizes, and switch to cg+amg on larger ones. What do you think?
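To make the suggestion concrete, here is a minimal sketch of such a heuristic. The function name, the call signature, and the idea of keying off pixel count are illustrative only (the 2-million-pixel figure comes from the comment above), not existing Circuitscape API:

```julia
# Hypothetical helper: pick the CHOLMOD direct solver for small grids and
# fall back to the iterative cg+amg solver for large ones.
# The threshold is illustrative, not a benchmarked cutoff.
function choose_solver(npixels::Integer; threshold::Integer = 2_000_000)
    return npixels <= threshold ? "cholmod" : "cg+amg"
end

choose_solver(150_000)    # -> "cholmod"
choose_solver(5_000_000)  # -> "cg+amg"
```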

vlandau commented 4 years ago

@ViralBShah I like that idea. I'm planning to do something similar for Omniscape once CHOLMOD gets implemented for advanced mode :) (or, once the PARDISO solver gets released, I might use that instead if its performance is comparable).

ViralBShah commented 4 years ago

Pardiso wasn't giving enough of a performance improvement, from what I could tell in the open PR. Unless you find that it does on certain problems, I would suggest sticking with cholmod.

ranjanan commented 4 years ago

@slamander the include pairs file (CMR_CAF.txt) is missing from your link. Could you please add it to the Google Drive folder you linked?

slamander commented 4 years ago

Hi, @ranjanan. Thanks for the help! Sorry, I made the mistake of altering the files in this folder while troubleshooting upstream aspects of my workflow... If my alterations produce the same error, I'll be sure to post again. If this same error doesn't throw in my updated workflow, I'll try to work backward to recreate it.

In the future, I'll be sure to preserve the code and files for recreating issues when posted here.

ranjanan commented 4 years ago

Alright, no problem. Would you like to reopen this issue if you encounter it again?

slamander commented 4 years ago

Hey, @ranjanan & @ViralBShah. Sorry, I must not have hit 'comment' when I tried to respond earlier. That sounds like a good idea. I'll let you both know what happens after I complete my workflow.

vlandau commented 4 years ago

I believe I still have the files needed to reproduce this. I can post them and reopen the issue if that's okay with @slamander.

slamander commented 4 years ago

@vlandau Fine by me!

vlandau commented 4 years ago

Here are the files, @ranjanan: AZE_ARM_CS_inputs.zip

ranjanan commented 4 years ago

@vlandau did you miss uploading the resistance surface? ERROR: the file "resistance_AZE_ARM.asc" does not exist

vlandau commented 4 years ago

Ah shoot, sorry. I uploaded the .out file instead of the resistance surface. This archive should contain everything: AZE_AR_CS_inputs_v2.zip

ranjanan commented 1 year ago

I don't see this error here anymore, and I can see an output. @vlandau, if you get a minute, could you try CS on the latest version of these files to see whether the output is correct?
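For whoever reruns this, a minimal way to reproduce in a fresh Julia session; the .ini filename below is a placeholder for whichever config file is in the v2 archive:

```julia
using Circuitscape

# Placeholder filename; substitute the .ini shipped in AZE_AR_CS_inputs_v2.zip.
compute("AZE_ARM_CS.ini")
```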

slamander commented 1 year ago

Not sure if this is relevant, but I was running Julia through R and needed to reconfigure my interfacing library to alleviate this issue. I can try to provide more details if necessary.

ranjanan commented 1 year ago

@slamander nice to hear this issue is alleviated. Can you elaborate on the changes? Did you update R, Julia, and JuliaCall?