As of 92fc8d950563448c5faf1fab7a939c1d6c773e9e, our elegant code for multiprocessing doesn't run.
I took the following steps to understand what's going on:
When running the code with the original @distributed for, the error messages were useless (processes exit or are killed for an unknown reason). When testing in AKR, fglo is untouched by multiprocessing (it stays constant at -1). In the testing environment it works if there is no multithreading. Interestingly, recompiling the relevant function in MomentMatching breaks this again, unless a new Julia session is opened.
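One possible contributor to fglo staying at -1 (an assumption, not verified against our code): in a @distributed for loop each worker operates on its own copy of the variables the loop body captures, so writes made on workers never reach the master process. Without a reducer the loop also returns immediately unless it is prefixed with @sync. A minimal sketch with hypothetical names (`results`, `total`) that have nothing to do with MomentMatching:

```julia
using Distributed

# Assumption: two local workers are enough to show the effect.
nprocs() == 1 && addprocs(2)

results = zeros(4)   # hypothetical stand-in for a value like fglo

# Each worker writes into its own copy of `results`.
# @sync makes the master wait until all iterations have finished.
@sync @distributed for i in 1:4
    results[i] = i
end

# The master's copy has not been modified by the workers.
println(results)

# With a reducer, the last value of each iteration is sent back to the
# master and combined there, so the result is actually visible.
total = @distributed (+) for i in 1:4
    i
end
println(total)
```

If this is what happens to fglo, returning the value through a reducer (or through fetch on a Future) would be the direction to try, rather than assigning to it inside the loop.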
The internet told me that @spawnat is better if you are interested in errors, since fetch also fetches error messages from tasks. So I rewrote the code (applied on only one worker) with @spawnat for debugging purposes and commented out @distributed. I received "#X#Y variable is not defined on worker 2" and couldn't trace which variable it refers to.
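To illustrate the point about fetch, here is a small sketch, separate from our code, in which the remote block fails on purpose. The error message and the variable names are made up for the illustration; `RemoteException` and its `pid` and `captured` fields come from the Distributed standard library.

```julia
using Distributed

# Assumption: a single local worker is enough for the demonstration.
nprocs() == 1 && addprocs(1)
pid = first(workers())

# The remote block fails on purpose (hypothetical error message).
f = @spawnat pid error("failure on the worker")

# Nothing is reported on the master until the Future is fetched.
# fetch rethrows the remote error locally, wrapped in a RemoteException.
caught = try
    fetch(f)
    nothing
catch err
    err
end

println(caught isa RemoteException)  # was a remote error caught?
println(caught.pid)                  # id of the worker that failed
println(caught.captured.ex)          # the original exception from the worker
```

Since the error is stored in the Future and only surfaces on fetch, a failing @spawnat block without a fetch may look as if it was never run at all.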
I tried commenting out everything in the @spawnat block that needs any external code. It still doesn't work after recompiling, and the error is similar to the one before. When the fetch line is commented out, it seems the whole @spawnat block is ignored.
module A
using Distributed
procs = addprocs(10)
f = @spawnat procs[1] begin
    println("got here") # never printed
    return 1
end
#fetch(f)
end
"got here" is never printed
When fetch(f) is uncommented, an error says that module A is not available on worker 2.
The problem is that the worker process doesn't know what is defined outside of it (here, module A). I couldn't implement the fix from the related Discourse thread (their situation looks simpler anyway).
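For reference, a sketch of one commonly used pattern for making a module known to the workers: add the workers at top level first, then define the module on every process with @everywhere. This is not necessarily the fix proposed in the thread, and it has not been tried with MomentMatching or AKR; the module `Demo` and its function `work` are hypothetical.

```julia
using Distributed

# Add workers at top level, before defining the code they need to know.
nprocs() == 1 && addprocs(2)

# Hypothetical module, defined on the master and on every worker.
@everywhere module Demo
    work(x) = x + 1
end

# The worker can now resolve Demo, so the remote call succeeds.
f = @spawnat first(workers()) Demo.work(1)
result = fetch(f)
println(result)
```

For code that lives in a package, the analogous step would presumably be `@everywhere using MomentMatching` after `addprocs`, but that is an untested assumption for our setup.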
Conclusion
I think our problem is that the workers would need to know about BOTH MomentMatching and AKR (or whatever application uses it). Hopefully I'm wrong.
In any case, we need to learn more about modules, environments, etc. When we have time, it would be a good idea to write a minimal example of our problem and ask on the Julia forums.
Related thread, which suggested the minimal example above: https://discourse.julialang.org/t/error-running-distributed-code-inside-of-a-module/54283