slimgroup / JUDI.jl

Julia Devito inversion.
https://slimgroup.github.io/JUDI.jl
MIT License

parallel problem when running on multi node #108

Closed WYJLCYWHZ closed 2 years ago

WYJLCYWHZ commented 2 years ago

Hi, I am new to parallel computing, and I guess my situation may be instructive for others. I am trying to run the "[JUDI.jl](/examples/compressive_splsrtm/Sigsbee2A/rtm_sigsbee.jl)" example on a cluster using 6 nodes, each with 16 cores. I submit my jobs through PBS, but it seems that parallel computing is not happening. Only one node (node16) is assigned to the parallel computation.

The other nodes are not assigned to the task.

Is there anything I am missing? Thank you very much. (^▽^)

mloubout commented 2 years ago

How are you submitting the job? If you are using PBS, you should be using Julia's PBS ClusterManager; see https://github.com/JuliaParallel/ClusterManagers.jl

If you don't, Julia isn't aware of your PBS setup, only of the node on which the program is executed, so it won't make use of the parallel resources.
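To illustrate why Julia only sees one node, here is a minimal sketch of making the workers explicit. Note that this is not the ClusterManagers route suggested above but a plainer alternative using only the standard `Distributed` library: it reads the node list that PBS writes to the file named by the `PBS_NODEFILE` environment variable and starts workers on those hosts over SSH. The helper name `machine_specs` is hypothetical, and the sketch assumes passwordless SSH between the allocated nodes and that Julia is available at the same path on each of them.

```julia
using Distributed

# Hypothetical helper: turn the lines of a PBS node file into
# (hostname, slots) pairs, keeping the order in which hosts first appear.
# PBS typically lists a host once per slot requested on it.
function machine_specs(lines)
    counts = Dict{String,Int}()
    order = String[]
    for line in lines
        host = String(strip(line))
        isempty(host) && continue
        if !haskey(counts, host)
            push!(order, host)
            counts[host] = 0
        end
        counts[host] += 1
    end
    return [(h, counts[h]) for h in order]
end

# Only try to add workers when running inside a PBS job.
if haskey(ENV, "PBS_NODEFILE")
    specs = machine_specs(readlines(ENV["PBS_NODEFILE"]))
    addprocs(specs)  # assumption: passwordless SSH to every listed host
    println("Number of workers: ", nworkers())
end
```

If something like this runs at the top of the script, before JUDI is loaded with `@everywhere using JUDI`, the workers on the other nodes should become visible to Julia. Whether SSH-launched workers are allowed depends on the cluster's policy, so the ClusterManagers package may still be the better fit.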

WYJLCYWHZ commented 2 years ago

Oh! Sorry, I missed this setting. Thank you for your answer. (^▽^)

WYJLCYWHZ commented 2 years ago

Hi, do you have an example of submitting jobs using ClusterManagers? It looks like I haven't been able to run it successfully.