soravux / scoop

SCOOP (Scalable COncurrent Operations in Python)
https://github.com/soravux/scoop
GNU Lesser General Public License v3.0

Failure trying to distribute work in HPC environment #79

Closed rgov closed 4 years ago

rgov commented 4 years ago

This error is repeated for each node I am trying to distribute work to.

[2019-07-17 14:34:28,536] workerLaunch (127.0.0.1:36053) WARNING Could not successfully launch the remote worker on pn033.
Requested remote group process id, received:
b''
Group id decoding error:
invalid literal for int() with base 10: ''
SSH process stderr:

/.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
/.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
/.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
/.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
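
The stderr above suggests that the dynamic loader on the remote node cannot resolve libpython3.6m.so.1.0 for the interpreter in the virtual environment. One way to confirm this is to ask `ldd` which libraries are unresolved. This is only a sketch: the helper names `filter_missing` and `missing_libs` are made up for illustration, and it assumes `ldd` is available on the node.

```shell
# Hypothetical helper: read ldd output on stdin and print the name of
# every library that ldd reports as "not found".
filter_missing() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical helper: list unresolved shared libraries for one binary.
# Prints nothing if every library resolves (or if ldd is unavailable).
missing_libs() {
    ldd "$1" 2>/dev/null | filter_missing
}

# Usage (run on the remote node, in a non-interactive SSH session):
#   missing_libs venv/bin/python
```

If the library only shows up as missing over SSH and not in an interactive login, the environment module that provides Python is probably not being loaded for the remote worker.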
rgov commented 4 years ago

I needed to write a prolog script to load the Python environment module and activate my virtual environment:

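# Start from a clean module environment, then load the Python module
module purge
module load default-environment
module load python3/3.6.5

# Activate the virtual environment located next to this script
cd "$(dirname "$BASH_SOURCE")"
source venv/bin/activate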
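
For anyone wiring this up: to my knowledge the SCOOP launcher accepts a `--prolog` option that takes an absolute path to a script run when each worker starts, but check `python -m scoop --help` on your install to be sure. A rough sketch of a wrapper follows; `launch_with_prolog`, `prolog.sh`, `hosts.txt`, and `my_script.py` are hypothetical names, not something from this issue.

```shell
# Hypothetical wrapper: resolve the prolog script to an absolute path and
# pass it to the SCOOP launcher, forwarding all remaining arguments as-is.
launch_with_prolog() {
    prolog_abs="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
    shift
    # Assumption: --prolog is supported by the installed SCOOP version.
    python -m scoop --prolog "$prolog_abs" "$@"
}

# Usage:
#   launch_with_prolog ./prolog.sh --hostfile hosts.txt my_script.py
```

Resolving the path first matters because the workers are started over SSH, where a relative path would be interpreted from a different working directory.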
kukuwa commented 4 years ago

I have this problem too. Have you solved it?

rgov commented 4 years ago

The last comment above was my solution.

kukuwa commented 4 years ago

> The last comment above was my solution.

That solution seems to be for a Linux system. Can it also work on Windows?