xtreme-d / docker-slurm-cluster

Simple Slurm cluster in docker.
MIT License

Slurmctld, slurmd and slurmdbd service is FATAL. #2

Open squaresoft2015 opened 2 years ago

squaresoft2015 commented 2 years ago

I installed docker-slurm-cluster on a CentOS 7.9 server with ARM architecture. Everything went fine until I ran docker-compose up -d. After running "docker exec -it axc-headnode bash", I get five warnings:

bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory

I also found, using the "supervisorctl status" command, that the slurmctld service is not running:

munged RUNNING pid 37, uptime 13:37:47
slurmctld FATAL Exited too quickly (process log may have details)
slurmd FATAL Exited too quickly (process log may have details)
slurmdbd FATAL Exited too quickly (process log may have details)
slurmdbd_init__oneshot RUNNING pid 22461, uptime 0:00:04

What's the problem? Thanks in advance.

hackprime commented 2 years ago

Hi @squaresoft2015 ,

bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory

It seems the locale settings on your system aren't set properly. You may need to try this or look for another solution.

FATAL Exited too quickly (process log may have details)

If this error still appears after you resolve the "setlocale" issue, please show me the output of docker logs axc-headnode.
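Before collecting logs, it can help to pull out exactly which programs supervisor marks as FATAL. The following is a minimal sketch, not part of this project: fatal_services is a hypothetical helper name, and it assumes the status output has the program name in the first column and the state in the second, as in the output quoted in this thread.

```shell
# Hypothetical helper (not part of docker-slurm-cluster).
# Reads "supervisorctl status" output on stdin and prints the name of
# every program whose state column is FATAL, one per line.
fatal_services() {
    awk '$2 == "FATAL" { print $1 }'
}
```

It could be fed from the running container with "docker exec axc-headnode supervisorctl status | fatal_services", and the per-program output can then be inspected with something like "supervisorctl tail slurmctld stderr" inside the container.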

m0rfeo commented 2 years ago

@squaresoft2015

Add this to your Dockerfile as the first RUN instruction:

RUN echo LANG=en_US.utf-8 >> /etc/environment && echo LC_ALL=en_US.utf-8 >> /etc/environment
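Whether the en_US.UTF-8 locale is actually installed in the image can be checked separately from these environment settings. The following is a rough sketch under stated assumptions: has_locale is a hypothetical helper name, and it assumes the list of installed locales comes from "locale -a", which typically spells the name as en_US.utf8, so the comparison ignores case and hyphens.

```shell
# Hypothetical helper (not part of docker-slurm-cluster).
# Reads a list of locale names on stdin (for example from "locale -a")
# and succeeds if the requested locale is present.
# Case and hyphens are ignored, so en_US.UTF-8 matches en_US.utf8.
has_locale() {
    hl_want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-')
    tr '[:upper:]' '[:lower:]' | tr -d '-' | grep -xF "$hl_want" >/dev/null
}
```

A possible use is "docker exec axc-headnode locale -a | has_locale en_US.UTF-8 && echo available". If nothing is printed, the locale data is probably missing from the image.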

squaresoft2015 commented 2 years ago

Thanks! I have switched to docker-centos7-slurm. I will come back to this project once my testing on that one is complete.

m0rfeo commented 2 years ago

@squaresoft2015 Keep in mind that Slurm 20.02.7 (the version that works with this project) has been removed from the official Slurm website because it is vulnerable. Try another version or download the source from another site. You can find it in my repository, or alternatively change the original wget in the Dockerfile to:

wget https://193.219.28.154/packages/slurm/slurm-${SLURM_VERSION}.tar.bz2 --no-check-certificate && \