giovtorres / slurm-docker-cluster

A Slurm cluster using docker-compose
MIT License

Loading archive data to slurmdb in slurmdbd container #31

Open mdefende opened 1 year ago

mdefende commented 1 year ago

Hi,

First, thanks for creating the Docker images for Slurm; setting it up this way was much easier. My main reason for setting up Slurm is to read in job archive data from my university's cluster to compute some metrics. I followed your instructions and everything ran well. I was able to copy the archive data to the slurmdbd container, which I then accessed using docker exec -it slurmdbd bash. From there, I tried to load the archive data with the command sacctmgr archive load file=/data/slurm_archive. However, this gave me the following error:

error: slurmdbd: Error with request.  
Problem loading archive file: Permission denied

I'm running as root, so I didn't expect any permission errors. Is there some setup step I'm missing, if you know of anything offhand? Thanks.

giovtorres commented 1 year ago

Are you by chance running the container on a host with SELinux enabled? If so, you will want to mount the volume containing the archive data with the :z flag.
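As a rough sketch of that suggestion, the snippet below checks the host's SELinux mode and prints a volume spec with or without the :z suffix. The helper name selinux_volume_suffix and the paths ./archive and /archive are assumptions for illustration only; they are not taken from the repo's docker-compose.yml.

```shell
# Hypothetical helper: map an SELinux mode string to a Docker volume suffix.
selinux_volume_suffix() {
    case "$1" in
        Enforcing|Permissive) printf ':z' ;;  # SELinux active: ask Docker to relabel the content
        *) printf '' ;;                       # Disabled or no SELinux tooling: no suffix
    esac
}

# getenforce is only present where SELinux tooling is installed; fall back to "Disabled".
mode=$(getenforce 2>/dev/null || echo Disabled)

# ./archive and /archive are assumed paths, not taken from the repo.
echo "volume spec: ./archive:/archive$(selinux_volume_suffix "$mode")"
```

The printed spec could then be used as a volumes entry for the slurmdbd service or passed to docker run -v.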

mdefende commented 1 year ago

I do not have any SELinux packages installed on the host, so I don't think that would be the problem. To give a bit more detail, I ran the following steps:

  1. Cloned the repo and built the image using docker build --build-arg SLURM_TAG="slurm-18-08-6-1" -t slurm-docker-cluster:18.08.6 .
  2. Ran IMAGE_TAG=18.08.6 docker-compose up -d
  3. Ran ./register_cluster.sh
  4. Copied the archive file to the container using docker cp slurm_archive slurmdbd:/data/slurm_archive
  5. Accessed the slurmdbd container using docker exec -it slurmdbd bash. I can see that the archive file was successfully transferred to the /data directory in the container.
  6. Tried to load the archive file using sacctmgr archive load file=/data/slurm_archive, which gave the error above.

The host VM is running Ubuntu 22.04, if that helps.
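One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: sacctmgr only sends the load request, and the slurmdbd daemon itself opens the archive file. If the daemon runs as a non-root user (assumed here to be slurm), being root in the docker exec shell would not help, and a file copied in with docker cp may not be readable by that user. The helper name is_world_readable below is hypothetical, and stat -c assumes GNU coreutils.

```shell
# Sketch: check whether a file is readable by "other", i.e. by a daemon user
# that is neither the file's owner nor in its group.
is_world_readable() {
    # stat -c %A prints the symbolic mode, e.g. -rw-r--r-- (GNU coreutils)
    mode=$(stat -c '%A' "$1") || return 2
    # Character 8 of the mode string is the "other read" bit
    [ "$(printf '%s' "$mode" | cut -c8)" = "r" ]
}

# Possible checks inside the container (assumes the daemon user is "slurm"):
#   docker exec slurmdbd ls -l /data/slurm_archive
#   docker exec slurmdbd chown slurm:slurm /data/slurm_archive
```

If ls -l shows the file owned by root with no read access for others, changing the owner as in the last commented line and retrying the load would confirm or rule out this guess. Every directory on the path also has to be traversable by that user.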