xtreme-d / docker-slurm-cluster

Simple Slurm cluster in docker.
MIT License

[SOLVED] sbatch doesn't generate output file #4

m0rfeo closed this issue 2 years ago

m0rfeo commented 2 years ago

I ran sbatch on a basic script:

#!/bin/bash
#SBATCH --job-name=JOB_ID_NAME
#SBATCH --ntasks=4
#SBATCH --output=/test.out

echo a

The job shows as COMPLETED, but I can't find the output file anywhere.

m0rfeo commented 2 years ago

I tried #SBATCH -o ../../../../../../test.out to force the file into the / directory, and I still have the same problem.

m0rfeo commented 2 years ago

Could it be because Docker's output goes to stdout? I don't think so, but it's possible.

m0rfeo commented 2 years ago

Please, I need some help. I've spent a whole day trying to solve this issue.

hackprime commented 2 years ago

Hi @kikegarcia28,

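The problem is that you are trying to write the output file to a path that is local to a single node. Slurm writes the output file on the compute node that runs the batch script, so the file must go to a path shared between all nodes if you want to read it from the head node. The / path of axc-headnode is not shared.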

[root@axc-headnode axc]# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up 1-00:00:00      4   idle axc-compute-[01-04]

As you can see, only axc-compute-[01-04] are under Slurm control as compute nodes.
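
The usual alternative is to point the output at a directory that every node mounts. Here is a minimal sketch, assuming such a directory exists. The /shared path is hypothetical (nothing in this thread names one), so replace it with whatever directory your containers actually share. %j is Slurm's filename pattern for the job ID.

```shell
#!/bin/bash
# Hypothetical shared directory; NOT taken from this thread.
# Override it with: SHARED_DIR=/your/shared/path
SHARED_DIR="${SHARED_DIR:-/shared}"

# Print a batch script whose output file lives under the given directory.
make_job_script() {
    local out_dir="$1"
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=JOB_ID_NAME
#SBATCH --ntasks=4
#SBATCH --output=${out_dir}/test-%j.out

echo a
EOF
}

# Write the job script to the current directory.
make_job_script "$SHARED_DIR" > job.sh

# Then submit it from the head node:
#   sbatch job.sh
```

If the directory really is shared, the output file should be readable from the head node once the job finishes.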

If you want to write the output to /test.out on the head node anyway, you can create a new partition that contains only the head node, or add the head node to the existing "compute" partition.

For example:

PartitionName=Debug Nodes=axc-headnode Default=No MaxTime=00:30:00 State=UP OverSubscribe=Yes

That way, a job that runs in this partition writes /test.out on the head node, where you will see it.
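
To try this, the job also has to be submitted to the new partition, since Debug is not the default. A rough sketch follows, assuming the head node is defined as a node in slurm.conf and runs slurmd (this thread does not confirm either). job.sh is a placeholder script name, and submit_cmd is a hypothetical helper that only prints the command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: build the sbatch command for a partition and script.
# Dry run only: it prints the command and does not submit anything.
submit_cmd() {
    local partition="$1" script="$2"
    printf 'sbatch --partition=%s %s\n' "$partition" "$script"
}

# After adding the PartitionName=Debug line to slurm.conf, reload the config.
# Adding a new node may need a daemon restart instead, depending on the setup:
#   scontrol reconfigure
# Check that the partition is visible:
#   sinfo --partition=Debug

submit_cmd Debug job.sh   # prints: sbatch --partition=Debug job.sh
```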

m0rfeo commented 2 years ago

OMG, how could I not see this before? I'm so thankful for your project and support!

m0rfeo commented 2 years ago

SOLVED, thank you!