aws-samples / aws-parallelcluster-post-install-scripts

Scripts to customize AWS ParallelCluster
MIT No Attribution

Pyxis runtime path cannot be on /fsx #20

Open verdimrc opened 3 months ago

verdimrc commented 3 months ago

The Pyxis runtime path cannot be on /fsx; otherwise, running a Docker image directly on multiple nodes fails with the errors shown here.

# NOTE: the command below works fine with -N1.
$ srun -N2 --container-image=alpine grep PRETTY /etc/os-release
...
slurmstepd: error: pyxis:     Can't find a SQUASHFS superblock on /fsx/pyxis/1000/385.0.squashfs
slurmstepd: error: pyxis:     Wrong filesystem or filesystem is corrupted!
slurmstepd: error: pyxis:     Failed to read existing filesystem - will not overwrite - ABORTING!
slurmstepd: error: pyxis:     To force Mksquashfs to write to this block device or file use -noappend
...
srun: error: p4de-st-p4de-1: task 0: Exited with exit code 1
...
slurmstepd: error: pyxis:     [ERROR] No such file or directory: /fsx/pyxis/1000/385.0.squashfs
...
srun: error: p4de-st-p4de-2: task 1: Exited with exit code 1
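
One possible reading of the log above (an assumption, not confirmed in this issue) is that both nodes try to create the same squashfs file under the shared /fsx path and collide. A minimal sketch of a pre-flight check a post-install script could run to flag a runtime path that sits on a shared filesystem follows. The helper names (`is_shared_fstype`, `fstype_of`, `check_runtime_path`) and the list of filesystem types are hypothetical, and the sketch relies on GNU coreutils `stat`.

```shell
#!/usr/bin/env bash
# Sketch: warn when a candidate Pyxis runtime path is on a shared filesystem.
# Helper names and the filesystem-type list are illustrative assumptions.

# Return 0 if the given filesystem type name is a shared/network type.
is_shared_fstype() {
    case "$1" in
        lustre|nfs|cifs|smb2|ceph|gpfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the filesystem type that backs a path (GNU coreutils stat).
fstype_of() {
    stat -f -c %T "$1"
}

# Return 0 if the path looks node-local, 1 if it is on a shared
# filesystem, and 2 if the filesystem type cannot be determined.
check_runtime_path() {
    local path="$1" fstype
    fstype="$(fstype_of "$path")" || return 2
    if is_shared_fstype "$fstype"; then
        echo "WARNING: $path is on $fstype; prefer a node-local path" >&2
        return 1
    fi
    echo "OK: $path is on $fstype"
}
```

For example, `check_runtime_path /fsx/pyxis` would be expected to warn on a cluster where /fsx is an FSx for Lustre mount, while a node-local location such as /run/pyxis (the Pyxis default, to my knowledge) should pass.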