sbatch: error: No account specified, defaulting to: cse
sbatch: error: No partition specified, defaulting to: compute
sbatch: error: Batch job submission failed: Invalid qos specification
Since this didn't work, I logged in to the cluster and manually ran:
sh ./docker/cluster/submit_job.sh ${CLUSTER_ORBIT_DIR} --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
Job submission succeeded, but the output shows
FATAL: container creation failed: mount hook function failure: mount /var/apptainer/mnt/session/mmfs1->/mmfs1 error: while mounting /var/apptainer/mnt/session/mmfs1: destination /mmfs1 doesn't exist in container
Steps to reproduce
Follow the cluster guide with a clean Orbit install. Submitting the job returned:
sbatch: error: No account specified, defaulting to: cse
sbatch: error: No partition specified, defaulting to: compute
sbatch: error: Batch job submission failed: Invalid qos specification
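The three messages above suggest that the submission carried no explicit account, partition, or QOS, and that the QOS the scheduler ended up with is not valid for the defaulted account. A minimal sketch of passing them explicitly follows. `build_sbatch_args` is a hypothetical helper, `cse` and `compute` are simply the defaults printed in the error, and `normal` is a placeholder QOS name that would need to be replaced with one the site actually grants. `--account`, `--partition`, and `--qos` are standard `sbatch` options.

```shell
#!/bin/sh
# Hypothetical helper: print the sbatch options for an explicit
# account, partition and QOS. It only builds the option string; it does
# not submit anything.
build_sbatch_args() {
    account="$1"
    partition="$2"
    qos="$3"
    printf '%s\n' "--account=${account} --partition=${partition} --qos=${qos}"
}

# Placeholder values: "cse" and "compute" are the defaults named in the
# error output; "normal" is an assumed QOS name, not taken from the report.
SBATCH_ARGS="$(build_sbatch_args cse compute normal)"
echo "sbatch ${SBATCH_ARGS} <job script>"

# To see which account/partition/QOS combinations a user may submit with
# (requires Slurm accounting to be enabled on the cluster):
#   sacctmgr show associations user="$USER" format=account,partition,qos
```

The same three options can also be written as `#SBATCH` lines at the top of the job script instead of on the command line.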
Alternatively, running
sh ./docker/cluster/submit_job.sh ${CLUSTER_ORBIT_DIR} --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
returned:
(run_singularity.py): Called on compute node with arguments --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
WARNING: nv files may not be bound with --writable
WARNING: By using --writable, Apptainer can't create /mmfs1 destination automatically without overlay or underlay
FATAL: container creation failed: mount hook function failure: mount /var/apptainer/mnt/session/mmfs1->/mmfs1 error: while mounting /var/apptainer/mnt/session/mmfs1: destination /mmfs1 doesn't exist in container
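The second warning says that with `--writable` Apptainer cannot create the `/mmfs1` destination by itself, so one possible workaround is to create that directory inside the sandbox before the container starts. The sketch below assumes the container is an unpacked sandbox directory on the host. `ensure_mountpoint` and `SANDBOX_DIR` are hypothetical names and are not part of the Orbit scripts.

```shell
#!/bin/sh
# Hypothetical helper: make sure a bind destination exists inside an
# unpacked Apptainer sandbox directory.
#   $1 = path to the sandbox root on the host
#   $2 = absolute path expected inside the container (e.g. /mmfs1)
ensure_mountpoint() {
    sandbox_root="$1"
    dest="$2"
    if [ ! -d "${sandbox_root}" ]; then
        echo "sandbox not found: ${sandbox_root}" >&2
        return 1
    fi
    mkdir -p "${sandbox_root}${dest}"
}

# Example with a throwaway directory standing in for the real sandbox.
SANDBOX_DIR="$(mktemp -d)"
ensure_mountpoint "${SANDBOX_DIR}" /mmfs1 && echo "created ${SANDBOX_DIR}/mmfs1"
```

Whether this is enough depends on which host paths Apptainer tries to bind on the cluster. Dropping `--writable` or using an overlay, as the warning hints, may be an alternative.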
This looks like an Apptainer and Docker version issue. Can you try Apptainer version 1.2.5-1.el7 and Docker version 24.0.7 on the system where you build the Singularity file?
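As a rough way to check an installation against the suggested versions, the sketch below compares two version strings. `version_le` is a hypothetical helper that relies on the version ordering of `sort -V` (available in GNU coreutils), and the two values are the suggested Apptainer version and the one reported in this issue.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at or below version $2,
# using the version ordering of sort -V.
version_le() {
    [ "$1" = "$2" ] && return 0
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "${lowest}" = "$1" ]
}

# Versions taken from this thread: suggested vs. reported.
SUGGESTED_APPTAINER="1.2.5"
REPORTED_APPTAINER="1.3.0"

if version_le "${REPORTED_APPTAINER}" "${SUGGESTED_APPTAINER}"; then
    echo "Apptainer ${REPORTED_APPTAINER} is not newer than ${SUGGESTED_APPTAINER}"
else
    echo "Apptainer ${REPORTED_APPTAINER} is newer than ${SUGGESTED_APPTAINER}"
fi

# On the build machine the installed versions can be printed with:
#   apptainer --version
#   docker --version
```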
I started with a clean Orbit checkout pulled from this repository, followed the documentation's guide, and downloaded:
Docker version 24.0.2, Docker Compose version v2.18.1, Apptainer version 1.3.0
Everything succeeded until the job submission step, which returned the sbatch errors shown at the top of this report.
System Info
Describe the characteristics of your environment:
ACCEPT_EULA=Y
ISAACSIM_VERSION=2023.1.1
DOCKER_ISAACSIM_PATH=/isaac-sim
DOCKER_USER_HOME=/root
CLUSTER_ISAAC_SIM_CACHE_DIR=/path/to/docker-isaac-sim
CLUSTER_ORBIT_DIR=/path/to/orbit
CLUSTER_LOGIN=...........edu
CLUSTER_SIF_PATH=/path/to/sif_path/
CLUSTER_PYTHON_EXECUTABLE=source/standalone/workflows/rsl_rl/train.py
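Because several of the values above are site-specific paths, a quick check that none of them was left empty can rule out one class of misconfiguration before submitting. This is only a sketch: `check_env_file` is a hypothetical helper, and the temporary file stands in for whichever env file the cluster scripts actually read.

```shell
#!/bin/sh
# Hypothetical helper: report which of the required keys are missing or
# empty in a KEY=VALUE file. Prints the offending keys, one per line, and
# returns non-zero if any were found.
check_env_file() {
    env_file="$1"
    shift
    missing=0
    for key in "$@"; do
        # Match "KEY=" followed by at least one character.
        if ! grep -q "^${key}=." "${env_file}"; then
            echo "${key}"
            missing=1
        fi
    done
    return "${missing}"
}

# Example with a temporary file standing in for the real env file.
ENV_FILE="$(mktemp)"
printf '%s\n' 'CLUSTER_ORBIT_DIR=/path/to/orbit' 'CLUSTER_SIF_PATH=' > "${ENV_FILE}"
check_env_file "${ENV_FILE}" CLUSTER_ORBIT_DIR CLUSTER_SIF_PATH CLUSTER_LOGIN \
    || echo "the keys listed above need a value"
```

The check only confirms that a value is present. It does not verify that the paths exist on the cluster.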