oxwhirl / pymarl

Python Multi-Agent Reinforcement Learning framework
Apache License 2.0
1.87k stars 384 forks source link

run experiments using the Docker container FAILED cz $GPU is empty #155

Open wsz7525058 opened 1 year ago

wsz7525058 commented 1 year ago

problem

I try to run with docker bash run.sh $GPU python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z but get failed. Then I debug with run.sh. And I find it is because the $GPU is none then the run.sh split the python command into gpu.

can you guys tell me what value should I set to $GPU? I have 4 rtx3090 on it. shouild the vaule be 1~4?

detail

print in run.sh

$ bash run.sh $GPU python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z
Launching container named 'user_pymarl_GPU_python3_tarx' on GPU 'python3'
cmd: docker
name: user_pymarl_GPU_python3_tarx
gpu: python3
run.sh: line 18: id: ${id -u}: bad substitution
@2: src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z

==========
== CUDA ==
==========

CUDA Version 11.6.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

/pymarl/src/main.py: line 1: import: command not found
/pymarl/src/main.py: line 2: import: command not found
/pymarl/src/main.py: line 3: import: command not found
from: can't read /var/mail/os.path
from: can't read /var/mail/copy
from: can't read /var/mail/sacred
from: can't read /var/mail/sacred.observers
from: can't read /var/mail/sacred.utils
/pymarl/src/main.py: line 9: import: command not found
/pymarl/src/main.py: line 10: import: command not found
from: can't read /var/mail/utils.logging
/pymarl/src/main.py: line 12: import: command not found
from: can't read /var/mail/run
/pymarl/src/main.py: line 16: SETTINGS[CAPTURE_MODE]: command not found
/pymarl/src/main.py: line 17: syntax error near unexpected token `('
/pymarl/src/main.py: line 17: `logger = get_logger()'

$ echo $GPU

$ echo $PWD
/home/user/platform/pymarl

run.sh file debug print

#!/bin/bash
HASH=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 4 | head -n 1)
GPU=$1
name=${USER}_pymarl_GPU_${GPU}_${HASH}

echo "Launching container named '${name}' on GPU '${GPU}'"
# Launches a docker container using our image, and runs the provided command

if hash nvidia-docker 2>/dev/null; then
  cmd=nvidia-docker
else
  cmd=docker
fi

echo "cmd: ${cmd}"
echo "name: ${name}"
echo "gpu: ${GPU}"
echo "id: ${id -u}"
echo "@2: ${@:2}"

NV_GPU="$GPU" ${cmd} run \
    --name $name \
    --user $(id -u):$(id -g) \
    -v `pwd`:/pymarl \
    -it pymarl:v0.2 \
    ${@:2}