Nothing showing in tensorboard, no training, no directory in /tmp/pong

KeirSimmons commented 7 years ago

Ubuntu 16.04

When I run any of the example code in the README, nothing really happens. The /tmp/pong directory is created with a cmd.sh file, but nothing else. Tensorboard also shows nothing (as there are no files in /tmp/pong). I get the following output in the terminal:

Executing the following commands:
mkdir -p /tmp/pong
echo /home/viral/miniconda3/envs/universe-starter-agent/bin/python train.py --num-workers 2 --env-id PongDeterministic-v3 --log-dir /tmp/pong --visualise > /tmp/pong/cmd.sh
kill $( lsof -i:12345 -t ) > /dev/null 2>&1
kill $( lsof -i:12222-12224 -t ) > /dev/null 2>&1
tmux kill-session -t a3c
tmux new-session -s a3c -n ps -d bash
tmux new-window -t a3c -n w-0 bash
tmux new-window -t a3c -n w-1 bash
tmux new-window -t a3c -n tb bash
tmux new-window -t a3c -n htop bash
sleep 1
tmux send-keys -t a3c:ps 'CUDA_VISIBLE_DEVICES= /home/viral/miniconda3/envs/universe-starter-agent/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v3 --num-workers 2 --visualise --job-name ps' Enter
tmux send-keys -t a3c:w-0 'CUDA_VISIBLE_DEVICES= /home/viral/miniconda3/envs/universe-starter-agent/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v3 --num-workers 2 --visualise --job-name worker --task 0 --remotes 1' Enter
tmux send-keys -t a3c:w-1 'CUDA_VISIBLE_DEVICES= /home/viral/miniconda3/envs/universe-starter-agent/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v3 --num-workers 2 --visualise --job-name worker --task 1 --remotes 1' Enter
tmux send-keys -t a3c:tb 'tensorboard --logdir /tmp/pong --port 12345' Enter
tmux send-keys -t a3c:htop htop Enter

Use `tmux attach -t a3c` to watch process output
Use `tmux kill-session -t a3c` to kill the job
Point your browser to http://localhost:12345 to see Tensorboard

Please advise :)

rahulpalamuttam commented 7 years ago

@KeirSimmons it could be that your workers died. When you attach to the a3c tmux session and switch to one of the worker windows, do you see any Python errors?

ktlichkid commented 7 years ago

I've got the same issue.

Executing the following commands:
mkdir -p /tmp/pong
echo /home/lich/anaconda2/envs/universe-starter-agent/bin/python train.py --num-workers 2 --env-id PongDeterministic-v3 --log-dir /tmp/pong > /tmp/pong/cmd.sh
kill $( lsof -i:12345 -t ) > /dev/null 2>&1
kill $( lsof -i:12222-12224 -t ) > /dev/null 2>&1
tmux kill-session -t a3c
tmux new-session -s a3c -n ps -d bash
tmux new-window -t a3c -n w-0 bash
tmux new-window -t a3c -n w-1 bash
tmux new-window -t a3c -n tb bash
tmux new-window -t a3c -n htop bash
sleep 1
tmux send-keys -t a3c:ps 'CUDA_VISIBLE_DEVICES= /home/lich/anaconda2/envs/universe-starter-agent/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v3 --num-workers 2 --job-name ps' Enter
tmux send-keys -t a3c:w-0 'CUDA_VISIBLE_DEVICES= /home/lich/anaconda2/envs/universe-starter-agent/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v3 --num-workers 2 --job-name worker --task 0 --remotes 1' Enter
tmux send-keys -t a3c:w-1 'CUDA_VISIBLE_DEVICES= /home/lich/anaconda2/envs/universe-starter-agent/bin/python worker.py --log-dir /tmp/pong --env-id PongDeterministic-v3 --num-workers 2 --job-name worker --task 1 --remotes 1' Enter
tmux send-keys -t a3c:tb 'tensorboard --logdir /tmp/pong --port 12345' Enter
tmux send-keys -t a3c:htop htop Enter

no server running on /tmp/tmux-1000/default
Use tmux attach -t a3c to watch process output
Use tmux kill-session -t a3c to kill the job
Point your browser to http://localhost:12345 to see Tensorboard

Did you fix the problem? Thank you

rahulpalamuttam commented 7 years ago

Can you show me the output of each of the 4 tmux screens? Run "tmux attach -t a3c" from terminal. Then ctrl-b to toggle between the tmux screens.

In order, the screens will be htop, a worker, a worker, and tensorboard, I believe.

We want to look at the output of the worker screens.

ktlichkid commented 7 years ago

[Screenshot of the tmux screen: 20170619103755]

Thanks for replying. This is the tmux screen. However, nothing happened when I pressed ctrl-b. Seems the workers were not working at all.

Butsuri commented 7 years ago

After 'ctrl-b', press the number corresponding to the screen (shown in the green bar at the bottom: '1' for worker-0, '2' for worker-1, etc.).

yuchen8807 commented 7 years ago

I've got the same issue. Have you found a solution to the problem?

rahulpalamuttam commented 7 years ago

Can you show the output of each of the tmux screens? Run "tmux attach -t a3c" from terminal. Then ctrl-b # to toggle between the tmux screens (where # is 0, 1, 2, 3 etc.).

We want to look at the output of the worker screens.
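
If toggling windows by hand is awkward, the worker output can also be dumped non-interactively. The following is a minimal sketch, not part of universe-starter-agent: it assumes tmux is on the PATH, Python 3.7+, and the session and window names (`a3c`, `w-0`, `w-1`) that appear in the commands train.py prints. The helper names `capture_cmd` and `dump_worker_output` are hypothetical.

```python
import shutil
import subprocess

SESSION = "a3c"            # session name from the commands train.py prints
WINDOWS = ["w-0", "w-1"]   # worker window names from the same output


def capture_cmd(session, window):
    """Build the tmux command that prints a window's visible text to stdout."""
    return ["tmux", "capture-pane", "-p", "-t", f"{session}:{window}"]


def dump_worker_output(session=SESSION, windows=WINDOWS):
    """Return {window: captured text, or an error message} for each window."""
    if shutil.which("tmux") is None:
        return {w: "tmux not found on PATH" for w in windows}
    out = {}
    for w in windows:
        res = subprocess.run(capture_cmd(session, w),
                             capture_output=True, text=True)
        # On failure (e.g. no such session) tmux explains why on stderr.
        out[w] = res.stdout if res.returncode == 0 else res.stderr.strip()
    return out


if __name__ == "__main__":
    for window, text in dump_worker_output().items():
        print(f"===== {window} =====")
        print(text)
```

Pasting the printed text into the issue gives the same information as a screenshot of each worker window, including any Python traceback the worker left behind.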

PDillis commented 7 years ago

If this is still helpful, I was having the same issues until I followed the advice of @rahulpalamuttam and checked the output from the tmux screens. Apparently, there is no longer an environment PongDeterministic-v3. Instead, you must use PongDeterministic-v4, or PongDeterministic-v0, and you are good to go.
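
To see which of these IDs a particular gym install accepts before launching train.py, the candidates can be probed one by one. This is only a sketch: `first_available` is a hypothetical helper, and it assumes that `gym.make` raises an exception for an ID the install does not provide. gym (with Atari support) is needed only for the real check at the bottom.

```python
def first_available(candidates, make):
    """Return the first env ID that `make` accepts, or None if all fail."""
    for env_id in candidates:
        try:
            env = make(env_id)
        except Exception as exc:  # unregistered or removed IDs end up here
            print(f"{env_id}: unavailable ({exc})")
            continue
        # Close the probe environment if it offers a close() method.
        close = getattr(env, "close", None)
        if callable(close):
            close()
        return env_id
    return None


if __name__ == "__main__":
    try:
        import gym
    except ImportError:
        print("gym is not installed in this environment")
    else:
        # IDs from this thread: v3 is what the README used,
        # v4 and v0 are the suggested replacements.
        ids = ["PongDeterministic-v3",
               "PongDeterministic-v4",
               "PongDeterministic-v0"]
        print("usable env id:", first_available(ids, gym.make))
```

Whichever ID it reports can then be passed as `--env-id` in place of `PongDeterministic-v3` in the train.py command.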

EMCP commented 7 years ago

An update to the README for this is already in a PR: https://github.com/openai/universe-starter-agent/pull/105

Nothing showing in tensorboard, no training, no directory in /tmp/pong #100