Ericonaldo / visual_wholebody

Train a loco-manipulation dog with RL
https://wholebody-b1.github.io/

Visualization When Playing High-Level Teacher Policy #3

Open · Loskiz opened this issue 3 months ago

Loskiz commented 3 months ago

Hello,

When I try to play the trained high-level teacher policy, I can see the process running in the terminal, but the Isaac Gym window is always a black screen. I have trained and loaded the low-level policy, and playing the low-level policy renders normally.

In the b1z1_pickmulti.yaml file under high-level/data/cfg, I toggled enableDebugVis to True, but the window is still black. When I toggled both enableDebugVis and enableCamera to True, it produced the following error:

    Traceback (most recent call last):
      File "play_multistate.py", line 8, in <module>
        trainer.eval()
      File "/home/letianz/Projects/visual_wholebody/third_party/skrl/skrl/trainers/torch/sequential.py", line 148, in eval
        self.single_agent_eval()
      File "/home/letianz/Projects/visual_wholebody/third_party/skrl/skrl/trainers/torch/base.py", line 243, in single_agent_eval
        actions = self.agents.act(states, timestep=timestep, timesteps=self.timesteps)[0]
      File "/home/letianz/Projects/visual_wholebody/third_party/skrl/skrl/agents/torch/ppo/ppo.py", line 217, in act
        actions, log_prob, outputs = self.policy.act({"states": self._state_preprocessor(states)}, role="policy")
      File "/home/letianz/anaconda3/envs/b1z1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/home/letianz/anaconda3/envs/b1z1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
        return forward_call(*args, **kwargs)
      File "/home/letianz/Projects/visual_wholebody/third_party/skrl/skrl/resources/preprocessors/torch/running_standard_scaler.py", line 176, in forward
        return self._compute(x, train, inverse)
      File "/home/letianz/Projects/visual_wholebody/third_party/skrl/skrl/resources/preprocessors/torch/running_standard_scaler.py", line 133, in _compute
        return torch.clamp((x - self.running_mean.float()) / (torch.sqrt(self.running_variance.float()) + self.epsilon),
    TypeError: unsupported operand type(s) for -: 'dict' and 'Tensor'
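From the traceback, my guess is that skrl's RunningStandardScaler is receiving the raw observation dict (as returned when camera observations are enabled) rather than a flat tensor. A minimal sketch that reproduces the same TypeError (the dict keys here are hypothetical):

    import torch

    running_mean = torch.zeros(3)
    obs_tensor = torch.randn(3)
    # Hypothetical dict observation, like what the env may return with enableCamera
    obs_dict = {"state": obs_tensor, "depth": torch.randn(1, 64, 64)}

    print(obs_tensor - running_mean)  # fine: elementwise subtraction
    try:
        _ = obs_dict - running_mean   # what _compute ends up doing when x is a dict
    except TypeError as e:
        print(e)  # unsupported operand type(s) for -: 'dict' and 'Tensor'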

I am wondering if the black screen when playing the high-level teacher policy is an expected behavior, and if so, is there a way to enable the visualization? Thank you.

Chu4nQ1n commented 3 months ago

Hi Loskiz, I succeeded in playing the trained high-level teacher policy. Did you add the same arguments to the python play_multistate.py command as in training? Btw, how is the performance of your trained teacher policy?
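One quick way to check for a flag mismatch is to inspect the checkpoint itself, since it records the observation width the policy was trained with. A rough sketch, assuming skrl's usual checkpoint layout of per-module state dicts (the path is a placeholder):

    import torch

    ckpt = torch.load("/path/to/checkpoint.pt", map_location="cpu")
    policy_sd = ckpt["policy"] if "policy" in ckpt else ckpt
    # The first 2-D weight is the input layer; its second dimension is the
    # observation width the policy expects.
    for name, tensor in policy_sd.items():
        if getattr(tensor, "ndim", 0) == 2:
            print(name, tuple(tensor.shape))
            break

If that width does not match the observation size your play-time flags produce, the arguments differ from training.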

Loskiz commented 3 months ago

This is how I trained the policy:

    python train_multistate.py --rl_device "cuda:0" --sim_device "cuda:0" --timesteps 500 --headless --task B1Z1PickMulti --experiment_dir b1-pick-multi-teacher-test --wandb --wandb_project "b1-pick-multi-teacher" --wandb_name "some descriptions" --roboinfo --observe_gait_commands --small_value_set_zero --rand_control --stop_pick

This is how I played the trained policy:

    python play_multistate.py --task B1Z1PickMulti --checkpoint "/home/letianz/Projects/visual_wholebody/high-level/b1-pick-multi-teacher-test/some descriptions/checkpoints/agent_500.pt" --roboinfo --observe_gait_commands --small_value_set_zero --rand_control --stop_pick

I have not trained the policy for that many timesteps yet, since I wanted to test the visualization first.

Chu4nQ1n commented 3 months ago

Your commands look fine, so maybe something is wrong in your environment. I set up the environment following the instructions on two Ubuntu 20.04 PCs and both work fine. Maybe you can wait for a reply from the repo developer or try reinstalling the environment.

Loskiz commented 3 months ago

@Chu4nQ1n out of curiosity, did you have to toggle enableDebugVis and enableCamera to True to enable the visualization? Also, I am using a dual-GPU system, but I'm not sure whether that relates to the issue.

Chu4nQ1n commented 3 months ago

> @Chu4nQ1n out of curiosity, did you have to toggle enableDebugVis and enableCamera to True to enable the visualization? Also, I am using a dual-GPU system, but I'm not sure whether that relates to the issue.

No, I didn't change anything. Regarding the GPU, I'm working on a single RTX 4070.

hatimwen commented 2 months ago

> Hi Loskiz, I succeeded in playing the trained high-level teacher policy. Did you add the same arguments to the python play_multistate.py command as in training? Btw, how is the performance of your trained teacher policy?

Hi @Chu4nQ1n, I'm trying to reproduce the results of the teacher policy. How has your trained teacher policy performed? Despite numerous attempts, mine still performs poorly. 😢

Chu4nQ1n commented 2 months ago

> > Hi Loskiz, I succeeded in playing the trained high-level teacher policy. Did you add the same arguments to the python play_multistate.py command as in training? Btw, how is the performance of your trained teacher policy?
>
> Hi @Chu4nQ1n, I'm trying to reproduce the results of the teacher policy. How has your trained teacher policy performed? Despite numerous attempts, mine still performs poorly. 😢

Same here. I also tried removing some objects during training; in that case the success rate increased steadily, but even after 60,000 timesteps it still had not converged to the optimum.

Ericonaldo commented 2 months ago

Hi all, for the display problem, have you checked whether you can successfully run and render Isaac Gym's official examples?

For the high-level learning problem, I just did a quick check: the high-level part is clean, and the problem comes from a recent change in the low-level part, as I can learn well with my previous low-level model. I am working on it and will let you know as soon as I fix it.

Loskiz commented 2 months ago

> Hi all, for the display problem, have you checked whether you can successfully run and render Isaac Gym's official examples?

Yes. I have run several of Isaac Gym's examples with visualization, and the low-level visualization too, with no problem. Not being able to visualize the high-level policy seems to be the only issue I am having right now.

I am using a setup with dual RTX 4090s and a Threadripper CPU. I did run into some problems with the way the dual GPUs interact with the low-level training code. Do you think that might be the issue here?

hatimwen commented 2 months ago

> > Hi all, for the display problem, have you checked whether you can successfully run and render Isaac Gym's official examples?
>
> Yes. I have run several of Isaac Gym's examples with visualization, and the low-level visualization too, with no problem. Not being able to visualize the high-level policy seems to be the only issue I am having right now.
>
> I am using a setup with dual RTX 4090s and a Threadripper CPU. I did run into some problems with the way the dual GPUs interact with the low-level training code. Do you think that might be the issue here?

Hi @Loskiz, have you tried setting CUDA_VISIBLE_DEVICES=0 when you use a multi-GPU device? My multi-GPU device works well when I play the trained high-level policy using the following command:

    cd high-level

    CUDA_VISIBLE_DEVICES=0 \
    python play_multistate.py \
        --checkpoint /path/to/agent_60000.pt \
        --rl_device "cuda:0" \
        --sim_device "cuda:0" \
        --timesteps 60000 \
        --task B1Z1PickMulti \
        --experiment_dir b1-pick-multi-teacher \
        --roboinfo \
        --observe_gait_commands \
        --small_value_set_zero \
        --rand_control \
        --stop_pick \
        --graphics_device_id 0

    cd ..
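To confirm the masking takes effect, a quick check with plain PyTorch (run it under the same CUDA_VISIBLE_DEVICES setting):

    import torch

    # With CUDA_VISIBLE_DEVICES=0, only the first physical GPU is visible,
    # so device_count() should print 1 and cuda:0 maps to that GPU.
    print(torch.cuda.device_count())
    print(torch.cuda.get_device_name(0))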

Regarding the line of code responsible for creating the simulation environment, https://github.com/Ericonaldo/visual_wholebody/blob/7243de41b2a8d5659855d2178b519e232a27bf23/high-level/envs/b1z1_base.py#L407 I suggest a slight modification that could resolve a bug encountered when training the high-level policy on multi-GPU devices:

    # suggested change: create the sim with self.graphics_device_id so the
    # viewer uses the intended GPU on multi-GPU machines
    self.sim = super().create_sim(self.sim_id, self.graphics_device_id, self.physics_engine, self.sim_params)
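For reference, this mirrors the pattern in Isaac Gym's own examples, where the graphics device must be a GPU the display can actually reach (or -1 when headless); a minimal standalone sketch:

    from isaacgym import gymapi

    gym = gymapi.acquire_gym()
    sim_params = gymapi.SimParams()

    headless = False
    compute_device_id = 0
    # -1 disables rendering entirely (headless); otherwise the graphics
    # device must be a GPU the display can reach, which is a common source
    # of black viewer windows on multi-GPU machines.
    graphics_device_id = -1 if headless else 0

    sim = gym.create_sim(compute_device_id, graphics_device_id,
                         gymapi.SIM_PHYSX, sim_params)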

Btw, have you succeeded in reproducing the results of the high-level policy yet?

Ericonaldo commented 2 months ago

@hatimwen Refer to my answer in this issue.