btx0424 / OmniDrones

https://omnidrones.readthedocs.io/
MIT License

Error when run Formation task #75

Open Bruce-Si opened 6 days ago

Bruce-Si commented 6 days ago

Hello, when I run the Formation task with algo=mappo, I get: mappo error

When I use algo=ppo, I get: ppo error

Any help would be appreciated, thanks!

BIGREDZYC commented 2 days ago

I also encountered the second problem. The cause is that cost_formation_hausdorff is called through vmap while the function itself is already decorated with @torch.vmap, so the inputs are vectorized twice and p ends up with an incorrect shape. Removing the decorator resolved the issue for me; you can try that as well.
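A minimal sketch of the double-vectorization effect described above. It does not use the actual cost_formation_hausdorff; `spread` is a hypothetical stand-in that expects a single formation of shape (N, 3), and the tensor sizes are made up for illustration. Wrapping it in torch.vmap once plays the role of the decorator, and wrapping it a second time plays the role of the extra vmap at the call site.

```python
import torch

seen_shapes = []  # logical shapes that the function body actually observes


def spread(p):
    # Hypothetical stand-in (not the OmniDrones function):
    # p is meant to be one formation of shape (N, 3).
    seen_shapes.append(tuple(p.shape))
    centered = p - p.mean(dim=0, keepdim=True)
    return centered.pow(2).sum()


# Equivalent to decorating `spread` with @torch.vmap.
batched_once = torch.vmap(spread)
# Equivalent to calling vmap(...) on the already-decorated function.
batched_twice = torch.vmap(batched_once)

p = torch.randn(8, 4, 3)  # assumed layout: (num_envs, num_drones, xyz)

out_once = batched_once(p)    # body sees (4, 3): one formation per env
out_twice = batched_twice(p)  # body sees (3,): a single drone position

print(seen_shapes)      # [(4, 3), (3,)]
print(out_once.shape)   # torch.Size([8])
print(out_twice.shape)  # torch.Size([8, 4])
```

With a single vmap the body receives one formation per environment. With two, one batch dimension too many is stripped, so p arrives as a single position and the result has an extra dimension. Keeping only one of the two vmap applications (for example, removing the decorator) restores the intended shapes.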

Bruce-Si commented 2 days ago

Thanks. I also removed the decorator and it worked, but with algo=ppo training did not converge. I then modified mappo_new.py, and mappo worked.

luoming3 commented 3 hours ago

Hi, when I ran python train.py task=Transport/TransportTrack algo=mappo headless=true with share_actor=True in the config, I got the following error:

[2024-07-08 10:56:02,847][root][INFO] - Default parameters:
Mass: [[0.6800000071525574], [0.6800000071525574], [0.6800000071525574], [0.6800000071525574]]
Inertia: [[0.007000000681728125, 0.007000000681728125, 0.011999999172985554], [0.007000000681728125, 0.007000000681728125, 0.011999999172985554], [0.007000000681728125, 0.007000000681728125, 0.011999999172985554], [0.007000000681728125, 0.007000000681728125, 0.011999999172985554]]
Thrust2Weight: [[0.8999204039573669, 0.8999204039573669, 0.8999204039573669, 0.8999204039573669], [0.8999204039573669, 0.8999204039573669, 0.8999204039573669, 0.8999204039573669], [0.8999204039573669, 0.8999204039573669, 0.8999204039573669, 0.8999204039573669], [0.8999204039573669, 0.8999204039573669, 0.8999204039573669, 0.8999204039573669]]
Force2Moment: [[62.499996185302734, 62.499996185302734, 62.499996185302734, 62.499996185302734], [62.499996185302734, 62.499996185302734, 62.499996185302734, 62.499996185302734], [62.499996185302734, 62.499996185302734, 62.499996185302734, 62.499996185302734], [62.499996185302734, 62.499996185302734, 62.499996185302734, 62.499996185302734]]
/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/lazy.py:181: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment.
  warnings.warn('Lazy modules are a new feature under heavy development '
Error executing job with overrides: ['task=Transport/TransportTrack', 'algo=mappo', 'headless=true']
RuntimeError: TensorDictModule failed with operation
    Sequential(
      (0): Sequential(
        (0): Linear(in_features=4, out_features=512, bias=True)
        (1): Mish()
        (2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (3): LazyLinear(in_features=0, out_features=256, bias=True)
        (4): Mish()
        (5): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
      )
      (1): LazyLinear(in_features=0, out_features=4, bias=True)
      (2): Rearrange('... -> ... 1')
    )
    in_keys=[('agents', 'observation_central')]
    out_keys=['state_value'].

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/git/OmniDrones/scripts/train.py", line 103, in main
    policy = ALGOS[cfg.algo.name.lower()](
  File "/home/user/git/OmniDrones/omni_drones/learning/mappo_new.py", line 180, in __init__
    self.critic(fake_input)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/nn/common.py", line 289, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/_contextlib.py", line 126, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/nn/utils.py", line 261, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/nn/common.py", line 1224, in forward
    raise err from RuntimeError(
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/nn/common.py", line 1198, in forward
    raise err
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/nn/common.py", line 1184, in forward
    tensors = self._call_module(tensors, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/tensordict/nn/common.py", line 1141, in _call_module
    out = self.module(*tensors, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1582, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/home/user/miniconda3/envs/sim/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 116, in forward
    return F.linear(input, self.weight, self.bias)
TypeError: Multiple dispatch failed for 'torch.nn.linear'; all __torch_function__ handlers returned NotImplemented:

  - tensor subclass <class 'tensordict._td.TensorDict'>

For more information, try re-running with TORCH_LOGS=not_implemented

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I installed the nightly versions of tensordict and torchrl. How can I modify the code so that mappo runs normally?

Can you help me solve it? It would be greatly appreciated.