pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
https://pytorch.org/rl

[BUG] IndexError Occurs with ObservationNorm in init_stats Function When from_pixels Set to True #1358

Open CFSD opened 1 year ago

CFSD commented 1 year ago

Describe the bug

When calling the init_stats method of torchrl's ObservationNorm transform, an IndexError occurs with the message "Dimension out of range (expected to be in range of [-4, 3], but got -5)". This appears to be caused by the reduction dimension tuple.
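For reference, the same IndexError can be reproduced with plain torch by squeezing a dimension that does not exist on a 4-dimensional tensor (the shapes below are assumptions, not the actual collected data):

import torch

# Hypothetical 4-dim pixel batch [N, C, H, W]; the data gathered by init_stats
# may be shaped differently, but the error message indicates a 4-dim tensor.
loc = torch.zeros(8, 3, 84, 84)
loc.squeeze(-5)  # IndexError: Dimension out of range (expected to be in range of [-4, 3], but got -5)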

To Reproduce

It's difficult to provide exact steps without additional information, but here's an approximation based on the provided traceback (a rough setup sketch follows the steps below):

  1. Initialize an environment using make_parallel_env.
  2. Set up the ObservationNorm using init_stats function with from_pixels set to True.
  3. Start the training with the ppo.py script.
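The following is a rough, untested sketch of such a setup; the environment name, worker count, and transform composition are assumptions standing in for make_parallel_env and the Hydra config, not the actual script:

from torchrl.envs import GymEnv, ParallelEnv, TransformedEnv
from torchrl.envs.transforms import Compose, ObservationNorm, ToTensorImage

# Hypothetical stand-in for make_parallel_env(...) with from_pixels=True
base_env = ParallelEnv(2, lambda: GymEnv("CartPole-v1", from_pixels=True))
obs_norm = ObservationNorm(in_keys=["pixels"])
env = TransformedEnv(base_env, Compose(ToTensorImage(), obs_norm))

# Mirrors the init_stats call from the report; with a non-empty env.batch_size
# the reduce_dim tuple reaches -5, which triggers the IndexError described above.
obs_norm.init_stats(
    3,
    cat_dim=-4,
    reduce_dim=tuple(-i for i in range(1, len(env.batch_size) + 5)),
    keep_dims=(-1, -2, -3),
)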

The relevant script sections are:

# In ppo.py
collector, state_dict = make_collector(cfg, policy=actor)

# In utils.py
init_stats(env, 3, env_cfg.from_pixels)

# In transforms.py (where the IndexError is raised)
loc = loc.squeeze(r)

Expected behavior

The ObservationNorm should be initialized without any error. The init_stats function should correctly calculate the statistics from the samples for normalization, irrespective of the input dimensions.

Screenshots

N/A

System info

Additional context

The warnings from the Hydra library also suggest some outdated usage and configuration, including the absence of _self_ in the defaults list, invalid overriding of hydra/job_logging, and changes in working directory behavior.

Reason and Possible fixes

The issue seems to stem from this specific piece of code in transforms.py:

t.init_stats(
    n_samples_stats,
    cat_dim=-4,
    reduce_dim=tuple(
        -i for i in range(1, len(t.parent.batch_size) + 5)
    ),
    keep_dims=(-1, -2, -3),
)

The reduction dimension tuple includes a value (-5) that falls outside the valid range reported in the error ([-4, 3], i.e. a 4-dimensional tensor). However, without knowing the exact dimensions of t.parent.batch_size, it's hard to propose a precise fix.
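To make the mismatch concrete, here is what the tuple evaluates to under the assumption that the parallel env contributes a single batch dimension (e.g. the number of workers):

# Assumed: t.parent.batch_size has length 1, e.g. torch.Size([num_workers])
batch_size_len = 1
reduce_dim = tuple(-i for i in range(1, batch_size_len + 5))
print(reduce_dim)  # (-1, -2, -3, -4, -5): -5 is invalid for a 4-dim tensor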

Checklist

vmoens commented 1 year ago

Hey!

Thanks for reporting.

I think I see the issue. I feel that the fix should include some check that we're not keeping any dim from the env batch size in the statistics (i.e. we can only normalize over feature dimensions, not batch dimensions). Alternatively, we could normalize over all the non-batch dimensions of the tensordict that is read by init_stats, since by convention all of its dims are non-feature dimensions. I agree that there is some improvement that can be done for the saving and loading of stats in ObservationNorm. I'll propose a fix in the upcoming days and ping you with it!
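For illustration only (not the actual fix), the kind of check described above could look roughly like this, with all names hypothetical:

def validate_stats_dims(reduce_dim, keep_dims, data_ndim, batch_ndim):
    # Hypothetical helper, not part of torchrl: all dims are negative indices
    # counted from the right, and the leading batch_ndim dims are batch dims.
    n_feature_dims = data_ndim - batch_ndim
    for d in reduce_dim:
        if not -data_ndim <= d <= -1:
            raise ValueError(f"reduce_dim {d} is out of range for a {data_ndim}-dim tensor")
    for d in keep_dims:
        if -d > n_feature_dims:
            raise ValueError(f"keep_dims {d} would keep a batch dimension in the statistics")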

CFSD commented 1 year ago

Thank you for your quick and detailed response. I'm looking forward to seeing your proposed fix for this issue.