[RLlib] IMPALA crashes if using an LSTM and num_sgd_iter > 1.

sven1977 commented 4 years ago

IMPALA (TensorFlow) crashes with "IndexError: index 138 is out of bounds for axis 0 with size 132" when an RNN is used and num_sgd_iter > 1.

config:

num_sgd_iter: 2
model:
  use_lstm: true

Repro Script:

import unittest

import ray
import ray.rllib.agents.impala as impala
from ray.rllib.utils.framework import try_import_tf
from ray.rllib.utils.test_utils import check_compute_single_action, \
    framework_iterator

tf = try_import_tf()

class TestIMPALA(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        ray.init(local_mode=True)

    @classmethod
    def tearDownClass(cls):
        ray.shutdown()

    def test_impala_compilation(self):
        """Reproduce the IMPALA crash with use_lstm=True and num_sgd_iter=2."""
        config = impala.DEFAULT_CONFIG.copy()
        num_iterations = 1

        for _ in framework_iterator(config, frameworks=("tf", "torch")):
            local_cfg = config.copy()
            #for env in ["Pendulum-v0", "CartPole-v0"]:
            env = "Pendulum-v0"
            print("Env={}".format(env))

            # Test w/ LSTM.
            print("w/ LSTM")
            local_cfg["model"]["use_lstm"] = True
            local_cfg["num_aggregation_workers"] = 2
            local_cfg["num_sgd_iter"] = 2
            trainer = impala.ImpalaTrainer(config=local_cfg, env=env)
            for i in range(num_iterations):
                print(trainer.train())
            check_compute_single_action(trainer, include_state=True)
            trainer.stop()

if __name__ == "__main__":
    import pytest
    import sys
    sys.exit(pytest.main(["-v", __file__]))

Ray version and other system information (Python version, TensorFlow version, OS):

Reproduction (REQUIRED)

[x] I have verified my script runs in a clean environment and reproduces the issue.
[x] I have verified the issue also occurs with the latest wheels.

idorozenberg commented 3 years ago

This also happens in APPO and, I assume, in all async algorithms. Any progress?

iamhatesz commented 3 years ago

The same for the PyTorch version of IMPALA.

RocketRider commented 3 years ago

I ran into this issue again. Is there any progress? Could you at least add a clear error stating that this combination is not supported?
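A clear error of this kind could take the form of an up-front config check. The following is only a minimal sketch: `validate_impala_config` is a hypothetical helper, not part of RLlib, and it assumes a plain config dict using the `num_sgd_iter` and `model.use_lstm` keys from this issue.

```python
def validate_impala_config(config):
    """Hypothetical guard (not an RLlib API): reject the config combination
    reported to crash in this issue before any training starts."""
    # Assumption: a missing key means the feature is off / a single SGD pass.
    num_sgd_iter = config.get("num_sgd_iter", 1)
    use_lstm = config.get("model", {}).get("use_lstm", False)

    if use_lstm and num_sgd_iter > 1:
        raise ValueError(
            "num_sgd_iter > 1 combined with model.use_lstm=True is reported "
            "to crash with an IndexError; set num_sgd_iter=1 or disable the "
            "LSTM."
        )


# The configuration from this issue is rejected with a readable message.
try:
    validate_impala_config({"num_sgd_iter": 2, "model": {"use_lstm": True}})
except ValueError as err:
    print(err)
```

Failing at config time would surface the unsupported combination immediately, instead of an IndexError raised during training.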

ray-project / ray

[RLlib] IMPALA crashes if using an LSTM and num_sgd_iter > 1. #8616