Closed (forhonourlx closed this 1 year ago)
Hi @sven1977, according to https://github.com/ray-project/ray/issues/8616, IMPALA crashes when using an LSTM with num_sgd_iter > 1. I see the same thing when using IMPALA with attention_net: it crashes if num_sgd_iter > 1, while PPO does not. initial_states seem to be chopped into [BATCH_SIZE / MAX_SEQ_LEN, ...]; how does num_sgd_iter affect initial_states?
(pid=7052) 2020-09-06 05:09:19,537 INFO rnn_sequencing.py:107 -- Padded input for RNN:
(pid=7052)
(pid=7052) { 'features': [ np.ndarray((1000,), dtype=float64, min=0.0, max=1.0, mean=0.542),
(pid=7052) np.ndarray((1000,), dtype=float64, min=-1.0, max=1.0, mean=0.028),
(pid=7052) np.ndarray((1000, 2), dtype=float64, min=0.0, max=1.0, mean=0.5),
(pid=7052) np.ndarray((1000, 2), dtype=float64, min=-0.206, max=0.521, mean=0.112),
(pid=7052) np.ndarray((1000,), dtype=float64, min=-0.898, max=-0.523, mean=-0.684),
(pid=7052) np.ndarray((1000,), dtype=float64, min=0.0, max=1.0, mean=0.554),
(pid=7052) np.ndarray((1000,), dtype=float64, min=0.0, max=0.0, mean=0.0),
(pid=7052) np.ndarray((1000, 2), dtype=float64, min=0.0, max=1.0, mean=0.5),
(pid=7052) np.ndarray((1000,), dtype=float64, min=0.0, max=1.0, mean=0.542),
(pid=7052) np.ndarray((1000,), dtype=float64, min=-1.0, max=1.0, mean=0.028),
(pid=7052) np.ndarray((1000,), dtype=float64, min=-1.0, max=1.0, mean=0.026)],
(pid=7052) 'initial_states': [ np.ndarray((20, 50, 2), dtype=float32, min=0.0, max=0.0, mean=0.0),
(pid=7052) np.ndarray((20, 50, 64), dtype=float32, min=0.0, max=0.0, mean=0.0)],
(pid=7052) 'max_seq_len': 50,
(pid=7052) 'seq_lens': np.ndarray((20,), dtype=int32, min=50.0, max=50.0, mean=50.0)}
(pid=7052)
(pid=7052) Exception in thread Thread-1:
(pid=7052) Traceback (most recent call last):
(pid=7052) File "C:\vnstudio\lib\threading.py", line 917, in _bootstrap_inner
(pid=7052) self.run()
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\execution\learner_thread.py", line 65, in run
(pid=7052) self.step()
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\execution\learner_thread.py", line 72, in step
(pid=7052) fetches = self.local_worker.learn_on_batch(batch)
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 757, in learn_on_batch
(pid=7052) .learn_on_batch(samples)
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\policy\tf_policy.py", line 385, in learn_on_batch
(pid=7052) fetches = self._build_learn_on_batch(builder, postprocessed_batch)
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\policy\tf_policy.py", line 759, in _build_learn_on_batch
(pid=7052) self._get_loss_inputs_dict(postprocessed_batch, shuffle=False))
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\policy\tf_policy.py", line 796, in _get_loss_inputs_dict
(pid=7052) feature_keys=[k for k, v in self._loss_inputs])
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\policy\rnn_sequencing.py", line 94, in pad_batch_to_sequences_of_same_size
(pid=7052) shuffle=shuffle)
(pid=7052) File "C:\vnstudio\lib\site-packages\ray\rllib\policy\rnn_sequencing.py", line 254, in chop_into_sequences
(pid=7052) s_init.append(s[i])
(pid=7052) IndexError: index 50 is out of bounds for axis 0 with size 20
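The shapes in the log can be checked with a little arithmetic: 1000 timesteps at max_seq_len 50 give 20 sequences, which matches the leading dimension of initial_states. The NumPy sketch below is only one possible reading of the traceback, not confirmed RLlib behaviour. It assumes the state array already holds one row per sequence and is then indexed with a timestep offset. The helper state_at is hypothetical and is not part of RLlib.

```python
import numpy as np

batch_size = 1000   # timesteps in the padded batch (from the log)
max_seq_len = 50    # from the log

# Number of sequences after chopping: BATCH_SIZE / MAX_SEQ_LEN
num_seqs = batch_size // max_seq_len

# One row per sequence, matching the logged shape (20, 50, 2)
initial_states = np.zeros((num_seqs, max_seq_len, 2), dtype=np.float32)


def state_at(states, i):
    """Hypothetical stand-in for the s[i] lookup seen in the traceback."""
    return states[i]


# Indexing by sequence number (0..num_seqs-1) is fine.
ok = state_at(initial_states, num_seqs - 1)

# Assumption: a timestep offset (start of the second sequence) is used
# as the row index, although axis 0 only has num_seqs entries.
try:
    state_at(initial_states, max_seq_len)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(num_seqs, ok.shape)
print(error_message)
```

If this reading is right, the state arrays would have been chopped once already, and a later pass would be treating them as if they were still per-timestep.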
Hi, Ray team! I am running examples/attention_net.py. @sven1977, @ericl I ran python /examples/attention_net.py. But,
Hello @ericl @sven1977, I read error.txt.
It says: You must feed a value for placeholder tensor 'default_policy.
Does that mean I should write the default policy myself?
What is the problem?
Hi Ray Team,

I am running examples/models/attention_net.py with IMPALA and getting exceptions: the states seem to be chopped into [BATCH_SIZE / MAX_SEQ_LEN, ...]. Is there any way to make this work with IMPALA? Could somebody give me a hand? Thanks in advance.

Ray version and other system information: Ray master, Python 3.7, TensorFlow 2.3, Ubuntu/Windows

Reproduction (REQUIRED)
In rllib/examples/attention_net.py, line 17 is changed to:
parser.add_argument("--run", type=str, default="IMPALA")
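For anyone trying to reproduce this, here is a minimal sketch of the changed argument together with the settings mentioned in this thread. It does not import Ray and is not the example script itself. The --num-sgd-iter flag and the config dict are assumptions made for illustration: the original script may wire these differently, and the value 2 is only one arbitrary choice that satisfies num_sgd_iter > 1.

```python
import argparse

parser = argparse.ArgumentParser()
# The change described above: default algorithm switched to IMPALA.
parser.add_argument("--run", type=str, default="IMPALA")
# Hypothetical flag (not in the original script) to toggle the crash condition.
parser.add_argument("--num-sgd-iter", type=int, default=2)

# Parse an empty argument list so the defaults are used.
args = parser.parse_args([])

# Illustrative config fragment; the key names are taken from this thread
# and the log (num_sgd_iter, max_seq_len), the values are assumptions.
config = {
    "num_sgd_iter": args.num_sgd_iter,  # > 1 is the reported crash condition
    "model": {"max_seq_len": 50},       # matches 'max_seq_len': 50 in the log
}

print(args.run, config)
```

Setting the hypothetical flag to 1 would be a quick way to check whether the crash really depends on num_sgd_iter.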