HaozhengLi opened this issue 5 years ago
I changed num_env in:

```python
train(args.env, num_timesteps=args.num_timesteps, seed=args.seed, policy=args.policy, lrschedule=args.lrschedule, sil_update=args.sil_update, sil_beta=args.sil_beta, num_env=15)
```

But I got another error:
```
Logging to /tmp/a2c
2018-12-25 14:55:20.343377: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
WARNING:tensorflow:From e:\output\python_output\hardrlwithyoutube\self-imitation-learning-master\baselines\common\distributions.py:148: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.
See `tf.nn.softmax_cross_entropy_with_logits_v2`.
WARNING:tensorflow:From e:\output\python_output\hardrlwithyoutube\self-imitation-learning-master\baselines\a2c\utils.py:13: calling reduce_max (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From e:\output\python_output\hardrlwithyoutube\self-imitation-learning-master\baselines\a2c\utils.py:15: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
Traceback (most recent call last):
  File "baselines/a2c/run_atari_sil.py", line 38, in <module>
    main()
  File "baselines/a2c/run_atari_sil.py", line 35, in main
    num_env=15)
  File "baselines/a2c/run_atari_sil.py", line 20, in train
    sil_update=sil_update, sil_beta=sil_beta)
  File "e:\output\python_output\hardrlwithyoutube\self-imitation-learning-master\baselines\a2c\a2c_sil.py", line 161, in learn
    max_grad_norm=max_grad_norm, lr=lr, alpha=alpha, epsilon=epsilon, total_timesteps=total_timesteps, lrschedule=lrschedule, sil_update=sil_update, sil_beta=sil_beta)
  File "e:\output\python_output\hardrlwithyoutube\self-imitation-learning-master\baselines\a2c\a2c_sil.py", line 69, in __init__
    sil_model.entropy, sil_model.value, sil_model.neg_log_prob,
AttributeError: 'LstmPolicy' object has no attribute 'entropy'
```
It seems that the 'lstm' policy does not work because LstmPolicy lacks the entropy attribute (and presumably value and neg_log_prob as well, going by the line in the traceback).
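My guess is that only CnnPolicy was given the extra SIL tensors in this fork, so LstmPolicy never defines them. Below is a minimal, untested sketch of what might be appended inside `LstmPolicy.__init__` in baselines/a2c/policies.py; the attribute names are taken from the traceback, `self.pd`/`self.pdtype`/`vf` are the action distribution, pdtype, and value head the baselines policies already build, and the real fix should mirror whatever this repo's CnnPolicy actually defines:

```python
# Untested sketch -- expose the tensors that a2c_sil.py reads off the policy.
# Assumes self.pd (action distribution), self.pdtype, and vf (value head of
# shape [nbatch, 1]) already exist inside LstmPolicy.__init__, as in baselines.
A = self.pdtype.sample_placeholder([None])  # placeholder for replayed actions

self.entropy = self.pd.entropy()            # entropy of pi(.|s), shape [nbatch]
self.value = vf[:, 0]                       # state-value estimate, shape [nbatch]
self.neg_log_prob = self.pd.neglogp(A)      # -log pi(A|s), used by the SIL loss
self.A = A                                  # keep the action placeholder reachable
```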
What can I do to fix this? Thank you very much!
Hello.
I first changed the default policy to 'lstm' by editing:

```python
parser.add_argument('--policy', help='Policy architecture', choices=['cnn', 'lstm', 'lnlstm'], default='lstm')
```
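Since `--policy` is already exposed as a command-line flag with 'lstm' among its choices, the same change can presumably be made without editing the file:

```
python baselines/a2c/run_atari_sil.py --env BreakoutNoFrameskip-v4 --policy lstm
```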
Then I ran A2C+SIL on Atari games:

```
python baselines/a2c/run_atari_sil.py --env BreakoutNoFrameskip-v4
```

and got an error.
What can I do to fix this? Thank you very much!