islamelnabarawy / sc2agents

PySC2 Reinforcement Learning Agents

Bug -- can't run with sc2gym #1

Open skaematik opened 6 years ago

skaematik commented 6 years ago

Running python train_a2c.py fails with a ValueError. The full traceback is below.

I also have a Google Colab notebook that reproduces this error (it uses TensorFlow 1.4): https://colab.research.google.com/drive/1vQc0vbO0waUa2QCmMLqcPVhpJ4ElOtRR

Traceback (most recent call last):
  File "./sc2agents/sc2agents/train_a2c.py", line 53, in <module>
    main()
  File "./sc2agents/sc2agents/train_a2c.py", line 49, in main
    policy='cnn', lrschedule='constant', num_cpu=4)
  File "./sc2agents/sc2agents/train_a2c.py", line 39, in train
    learn(policy_fn, env, seed, total_timesteps=int(num_timesteps * 1.1), lrschedule=lrschedule)
  File "/usr/local/lib/python3.6/dist-packages/baselines/a2c/a2c.py", line 156, in learn
    max_grad_norm=max_grad_norm, lr=lr, alpha=alpha, epsilon=epsilon, total_timesteps=total_timesteps, lrschedule=lrschedule)
  File "/usr/local/lib/python3.6/dist-packages/baselines/a2c/a2c.py", line 35, in __init__
    step_model = policy(sess, ob_space, ac_space, nenvs, 1, reuse=False)
  File "/usr/local/lib/python3.6/dist-packages/baselines/a2c/policies.py", line 108, in __init__
    h = nature_cnn(X)
  File "/usr/local/lib/python3.6/dist-packages/baselines/a2c/policies.py", line 12, in nature_cnn
    h = activ(conv(scaled_images, 'c1', nf=32, rf=8, stride=4, init_scale=np.sqrt(2)))
  File "/usr/local/lib/python3.6/dist-packages/baselines/a2c/utils.py", line 59, in conv
    return b + tf.nn.conv2d(x, w, strides=strides, padding=pad, data_format=data_format)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 956, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1734, in __init__
    control_input_ops)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1570, in _create_c_op
    raise ValueError(str(e))
ValueError: Negative dimension size caused by subtracting 8 from 1 for 'model/c1/Conv2D' (op: 'Conv2D') with input shapes: [4,1,64,64], [8,8,64,32].

islamelnabarawy commented 6 years ago

Issue confirmed. As far as I can tell, this is likely due to the baselines code getting updated.
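
The shapes in the error message suggest one possible cause. This is an unverified assumption, not a confirmed diagnosis. The observation tensor [4,1,64,64] looks channels-first (batch, channels, height, width), while TensorFlow's conv2d defaults to channels-last (batch, height, width, channels). Read as channels-last, the height is 1, and an 8x8 kernel with no padding cannot fit, which would match "subtracting 8 from 1". The filter shape [8,8,64,32] is also consistent with 64 being taken as the input channel count. The sketch below works through that arithmetic in plain Python. The helper valid_conv_size and the transpose order are illustrative and are not taken from baselines or sc2gym.

```python
def valid_conv_size(size, kernel, stride):
    """Output length along one axis for a convolution with no padding."""
    return (size - kernel) // stride + 1


# Input shape reported in the error message.
obs_shape = (4, 1, 64, 64)
# rf=8 and stride=4 come from the nature_cnn call shown in the traceback.
kernel, stride = 8, 4

# Interpreted as channels-last (N, H, W, C), the height axis is only 1.
_, height, width, channels = obs_shape
height_out = valid_conv_size(height, kernel, stride)
print("height:", height, "channels:", channels, "output height:", height_out)

# Assumption: the data is really channels-first (N, C, H, W).
# Moving the channel axis to the end gives the layout conv2d expects by default.
nhwc_shape = tuple(obs_shape[i] for i in (0, 2, 3, 1))
fixed_out = valid_conv_size(nhwc_shape[1], kernel, stride)
print("transposed shape:", nhwc_shape, "output height:", fixed_out)
```

If this reading is right, transposing the observations before they reach the policy (or passing a channels-first data_format through to the convolution) would be the kind of change to try. Whether sc2gym or baselines is the right place for that change is not something the traceback settles.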

The train_dqn.py file still runs, as far as I can tell. I have been planning to replace the baselines-based implementation with a self-contained one for a while, so I will do that instead of hunting down the issue. I believe that will also have more educational value for anyone reading the code.

For a similar standalone, fully functional implementation of A3C for SC2 minigames, this repo provides a good example. I have managed to run it and reproduce their results.