pat-coady / trpo

Trust Region Policy Optimization with TensorFlow and OpenAI Gym
https://learningai.io/projects/2017/07/28/ai-gym-workout.html
MIT License

Can the `Data cardinality is ambiguous` error in TensorFlow 2.4 or 2.5 be solved as follows? #39

Open wezardlza opened 3 years ago

wezardlza commented 3 years ago

Hi, thanks very much for your work. I use Docker to build an environment to study your work. When I use FROM tensorflow/tensorflow:2.3.3-gpu-jupyter to create a container and test the examples

python train.py InvertedPendulumBulletEnv-v0
python train.py InvertedDoublePendulumBulletEnv-v0 -n 5000
python train.py HalfCheetahBulletEnv-v0 -n 5000 -b 5

all the tests pass. But when I use a newer image, for instance FROM tensorflow/tensorflow:2.4.2-gpu-jupyter, I get the ValueError: Data cardinality is ambiguous error shown below.

$ python train.py InvertedPendulumBulletEnv-v0
['/home/wezardlza/workspace/trpo', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/home/wezardlza/.local/lib/python3.6/site-packages', '/usr/local/lib/python3.6/dist-packages', '/usr/lib/python3/dist-packages', '/home/wezardlza/workspace']
pybullet build time: Jun 22 2021 23:31:53
2021-06-22 23:42:07.575098: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
/home/wezardlza/.local/lib/python3.6/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
Value Params -- h1: 60, h2: 17, h3: 5, lr: 0.00243
2021-06-22 23:42:08.572333: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-06-22 23:42:08.572860: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-06-22 23:42:08.603929: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.604218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.815GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2021-06-22 23:42:08.604237: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-06-22 23:42:08.605660: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-06-22 23:42:08.605710: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-06-22 23:42:08.606300: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-06-22 23:42:08.606448: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-06-22 23:42:08.608003: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-06-22 23:42:08.608401: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-06-22 23:42:08.608521: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-06-22 23:42:08.608599: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.608890: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.609111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-06-22 23:42:08.609301: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-22 23:42:08.609478: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-06-22 23:42:08.609563: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.609800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.815GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2021-06-22 23:42:08.609823: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-06-22 23:42:08.609840: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-06-22 23:42:08.609850: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-06-22 23:42:08.609860: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-06-22 23:42:08.609870: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-06-22 23:42:08.609880: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-06-22 23:42:08.609891: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-06-22 23:42:08.609901: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-06-22 23:42:08.609946: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.610192: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.610404: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-06-22 23:42:08.610424: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-06-22 23:42:08.934643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-22 23:42:08.934667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2021-06-22 23:42:08.934672: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2021-06-22 23:42:08.934802: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.935063: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.935289: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-22 23:42:08.935494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6638 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
Policy Params -- h1: 60, h2: 24, h3: 10, lr: 0.000184, logvar_speed: 2
argv[0]=
argv[0]=
2021-06-22 23:42:09.103904: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-06-22 23:42:09.375022: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-06-22 23:42:10.302274: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-06-22 23:42:10.322790: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3600000000 Hz
Traceback (most recent call last):
  File "train.py", line 351, in <module>
    main(**vars(args))
  File "train.py", line 317, in main
    policy.update(observes, actions, advantages, logger)  # update policy
  File "/home/wezardlza/workspace/trpo/policy.py", line 61, in update
    old_means, old_logvars, old_logp])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1725, in train_on_batch
    class_weight)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 1513, in single_batch_iterator
    _check_data_cardinality(data)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 1529, in _check_data_cardinality
    raise ValueError(msg)
ValueError: Data cardinality is ambiguous:
  x sizes: 369, 369, 369, 369, 1, 369
Make sure all arrays contain the same number of samples.

After some checking, I found that the code below in ./trpo/policy.py causes the mismatched batch size:

class PolicyNN(Layer):
    """ Neural net for policy approximation function.

    Policy parameterized by Gaussian means and variances. NN outputs mean
     action based on observation. Trainable variables hold log-variances
     for each action dimension (i.e. variances not determined by NN).
    """
    def build(self, input_shape):
        self.batch_sz = input_shape[0]

    def call(self, inputs, **kwargs):
        y = self.dense1(inputs)
        y = self.dense2(y)
        y = self.dense3(y)
        means = self.dense4(y)
        logvars = K.sum(self.logvars, axis=0, keepdims=True) + self.init_logvar
        logvars = K.tile(logvars, (self.batch_sz, 1))
        return [means, logvars]

which constantly sets the first dimension of logvars to one at runtime, while the first dimension of inputs varies. Thus, based on the code above, the first dimension of means differs from that of logvars, which causes the error

  File "/home/wezardlza/workspace/trpo/policy.py", line 61, in update
    old_means, old_logvars, old_logp])
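
For context, here is a minimal, hypothetical reproduction of that cardinality check. None of this code is from the repo; the two-input model and the shapes are made up purely to show that train_on_batch rejects arrays whose first dimensions disagree:

import numpy as np
import tensorflow as tf

# Hypothetical two-input model; the shapes are made up for illustration.
inp_a = tf.keras.Input(shape=(3,))
inp_b = tf.keras.Input(shape=(2,))
merged = tf.keras.layers.Concatenate()([inp_a, inp_b])
out = tf.keras.layers.Dense(1)(merged)
model = tf.keras.Model([inp_a, inp_b], out)
model.compile(optimizer="adam", loss="mse")

x_a = np.zeros((369, 3), dtype=np.float32)  # 369 samples
x_b = np.zeros((1, 2), dtype=np.float32)    # only 1 sample: mismatched first dimension
y = np.zeros((369, 1), dtype=np.float32)

# In TF 2.4+ this raises: ValueError: Data cardinality is ambiguous
model.train_on_batch([x_a, x_b], y)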

Thus, I did the following: in ./trpo/policy.py, add

from tensorflow import shape

and change logvars = K.tile(logvars, (self.batch_sz, 1)) to logvars = K.tile(logvars, (shape(inputs)[0], 1)). This let me pass the example

python train.py InvertedPendulumBulletEnv-v0

but it seems self.batch_sz will no longer be used. Perhaps we can just change logvars = K.tile(logvars, (self.batch_sz, 1)) to logvars = K.tile(logvars, (shape(inputs)[0], 1)) and remove the build() method above (a small runnable sketch of this idea is below)? I am new to TensorFlow and would like to know whether my changes will cause any problems, or even wrong TRPO results. Thanks for your help!
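
For reference, here is that sketch. It is not the repo's PolicyNN (the stand-in layer below only produces the tiled log-variances, and the shapes are made up), but it shows a build() that captures no batch size and a call() that tiles with the runtime batch size:

import tensorflow as tf
from tensorflow import shape
from tensorflow.keras import backend as K

class TileLogvars(tf.keras.layers.Layer):
    """ Stand-in layer: one trainable row of log-variances, tiled per sample. """
    def __init__(self, act_dim, **kwargs):
        super().__init__(**kwargs)
        self.act_dim = act_dim

    def build(self, input_shape):
        # No batch size is captured here.
        self.logvars = self.add_weight(
            name='logvars', shape=(1, self.act_dim),
            initializer='zeros', trainable=True)

    def call(self, inputs, **kwargs):
        # Tile to the batch size of the inputs at runtime.
        return K.tile(self.logvars, (shape(inputs)[0], 1))

obs = tf.zeros((369, 8))                  # hypothetical observation batch
print(TileLogvars(act_dim=4)(obs).shape)  # (369, 4)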

kimbring2 commented 2 years ago

@wezardlza I have the same issue.

You need to tile the old_logvars value after the old_means, old_logvars = self.policy(observes) line.

You can do it by adding the line below.

old_logvars = K.tile(old_logvars, (observes.shape[0], 1))
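
For reference, a minimal, runnable sketch of what that tiling does (the shapes here, 4 action dimensions and 8 observation dimensions, are made up for illustration):

import numpy as np
from tensorflow.keras import backend as K

# Hypothetical shapes: one row of log-variances, a batch of 369 observations.
old_logvars = K.constant(np.zeros((1, 4), dtype=np.float32))
observes = np.zeros((369, 8), dtype=np.float32)

# Repeat the single logvars row once per sample so every array handed to
# train_on_batch has the same number of samples.
old_logvars = K.tile(old_logvars, (observes.shape[0], 1))
print(K.int_shape(old_logvars))  # (369, 4)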

I can see the mean reward increase to 1000. (: