openai / iaf

Code for reproducing key results in the paper "Improving Variational Inference with Inverse Autoregressive Flow"
https://arxiv.org/abs/1606.04934
MIT License

Update to support TF1.1 + #4

Open jramapuram opened 7 years ago

jramapuram commented 7 years ago

After fixing the summary and split calls (both were refactored in TF 1.0), I still can't get this code to work:

(.venv) ➜  iaf git:(master) ✗ CIFAR10_PATH="./CIFAR10" optirun -b primus python tf_train.py --logdir ./logs --hpconfig depth=1,num_blocks=20,kl_min=0.1,learning_rate=0.002,batch_size=32 --num_gpus 1 --mode train

Traceback (most recent call last):
  File "tf_train.py", line 397, in <module>
    tf.app.run()
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "tf_train.py", line 391, in main
    run(hps)
  File "tf_train.py", line 237, in run
    model = CVAE1(hps, "train", x)
  File "tf_train.py", line 152, in __init__
    self.train_op = opt.apply_gradients(grad, global_step=self.global_step)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 446, in apply_gradients
    self._create_slots([_get_variable_for(v) for v in var_list])
  File "/home/jramapuram/Dropbox/projects/iaf/tf_utils/adamax.py", line 37, in _create_slots
    self._zeros_slot(v, "m", self._name)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 766, in _zeros_slot
    named_slots[_var_key(var)] = slot_creator.create_zeros_slot(var, op_name)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 174, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 146, in create_slot_with_initializer
    dtype)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 66, in _create_slot_var
    validate_shape=validate_shape)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1049, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 948, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 356, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
    use_resource=use_resource)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 671, in _get_single_variable
    "VarScope?" % name)
ValueError: Variable model/model/dec_log_stdv/Adamax/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

I get the same result using opt = tf.train.AdamOptimizer(hps.learning_rate):

(.venv) ➜  iaf git:(master) ✗ CIFAR10_PATH="./CIFAR10" optirun -b primus python tf_train.py --logdir ./logs --hpconfig depth=1,num_blocks=20,kl_min=0.1,learning_rate=0.002,batch_size=32 --num_gpus 1 --mode train

Traceback (most recent call last):
  File "tf_train.py", line 397, in <module>
    tf.app.run()
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "tf_train.py", line 391, in main
    run(hps)
  File "tf_train.py", line 237, in run
    model = CVAE1(hps, "train", x)
  File "tf_train.py", line 152, in __init__
    self.train_op = opt.apply_gradients(grad, global_step=self.global_step)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 446, in apply_gradients
    self._create_slots([_get_variable_for(v) for v in var_list])
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/adam.py", line 122, in _create_slots
    self._zeros_slot(v, "m", self._name)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 766, in _zeros_slot
    named_slots[_var_key(var)] = slot_creator.create_zeros_slot(var, op_name)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 174, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 146, in create_slot_with_initializer
    dtype)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.py", line 66, in _create_slot_var
    validate_shape=validate_shape)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1049, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 948, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 356, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
    use_resource=use_resource)
  File "/home/jramapuram/.venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 671, in _get_single_variable
    "VarScope?" % name)
ValueError: Variable model/model/dec_log_stdv/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

I tried setting reuse=None to no avail. I'm probably missing something stupid here. Here is my fork with the changes: https://github.com/jramapuram/iaf/tree/hotfix/tf1.0
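
Note for readers hitting the same error: in both tracebacks the variable that "does not exist" is an optimizer slot (.../Adamax/ and .../Adam/) requested through slot_creator, and the message appears to be the one tf.get_variable() raises when a new variable is requested while the enclosing variable scope is in reuse mode. The sketch below is a toy, pure-Python model of that check, not TensorFlow code. ToyVariableStore and its get_variable signature are hypothetical names invented for illustration, and it assumes (the thread does not confirm this) that apply_gradients ran inside a scope with reuse=True.

```python
# Toy model of a variable-scope reuse check. This is NOT TensorFlow code;
# ToyVariableStore is a hypothetical stand-in used only for illustration.
class ToyVariableStore:
    """Keeps variables in a dict keyed by their full name."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, reuse):
        if name in self._vars:
            if not reuse:
                # Creating the same name twice without reuse is an error.
                raise ValueError("Variable %s already exists" % name)
            return self._vars[name]
        if reuse:
            # In reuse mode nothing new may be created. This branch mirrors
            # the ValueError seen in the tracebacks above.
            raise ValueError("Variable %s does not exist" % name)
        self._vars[name] = object()
        return self._vars[name]


store = ToyVariableStore()

# The model parameter is created first, outside reuse mode.
param = store.get_variable("model/dec_log_stdv", reuse=False)

# Looking it up again in reuse mode returns the same object.
same_param = store.get_variable("model/dec_log_stdv", reuse=True)

# An optimizer slot is a brand-new variable, so asking for it in reuse
# mode fails even though the parameter it belongs to exists.
try:
    store.get_variable("model/dec_log_stdv/Adam", reuse=True)
    slot_error = None
except ValueError as err:
    slot_error = str(err)

print(slot_error)
```

If that assumption holds, the slot creation has to happen while the scope is not in reuse mode. Whether that is enough to make this repository run on TF 1.1 is not established in this thread.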

jramapuram commented 7 years ago

Looks like it is due to this hack:

        # XXX(rafal): TensorFlow bug?? Default initializer should handle things well..
        ses.run(init_model.h_top.initializer)

havaeimo commented 6 years ago

@jramapuram did you find what was causing the problem?

havaeimo commented 6 years ago

@dpkingma @TimSalimans Do you know which version of TF the code was written for?

jramapuram commented 6 years ago

@havaeimo: Nope, I gave up on this. Looks like it was written pre-TF 1.0? I've moved on to PyTorch and use the IAF from Jakub Tomczak's GitHub: https://github.com/jmtomczak/vae_vpflows

havaeimo commented 6 years ago

I managed to get the code running with TF 0.9, CUDA 7.5 and cuDNN 4.