blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org
Other
4.83k stars 759 forks source link

Compatibility with TF > 1.7.0 #891

Open IanQS opened 6 years ago

IanQS commented 6 years ago

Follows on https://github.com/blei-lab/edward/issues/882

Specifically

2018-05-01-221624_1600x1668_scrot

I'm assuming your comment about "just removing it" was tongue in cheek but I figured I'd give it a shot anyways

Running pytest tests

I got

78 failed, 248 passed, 49 warnings

where the relevant errors (basically anywhere there was a copy) are as follows


../edward/util/random_variables.py:321: in copy                                                                                                                                                  
    new_op = copy(op, dict_swap, scope, True, copy_q, False)                                                                                                                                     
../edward/util/random_variables.py:374: in copy                                                                                                                                                  
    op_def)                                                                                                                                                                                      
/home/ian/.virtualenvs/edward/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1732: in __init__                                                                                   
    op_def, inputs, node_def.attr)                                                                                                                                                               
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'Operation' object has no attribute '_c_op'") raised in repr()] Operation object at 0x7fef20988e48>                                                                     
op_def = name: "Mul"                                                                                                                                                                             
input_arg {                                                                                                                                                                                      
  name: "x"                                                                                                                                                                                      
  type_attr: "T"                                                                                                                                                                                 
}                                                                                                                                                                                                
input_arg {                                                                                                                                                                                      
  name: "y"                                                                                                                                                                                      
  type_attr: "T"                                                                                                                                                                                 
}                                                                                                                                                                                                
output_arg {                                                                                                                                                                                     
  name:...ype: DT_INT32                                                                                                                                                                          
      type: DT_INT64 
      type: DT_COMPLEX64                                                                                                                                                               
      type: DT_COMPLEX128
    }
  }
}
is_commutative: true

inputs = [], attrs = <google.protobuf.pyext._message.MessageMapContainer object at 0x7fef153103f0>                                                                                               

    def _reconstruct_sequence_inputs(self, op_def, inputs, attrs):
      """Regroups a flat list of input tensors into scalar and sequence inputs.

        Args:
          op_def: The `op_def_pb2.OpDef` (for knowing the input types)
          inputs: a list of input `Tensor`s to the op.
          attrs: mapping from attr name to `attr_value_pb2.AttrValue` (these define
            how long each sequence is)

        Returns:
          A list of `Tensor`s (corresponding to scalar inputs) and lists of
          `Tensor`s (corresponding to sequence inputs).
        """
      grouped_inputs = []
      i = 0
      for input_arg in op_def.input_arg:
        if input_arg.number_attr:
          input_len = attrs[input_arg.number_attr].i
          is_sequence = True
        elif input_arg.type_list_attr:
          input_len = len(attrs[input_arg.type_list_attr].list.type)
          is_sequence = True
        else:
          input_len = 1
          is_sequence = False

        if is_sequence:
          grouped_inputs.append(inputs[i:i + input_len])
        else:
>         grouped_inputs.append(inputs[i])
E         IndexError: list index out of range

/home/ian/.virtualenvs/edward/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1803: IndexError 
IanQS commented 6 years ago

It's beginning to look more and more like it's out of my skill set so I might just downgrade my tensorflow version

IanQS commented 6 years ago

Following up on this:

1) tensorflow 1.6 (and v1.7.0, or at least it does when I do a checkout for it) still support set_shapes_for_outputs)

2) the current master does not support it. However, it DOES have set_shape_and_handle_data_for_outputs which I will be investigating

EDIT:

No dice, still fails all the same tests

dustinvtran commented 6 years ago

Thanks for looking into this. It does seem to be a non-trivial solution.

IanQS commented 6 years ago

@dustinvtran

Sorry, I know you're crazy busy with things, but I am curious about what happens in situations like this.

Would one open an issue on the TF side? That seems like the most straightforward method. I'd imagine the project would also get an insight into what things might / might not be deprecated soon so this situation doesn't happen again but I'll leave it to your best judgement.

dustinvtran commented 6 years ago

Yeah, sorry for the delays. These past weeks have been especially busy due to NIPS. The TF best practice is to not rely on internal functions; unfortunately we were forced to in order to perform this graph copying. Hopefully there's a solution we can work out on our own end (even if it means duplicating code from TF).

IanQS commented 6 years ago

even if it means duplicating code from TF

Unless Edward duplicates large chunks of TF I don't know if that's the best practice here.

While I was looking through the surrounding code, and the functions that called _set_shapes_for_outputs_c_api I got the impression that things were still moving around and the internals were far from being settled.

These past weeks have been especially busy due to NIPS

All the best!

dustinvtran commented 6 years ago

I agree. I'm looking into it now and that seems to be the problem with duplicating code. It looks like the tf.contrib.graph_editor ran into the same problems due to the C API change. Perhaps we can use their solution, or ideally, move over to use their tools overall?

IanQS commented 6 years ago

I'm swamped with stuff but can probably look into that this weekend / early next week. I'll keep you updated on what I find when I look into tf.contrib.graph_editor

CMoebus commented 6 years ago

Hi, I'm a newcomer to Edwards and ran in the same error message. I installed Edward and Tensorflow on a Win10 Surface with the pip commands recommended in the Edward - Getting Started - website within the Anaconda prompt. Best, Claus

IanQS commented 6 years ago

@CMoebus

I'm not familiar with Anaconda so I can't help you there but if it's possible to specify versions, you want to use Tensorflow 1.6 or prior with Edward.

@dustinvtran

are you sure that tf.contrib.graph_editor used set_shapes_for_outputs? I was looking through tf 1.6 and it didn't look like there were any calls there

dvassilev2 commented 6 years ago

Hi guys, any kind of ETA on how long this fix might take? I'm reluctant to downgrade my Tensorflow version given just how particular it is with respect to Linux graphics driver version, CUDA version etc. it is. I'd rather not mess with it if you think a fix is < 2 weeks away.