matthewfeickert opened 1 year ago
cc @jburnim, as it seems from outside of the Google world that you're managing tfp releases(?). Apologies in advance for the noise if not, and thanks for the work that you've done regardless.
So tensorflow v2.14.0 with tensorflow-probability v0.21.0 is breaking at import, where `tf.ones`, `tf.zeros`, `tf.fill`, `tf.ones_like`, and `tf.zeros_like` now take an additional `layout` argument that controls the output layout of their results.
cf. https://github.com/tensorflow/tensorflow/blob/v2.14.0/tensorflow/python/ops/array_ops.py#L3107-L3155
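The mismatch can be reproduced in isolation with `inspect.getfullargspec`, which is the mechanism behind TFP's `_copy_docstring` signature check. The two functions below are hypothetical stand-ins for the pre- and post-2.14 `tf.ones_like` signatures; only their signatures matter:

```python
import inspect

# Hypothetical stand-ins for tf.ones_like before and after TF 2.14;
# the bodies are irrelevant, only the signatures matter for this check.
def ones_like_tf213(input, dtype=None, name=None):
    pass

def ones_like_tf214(input, dtype=None, name=None, layout=None):
    pass

old_spec = inspect.getfullargspec(ones_like_tf213)
new_spec = inspect.getfullargspec(ones_like_tf214)

# The extra trailing `layout` argument makes the FullArgSpec comparison
# fail, which is what breaks TFP at import.
print(old_spec.args)          # ['input', 'dtype', 'name']
print(new_spec.args)          # ['input', 'dtype', 'name', 'layout']
print(old_spec != new_spec)   # True
```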
To make this more explicit, if you put a `breakpoint()` in `_copy_docstring` immediately after the arg specs are computed, you get:
```
(venv) root@db7aec05ef72:/# python -c 'import tensorflow; import tensorflow_probability'
2023-09-27 01:44:50.273853: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-09-27 01:44:50.275376: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-09-27 01:44:50.298863: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-09-27 01:44:50.298894: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-09-27 01:44:50.298916: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-09-27 01:44:50.305111: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-09-27 01:44:50.305323: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-27 01:44:50.996801: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
> /venv/lib/python3.11/site-packages/tensorflow_probability/python/internal/prefer_static.py(84)_copy_docstring()
-> if original_spec != new_spec:
(Pdb) original_spec
FullArgSpec(args=['input', 'dtype', 'name', 'layout'], varargs=None, varkw=None, defaults=(None, None, None), kwonlyargs=[], kwonlydefaults=None, annotations={})
(Pdb) new_spec
FullArgSpec(args=['input', 'dtype', 'name'], varargs=None, varkw=None, defaults=(None, None), kwonlyargs=[], kwonlydefaults=None, annotations={})
(Pdb)
```
Oh, comparing TensorFlow Probability v0.21.0's `_ones_like` to the current source, this was already taken care of by @rainwoodman in 2cbb82d0ed83078e6232020242799cde5cc41ce9. So this is already fixed, but a release is needed with the fix. :+1:
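For illustration, the shape of that kind of fix is simply to grow the TFP-side wrapper's signature to match the wrapped function. This is a hedged sketch, not the actual diff in 2cbb82d0ed83078e6232020242799cde5cc41ce9; both function names below are hypothetical stand-ins:

```python
import inspect

def tf214_ones_like(input, dtype=None, name=None, layout=None):
    """Stand-in for TF 2.14's tf.ones_like signature."""

def _ones_like(input, dtype=None, name=None, layout=None):
    """Stand-in TFP-side wrapper, updated to mirror the new signature."""
    del layout  # accepted for signature parity; unused in this sketch

# With matching signatures, a _copy_docstring-style check passes again.
specs_match = (inspect.getfullargspec(_ones_like)
               == inspect.getfullargspec(tf214_ones_like))
print(specs_match)  # True
```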
@jburnim, can you comment on when a release might be possible?
Thanks for pinging on this, @matthewfeickert! Note that using tfp-nightly in the meantime should work.
Right -- stable (i.e. non-nightly) TFP releases are generally tied to a particular stable TF release and won't work with a subsequent TF release. TFP nightlies are tested against tf-nightly and more likely to work with a recent TF stable release. We usually get a TFP release out within a week or two of a new TF version.
> Note that using tfp-nightly in the meantime should work.
Indeed, I checked this after I noticed that the problem was already fixed given 2cbb82d0ed83078e6232020242799cde5cc41ce9.
> TFP nightlies are tested against tf-nightly and more likely to work with a recent TF stable release. We usually get a TFP release out within a week or two of a new TF version.
Yes, I'm aware. I had assumed that the TensorFlow team would coordinate releases with the TFP team, as having a high probability of breaking all users is both generally bad and seems to further undermine the view of the TensorFlow ecosystem as library-like.
I assume, given the team's current schedule and other responsibilities, that O(weeks) can't be improved upon to O(days)?
For future reference, I've been aware since tensorflow v2.14.0rc0 that this TF release would break TFP, given nightly testing against the release candidates (where the *-nightly wheels are obviously not useful). In the future, should I open issues asking for a release to be prepared as soon as I see this and have verified that a fix has already been committed? Or is the TFP team already aware of this internally through tests, releases aren't intended to be coordinated, and such an issue would just be noise?
Thanks, Matthew. I recognize and appreciate that you're a long-time user and contributor.
TF and TFP are maintained by quite separate groups; there is not very much explicit coordination, although we are proactively notified of upcoming releases as they happen.
TFP stable versions are tested and supported for the TF release that is current when they are built; strictly speaking there was no breakage here, because TFP 0.21 was never explicitly intended to work with TF 2.14. If a TFP version survives a TF release (rare) then that's a happy accident! Otherwise the fact that the nightlies are developed and tested more-or-less in lockstep ensures that there will be some releasable state of TFP near HEAD when a new TF drops. Our release process is essentially about finding that commit, branching, patching up any minor issues, generating release notes and pushing a pypi package. This usually takes a week or two for someone on the team to find time to do.
So from our perspective the current setup is "working as intended" (which is not to say it's the best it could be or that we could imagine -- but dev hours aren't free). I don't think filing bugs like this sooner would expedite that process.
Hope this extra detail provides some useful (if somewhat unsatisfying) context.
@csuter Could tfp specify compatible tensorflow version ranges to prevent such issues? Unit tests inside my container caught this issue upfront; however, it would be good if `pip install` simply raised an error. Thank you.
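For context, tensorflow-probability deliberately declares no hard dependency on a tensorflow wheel (there are several interchangeable TF packages, e.g. CPU-only variants), which is part of why pip cannot catch the mismatch today. A hypothetical pin, if one were added, might look like the fragment below; the bounds are illustrative, not TFP's actual policy:

```toml
# Hypothetical [project] metadata for a TFP 0.21-era release, pinning the
# single TF minor series it was tested against.
[project]
name = "tensorflow-probability"
dependencies = [
    "tensorflow>=2.13,<2.14",
]
```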
NOTE: TFP 0.22.0 has been released.
Summary of problem
Today's release of TensorFlow v2.14.0 breaks TensorFlow Probability at import. In a fresh Python 3.11 virtual environment, installation of tensorflow v2.14.0 and tensorflow-probability v0.21.0 causes an exception at import of both.
Reproducible example
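This section appears truncated here; a minimal reproduction, reconstructed from the versions and command quoted earlier in the thread, would be:

```shell
# Fresh Python 3.11 virtual environment with the broken version pairing.
python -m venv venv
. venv/bin/activate
python -m pip install 'tensorflow==2.14.0' 'tensorflow-probability==0.21.0'
# Importing both then fails inside _copy_docstring:
python -c 'import tensorflow; import tensorflow_probability'
```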