`%tensorboard` doesn’t work with `%tensorflow_version 1.x` (duplicate plugins for name whatif)

MeghnaNatraj commented 4 years ago

Every few months, the Colab tutorials released by my team seem to break due to updates made to the Colab environment. The cause is that multiple TensorBoard packages end up installed.

As a result of this, I run the following code snippet before running TensorBoard each time:

# Remove all TensorBoard packages.
! pip list --format=freeze | grep tensorboard | xargs pip uninstall -y
# Install TensorFlow again (This command will only install the default TensorBoard package associated with this TensorFlow package). 
! pip install -q tensorflow

Seems like many users also face this issue often: https://github.com/pytorch/pytorch/issues/22676

Not sure if this is a Colab or a TensorBoard issue, but I'm posting it here.

MeghnaNatraj commented 4 years ago

If I run it without the above command, I get this error:

%load_ext tensorboard
%tensorboard --logdir {LOGS_DIR}

ERROR: Failed to launch TensorBoard (exited with 1).
Contents of stderr:
Traceback (most recent call last):
  File "/usr/local/bin/tensorboard", line 8, in <module>
    sys.exit(run_main())
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/main.py", line 64, in run_main
    app.run(tensorboard.main, flags_parser=tensorboard.configure)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 220, in main
    server = self._make_server()
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 301, in _make_server
    self.assets_zip_provider)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 150, in standard_tensorboard_wsgi
    flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 202, in TensorBoardWSGIApp
    return TensorBoardWSGI(tbplugins, flags.path_prefix)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 254, in __init__
    raise ValueError('Duplicate plugins for name %s' % plugin.plugin_name)
ValueError: Duplicate plugins for name whatif

This is because of the new package tensorboard-plugin-wit, released in Feb 2020. It's causing issues for many people (see https://github.com/pytorch/pytorch/issues/22676), and there may be other such updates in the future as well.
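For readers wondering how a second package can trigger this, here is a minimal sketch of the kind of check the last traceback frame performs. It is not the actual TensorBoard code: `FakePlugin` and `register_plugins` are hypothetical names, and the idea that one "whatif" plugin is bundled with TensorBoard 1.15 while the other comes from the separate package is an assumption.

```python
class FakePlugin:
    """Hypothetical stand-in for a TensorBoard plugin; only the name matters."""

    def __init__(self, plugin_name, source):
        self.plugin_name = plugin_name
        self.source = source  # which package supplied it (illustrative only)


def register_plugins(plugins):
    """Index plugins by name, refusing two plugins with the same name."""
    registry = {}
    for plugin in plugins:
        if plugin.plugin_name in registry:
            # Same message format as the ValueError in the traceback above.
            raise ValueError("Duplicate plugins for name %s" % plugin.plugin_name)
        registry[plugin.plugin_name] = plugin
    return registry


# Assumption: two packages each contribute a plugin called "whatif".
plugins = [
    FakePlugin("scalars", "tensorboard"),
    FakePlugin("whatif", "tensorboard"),
    FakePlugin("whatif", "tensorboard-plugin-wit"),
]

try:
    register_plugins(plugins)
except ValueError as error:
    print(error)  # Duplicate plugins for name whatif
```

Under that assumption, the failure does not depend on the log directory or the notebook contents, only on which packages are discoverable when TensorBoard starts.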

You can run the following command to find all tensorboard packages installed in the Colab environment:

! pip list --format=freeze | grep tensorboard

tensorboard==1.15.0
tensorboard-plugin-wit==1.6.0.post2 # causes the issue
tensorboardcolab==0.0.22
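The same listing can probably be produced from Python, which may be handy when shell commands are awkward. This is a sketch only: it assumes Python 3.8 or newer for `importlib.metadata` (the runtime in the traceback above is 3.6, where it is not available), and `tensorboard_packages` and `installed_distributions` are made-up helper names.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def tensorboard_packages(names_and_versions):
    """Return pip-freeze style lines for packages whose name mentions tensorboard."""
    return sorted(
        f"{name}=={version}"
        for name, version in names_and_versions
        if "tensorboard" in name.lower()
    )


def installed_distributions():
    """Yield (name, version) for every distribution visible on sys.path."""
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            yield name, dist.version


for line in tensorboard_packages(installed_distributions()):
    print(line)
```

More than one line mentioning a plugin package next to an old `tensorboard` is the situation described above.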

wchargin commented 4 years ago

Hi @MeghnaNatraj! I can reproduce this error by running

%tensorflow_version 1.x
%load_ext tensorboard
%tensorboard --logdir logs

in a blank notebook with a fresh Colab runtime.

Could you please point us to an example notebook that runs into this problem? I looked at a few TF Lite Colabs (flowers_tf_lite.ipynb, text_classification.ipynb, image_classification.ipynb) but didn’t find any that used TensorBoard. It would be great to verify that the fixes that we put in actually work for your use case.

It looks like the problem is that %tensorflow_version 1.x adds an entry to the Python path for TF 1.x, which suffices for new or superseding versions of packages, but doesn’t suffice to remove packages that must not exist in 1.x, like tensorboard_plugin_wit. I’ll see if we can fix this on the Colab side, and failing that I’ll look into whether we might want to backport a patch to 1.15.

wchargin commented 4 years ago

Actually, the simplest fix would be to update the notebooks in question to use TensorFlow 2.x, which just doesn’t have this problem because no path manipulation is required. It doesn’t look to me like there are any tutorials in tensorflow, docs, or tensorboard that use both %tensorflow_version 1.x and %tensorboard. Is upgrading the tutorials an option, now that TensorFlow 2.x has been out for about half a year?

MeghnaNatraj commented 4 years ago

The notebook I'm referring to is this one: https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/micro_speech/train_speech_model.ipynb

Let me know if you face an issue as well. The issue is that we need to use tensorflow==1.15 for a few more weeks/months. As a result, I uninstall the default TensorFlow 2.x and install tensorflow==1.15.

What do you suggest I do? Use this workaround until we move to TF2.x?

wchargin commented 4 years ago

Okay, it looks like that notebook has a lot of custom setup rather than using %tensorflow_version 1.x at all:

!pip uninstall -y tensorflow tensorflow_estimator tensorboard
!pip install -q tf-estimator-nightly==1.14.0.dev2019072901 tf-nightly-gpu==1.15.0.dev20190729

Given that this custom setup is already required, it seems reasonable to also uninstall tensorboard-plugin-wit as a workaround. That or upgrading to TF 2.x are probably your best bets for now. I’ve opened an internal bug with the Colab team (http://b/152986612; CCed you), but it’s not clear whether this will ever be fixed since it only affects non-clean installs of TF 1.x.

(FWIW, I can’t actually reproduce this error there; instead, I see an error “ModuleNotFoundError: No module named 'tensorboard'”, which suggests that the custom setup did not install TensorBoard correctly.)

MeghnaNatraj commented 4 years ago

@wchargin I haven't checked in my updates yet, but if you replace the cell you pasted above (i.e., drop all those uninstall/install commands) with %tensorflow_version 1.x, you still get the same error, as you noted in your first reply.

Currently, only this code snippet works in order to use TF1.x:

# Remove all TensorBoard packages.
! pip list --format=freeze | grep tensorboard | xargs pip uninstall -y
# Install TensorFlow again (This command will only install the default TensorBoard package associated with this TensorFlow package). 
! pip install -q tensorflow

If there can be a fix, it would help other users too! I have faced this issue before as well. But if not, I can keep using the workaround.

wchargin commented 4 years ago

Ah, I understand now; thanks. Yes, it would definitely be nice if this could be fixed. I think it’s just a question of tradeoffs. It sounds like it would be a fair amount of work to fix this properly on either the Colab side or the TensorBoard side. For TensorBoard, at least, we occasionally push patch releases with bug fixes to the current release series, but I don’t think we’ve ever backported a change to an old version.

I’ll solicit opinions from the rest of the TensorBoard team to see what people think, and then get back to you here.

jameswex commented 4 years ago

See tensorboard-plugin-wit mitigation in https://github.com/PAIR-code/what-if-tool/pull/64

jameswex commented 4 years ago

tensorboard_plugin_wit-1.6.0.post3 has been uploaded to PyPI and includes a workaround for this issue.

wchargin commented 4 years ago

@jameswex: Thank you for the quick fix and release! Confirmed that this works in Colab when the new package is installed:

Screenshot of working TensorBoard 1.x in Colab with `pip install -U tensorboard-plugin-wit`

I’ll ask the Colab team to update the base image so that it works by default.

MeghnaNatraj commented 4 years ago

Thank you so much for the fix! Is there an ETA on when tensorboard_plugin_wit-1.6.0.post3 would be available in the default colab environment?

arya46 commented 4 years ago

I am still facing this issue. My tf version is 2.1. My pip list:


- absl-py 0.9.0 - alabaster 0.7.12 - albumentations 0.1.12 - altair 4.1.0 - asgiref 3.2.7 - astor 0.8.1 - astropy 4.0.1.post1 - astunparse 1.6.3 - atari-py 0.2.6 - atomicwrites 1.3.0 - attrs 19.3.0 - audioread 2.1.8 - autograd 1.3 - Babel 2.8.0 - backcall 0.1.0 - beautifulsoup4 4.6.3 - bleach 3.1.4 - blis 0.4.1 - bokeh 1.4.0 - boto3 1.12.35 - botocore 1.15.35 - Bottleneck 1.3.2 - branca 0.4.0 - bs4 0.0.1 - CacheControl 0.12.6 - cachetools 3.1.1 - catalogue 1.0.0 - certifi 2019.11.28 - cffi 1.14.0 - chainer 6.5.0 - chardet 3.0.4 - click 7.1.1 - cloudpickle 1.3.0 - cmake 3.12.0 - cmdstanpy 0.4.0 - colorlover 0.3.0 - community 1.0.0b1 - contextlib2 0.5.5 - convertdate 2.2.0 - coverage 3.7.1 - coveralls 0.5 - crcmod 1.7 - cufflinks 0.17.3 - cupy-cuda101 6.5.0 - cvxopt 1.2.4 - cvxpy 1.0.29 - cycler 0.10.0 - cymem 2.0.3 - Cython 0.29.16 - daft 0.0.4 - dask 2.12.0 - dataclasses 0.7 - datascience 0.10.6 - decorator 4.4.2 - defusedxml 0.6.0 - descartes 1.1.0 - dill 0.3.1.1 - distributed 1.25.3 - Django 3.0.5 - dlib 19.18.0 - docopt 0.6.2 - docutils 0.15.2 - dopamine-rl 1.0.5 - earthengine-api 0.1.217 - easydict 1.9 - ecos 2.0.7.post1 - editdistance 0.5.3 - en-core-web-sm 2.2.5 - entrypoints 0.3 - ephem 3.7.7.1 - et-xmlfile 1.0.1 - fa2 0.3.5 - fancyimpute 0.4.3 - fastai 1.0.60 - fastdtw 0.3.4 - fastprogress 0.2.2 - fastrlock 0.4 - fbprophet 0.6 - feather-format 0.4.0 - featuretools 0.4.1 - filelock 3.0.12 - firebase-admin 4.0.1 - fix-yahoo-finance 0.0.22 - Flask 1.1.1 - folium 0.8.3 - fsspec 0.7.1 - future 0.16.0 - gast 0.2.2 - GDAL 2.2.2 - gdown 3.6.4 - gensim 3.6.0 - geographiclib 1.50 - geopy 1.17.0 - gin-config 0.3.0 - glob2 0.7 - google 2.0.3 - google-api-core 1.16.0 - google-api-python-client 1.7.12 - google-auth 1.7.2 - google-auth-httplib2 0.0.3 - google-auth-oauthlib 0.4.1 - google-cloud-bigquery 1.21.0 - google-cloud-core 1.0.3 - google-cloud-datastore 1.8.0 - google-cloud-firestore 1.6.2 - google-cloud-language 1.2.0 - google-cloud-storage 1.18.1 - 
google-cloud-translate 1.5.0 - google-colab 1.0.0 - google-pasta 0.2.0 - google-resumable-media 0.4.1 - googleapis-common-protos 1.51.0 - googledrivedownloader 0.4 - graphviz 0.10.1 - grpcio 1.27.2 - gspread 3.0.1 - gspread-dataframe 3.0.5 - gym 0.17.1 - h5py 2.10.0 - HeapDict 1.0.1 - holidays 0.9.12 - html5lib 1.0.1 - httpimport 0.5.18 - httplib2 0.17.1 - httplib2shim 0.0.3 - humanize 0.5.1 - hyperopt 0.1.2 - ideep4py 2.0.0.post3 - idna 2.8 - image 1.5.28 - imageio 2.4.1 - imagesize 1.2.0 - imbalanced-learn 0.4.3 - imblearn 0.0 - imgaug 0.2.9 - importlib-metadata 1.6.0 - imutils 0.5.3 - inflect 2.1.0 - intel-openmp 2020.0.133 - intervaltree 2.1.0 - ipykernel 4.6.1 - ipython 5.5.0 - ipython-genutils 0.2.0 - ipython-sql 0.3.9 - ipywidgets 7.5.1 - itsdangerous 1.1.0 - jax 0.1.62 - jaxlib 0.1.42 - jdcal 1.4.1 - jedi 0.16.0 - jieba 0.42.1 - Jinja2 2.11.1 - jmespath 0.9.5 - joblib 0.14.1 - jpeg4py 0.1.4 - jsonschema 2.6.0 - jupyter 1.0.0 - jupyter-client 5.3.4 - jupyter-console 5.2.0 - jupyter-core 4.6.3 - kaggle 1.5.6 - kapre 0.1.3.1 - Keras 2.2.5 - Keras-Applications 1.0.8 - Keras-Preprocessing 1.1.0 - keras-vis 0.4.1 - kiwisolver 1.2.0 - knnimpute 0.1.0 - librosa 0.6.3 - lightgbm 2.2.3 - llvmlite 0.31.0 - lmdb 0.98 - lucid 0.3.8 - LunarCalendar 0.0.9 - lxml 4.2.6 - Markdown 3.2.1 - MarkupSafe 1.1.1 - matplotlib 3.2.1 - matplotlib-venn 0.11.5 - missingno 0.4.2 - mistune 0.8.4 - mizani 0.6.0 - mkl 2019.0 - mlxtend 0.14.0 - more-itertools 8.2.0 - moviepy 0.2.3.5 - mpmath 1.1.0 - msgpack 1.0.0 - multiprocess 0.70.9 - multitasking 0.0.9 - murmurhash 1.0.2 - music21 5.5.0 - natsort 5.5.0 - nbconvert 5.6.1 - nbformat 5.0.5 - networkx 2.4 - nibabel 3.0.2 - nltk 3.2.5 - notebook 5.2.2 - np-utils 0.5.12.1 - numba 0.47.0 - numexpr 2.7.1 - numpy 1.18.2 - nvidia-ml-py3 7.352.0 - oauth2client 4.1.3 - oauthlib 3.1.0 - okgrade 0.4.3 - opencv-contrib-python 4.1.2.30 - opencv-python 4.1.2.30 - openpyxl 2.5.9 - opt-einsum 3.2.0 - osqp 0.6.1 - packaging 20.3 - palettable 3.3.0 - pandas 
1.0.3 - pandas-datareader 0.8.1 - pandas-gbq 0.11.0 - pandas-profiling 1.4.1 - pandocfilters 1.4.2 - parso 0.6.2 - pathlib 1.0.1 - patsy 0.5.1 - pexpect 4.8.0 - pickleshare 0.7.5 - Pillow 7.0.0 - pip 19.3.1 - pip-tools 4.5.1 - plac 1.1.3 - plotly 4.4.1 - plotnine 0.6.0 - pluggy 0.7.1 - portpicker 1.3.1 - prefetch-generator 1.0.1 - preshed 3.0.2 - prettytable 0.7.2 - progressbar2 3.38.0 - prometheus-client 0.7.1 - promise 2.3 - prompt-toolkit 1.0.18 - protobuf 3.10.0 - psutil 5.4.8 - psycopg2 2.7.6.1 - ptvsd 5.0.0a12 - ptyprocess 0.6.0 - py 1.8.1 - pyarrow 0.14.1 - pyasn1 0.4.8 - pyasn1-modules 0.2.8 - pycocotools 2.0.0 - pycparser 2.20 - pydata-google-auth 0.3.0 - pydot 1.3.0 - pydot-ng 2.0.0 - pydotplus 2.0.2 - PyDrive 1.3.1 - pyemd 0.5.1 - pyglet 1.5.0 - Pygments 2.1.3 - pygobject 3.26.1 - pymc3 3.7 - PyMeeus 0.3.7 - pymongo 3.10.1 - pymystem3 0.2.0 - PyOpenGL 3.1.5 - pyparsing 2.4.6 - pyrsistent 0.16.0 - pysndfile 1.3.8 - PySocks 1.7.1 - pystan 2.19.1.1 - pytest 3.6.4 - python-apt 1.6.5+ubuntu0.2 - python-chess 0.23.11 - python-dateutil 2.8.1 - python-louvain 0.13 - python-slugify 4.0.0 - python-utils 2.4.0 - pytz 2018.9 - PyWavelets 1.1.1 - PyYAML 3.13 - pyzmq 17.0.0 - qtconsole 4.7.2 - QtPy 1.9.0 - regex 2019.12.20 - requests 2.21.0 - requests-oauthlib 1.3.0 - resampy 0.2.2 - retrying 1.3.3 - rpy2 3.2.7 - rsa 4.0 - s3fs 0.4.2 - s3transfer 0.3.3 - scikit-image 0.16.2 - scikit-learn 0.22.2.post1 - scipy 1.4.1 - screen-resolution-extra 0.0.0 - scs 2.1.2 - seaborn 0.10.0 - Send2Trash 1.5.0 - setuptools 46.1.3 - setuptools-git 1.2 - Shapely 1.7.0 - simplegeneric 0.8.1 - six 1.12.0 - sklearn 0.0 - sklearn-pandas 1.8.0 - smart-open 1.10.0 - snowballstemmer 2.0.0 - sortedcontainers 2.1.0 - spacy 2.2.4 - Sphinx 1.8.5 - sphinxcontrib-websupport 1.2.1 - SQLAlchemy 1.3.15 - sqlparse 0.3.1 - srsly 1.0.2 - statsmodels 0.10.2 - sympy 1.1.1 - tables 3.4.4 - tabulate 0.8.7 - tblib 1.6.0 - tensorboard 2.1.1 - tensorboard-plugin-wit 1.6.0.post2 - tensorboardcolab 0.0.22 - 
tensorflow 2.1.0 - tensorflow-addons 0.8.3 - tensorflow-datasets 2.1.0 - tensorflow-estimator 2.1.0 - tensorflow-gcs-config 2.1.8 - tensorflow-hub 0.8.0 - tensorflow-metadata 0.21.1 - tensorflow-privacy 0.2.2 - tensorflow-probability 0.9.0 - termcolor 1.1.0 - terminado 0.8.3 - testpath 0.4.4 - text-unidecode 1.3 - textblob 0.15.3 - textgenrnn 1.4.1 - Theano 1.0.4 - thinc 7.4.0 - toolz 0.10.0 - torch 1.4.0 - torchsummary 1.5.1 - torchtext 0.3.1 - torchvision 0.5.0 - tornado 4.5.3 - tqdm 4.38.0 - traitlets 4.3.3 - tweepy 3.6.0 - typeguard 2.7.1 - typing 3.6.6 - typing-extensions 3.6.6 - tzlocal 1.5.1 - umap-learn 0.3.10 - uritemplate 3.0.1 - urllib3 1.24.3 - vega-datasets 0.8.0 - wasabi 0.6.0 - wcwidth 0.1.9 - webencodings 0.5.1 - Werkzeug 1.0.1 - wheel 0.34.2 - widgetsnbextension 3.5.1 - wordcloud 1.5.0 - wrapt 1.12.1 - xarray 0.15.1 - xgboost 0.90 - xkit 0.0.0 - xlrd 1.1.0 - xlwt 1.3.0 - yellowbrick 0.9.1 - zict 2.0.0 - zipp 3.1.0

I tried to run !pip install tensorboard_plugin_wit

it produced the output:

Requirement already satisfied: tensorboard_plugin_wit in /usr/local/lib/python3.6/dist-packages (1.6.0.post2)

But TensorBoard is still throwing the error. How can I fix or work around this issue?

wchargin commented 4 years ago

@MeghnaNatraj: This should roll out in the next few days. (The change has been submitted internally and just needs to be deployed.)

@arya46: You almost got it :-) !pip install -U tensorboard_plugin_wit, with -U for “upgrade”.
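To tell whether an installed copy already carries the workaround, the version string can be compared against 1.6.0.post3. The sketch below is a deliberately small parser that only understands versions shaped like `X.Y.Z` or `X.Y.Z.postN`; `parse_version` and `has_workaround` are hypothetical helpers, and a full PEP 440 parser such as `packaging.version.Version` would be more robust if it is available.

```python
import re


def parse_version(text):
    """Parse 'X.Y.Z' or 'X.Y.Z.postN' into a comparable (release, post) pair."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\.?post(\d+))?", text)
    if match is None:
        raise ValueError("unsupported version string: %r" % text)
    release = tuple(int(part) for part in match.group(1).split("."))
    # Drop trailing zeros so that 1.6 and 1.6.0 compare equal.
    while len(release) > 1 and release[-1] == 0:
        release = release[:-1]
    # A missing post-release sorts before .post0.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return release, post


def has_workaround(installed, minimum="1.6.0.post3"):
    """True if the installed version is at least the release with the fix."""
    return parse_version(installed) >= parse_version(minimum)


print(has_workaround("1.6.0.post2"))  # False
print(has_workaround("1.6.0.post3"))  # True
```

This also shows why a plain `pip install` did not help here: 1.6.0.post2 was already installed, so pip reported the requirement as satisfied, and only `-U` asks it to look for something newer.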

arya46 commented 4 years ago

@wchargin Thank you for pointing out my mistake. Yes, it did solve the issue.

MeghnaNatraj commented 4 years ago

@wchargin thank you so much for the update! :)

ghost commented 4 years ago

@wchargin That did it for me as well, after days of suffering ;) Was this caused by a Google Colab update? And if this problem affected all TF 1.x users using TensorBoard (which would be quite a lot, I can imagine), wouldn't it have been fixed or reverted already? I heard you say the patch will be released soon?

wchargin commented 4 years ago

@jandevries123: It’s an issue due to how the %tensorflow_version magic works, which is as follows:

The Colab images have TensorFlow 2.x installed to the default Python path, and also have TensorFlow 1.x installed under a separate directory that’s not on the default path.

When you run %tensorflow_version 1.x, your PATH, PYTHONPATH, and sys.path are updated to prepend the 1.x directory. When you run %tensorflow_version 2.x, it’s popped off the path. Thus:

import os
print(os.environ["PATH"])
%tensorflow_version 1.x
print(os.environ["PATH"])
%tensorflow_version 2.x
print(os.environ["PATH"])

/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin:/opt/bin
TensorFlow 1.x selected.
/tensorflow-1.15.2/python3.6/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin:/opt/bin
TensorFlow 2.x selected.
/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin:/opt/bin

Consequently, in TensorFlow 1.x mode, actually both versions of TensorFlow are on the path, but the 1.x packages are earlier in the path, so when there is a conflict the 1.x packages take precedence, as desired.
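To see which copy of a module actually wins in a given runtime, something along these lines may help. It is a sketch using the standard `importlib.util.find_spec`; `module_origin` is a made-up helper, and the module names passed to it are only examples.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no spec.
        return None
    if spec is None:
        return None
    return spec.origin


# Example names; in the situation described here, the interesting ones
# would be "tensorboard" and "tensorboard_plugin_wit".
for name in ("json", "tensorboard", "tensorboard_plugin_wit"):
    print(name, "->", module_origin(name))
```

In 1.x mode one would expect `tensorboard` to resolve under the 1.x directory, while a package that exists only in the default environment would still resolve there.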

This works fine as long as all you care about is changing the versions of installed packages in the two environments without changing which packages are installed. The problem occurs when there is a package installed in the 2.x environment that must not be available in the 1.x environment: prepending the 1.x directory to the path won’t actually remove such a package.
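The effect can be reproduced without TensorFlow at all. The sketch below builds two throwaway directories of one-line modules and prepends one of them to `sys.path`; the module names `demo_board` and `demo_plugin_wit` and the directory names are invented for the illustration and only loosely mirror what the magic does.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_env(root, modules):
    """Create a directory of one-line modules from {module_name: version}."""
    root.mkdir()
    for name, version in modules.items():
        (root / f"{name}.py").write_text(f"VERSION = {version!r}\n")
    return str(root)


tmp = Path(tempfile.mkdtemp())
# The default environment has both packages; the 1.x one has only demo_board.
env_2x = make_env(tmp / "env_2x", {"demo_board": "2.x", "demo_plugin_wit": "2.x only"})
env_1x = make_env(tmp / "env_1x", {"demo_board": "1.x"})

sys.path.insert(0, env_2x)  # default environment
sys.path.insert(0, env_1x)  # roughly what selecting 1.x does: prepend
importlib.invalidate_caches()

demo_board = importlib.import_module("demo_board")
demo_plugin_wit = importlib.import_module("demo_plugin_wit")

print(demo_board.VERSION)       # 1.x  (the prepended directory wins)
print(demo_plugin_wit.VERSION)  # 2.x only  (still importable)
```

Prepending changes which copy of `demo_board` is found, but it cannot hide `demo_plugin_wit`, which is the same shape as the problem described above.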

The tensorboard-plugin-wit==1.6.0.post2 package falls into this category. The tensorboard-plugin-wit==1.6.0.post3 package does not: it’s compatible with both 1.x and 2.x. So what we did was just update the Colab base image to use 1.6.0.post3 instead of 1.6.0.post2. The package will still be available in both environments, but it won’t cause any problems.

wchargin commented 4 years ago

Should be deployed in prod:

Screenshot of repro, now working

prataplyf commented 4 years ago

pip install -q tf-estimator-nightly==1.14.0.dev2019072901 tf-nightly-gpu==1.15.0.dev20190729

Hi wchargin, I tried to install and reinstall tensorflow==1.15, but it gives me an error like:

ERROR: Could not find a version that satisfies the requirement tf-nightly-gpu==1.15.0.dev20190729 (from versions: 2.4.0.dev20200903, 2.4.0.dev20200904, 2.4.0.dev20200905, 2.4.0.dev20200906, 2.4.0.dev20200907, 2.4.0.dev20200908, 2.4.0.dev20200911, 2.4.0.dev20200912, 2.4.0.dev20200913, 2.4.0.dev20200914, 2.4.0.dev20200915, 2.4.0.dev20200916, 2.4.0.dev20200917, 2.4.0.dev20200918, 2.4.0.dev20200919, 2.4.0.dev20200920, 2.4.0.dev20200921, 2.4.0.dev20200922, 2.4.0.dev20200923, 2.4.0.dev20200924, 2.4.0.dev20200925, 2.4.0.dev20200926, 2.4.0.dev20200927, 2.4.0.dev20200928, 2.4.0.dev20200929, 2.4.0.dev20200930, 2.4.0.dev20201001, 2.4.0.dev20201002, 2.4.0.dev20201003, 2.4.0.dev20201004, 2.4.0.dev20201005, 2.4.0.dev20201007, 2.4.0.dev20201008, 2.4.0.dev20201010, 2.4.0.dev20201011, 2.4.0.dev20201012, 2.4.0.dev20201014, 2.4.0.dev20201015, 2.4.0.dev20201016, 2.4.0.dev20201017, 2.4.0.dev20201018, 2.4.0.dev20201019, 2.4.0.dev20201020, 2.4.0.dev20201021, 2.4.0.dev20201022, 2.4.0.dev20201023, 2.5.0.dev20201024, 2.5.0.dev20201025, 2.5.0.dev20201026)
ERROR: No matching distribution found for tf-nightly-gpu==1.15.0.dev20190729

I am using Python 3.8

wchargin commented 4 years ago

You’re trying to use a very old tf-nightly, from more than a year ago, with a version of Python that wasn’t even released at that time. That’s why there’s no matching version.

You should no longer need a workaround. I just checked and this issue is still fixed in prod:

Screenshot of `%tensorflow_version 1.x; %load_ext tensorboard; %tensorboard --logdir /tmp/logs` working in Colab

tensorflow / tensorboard

`%tensorboard` doesn’t work with `%tensorflow_version 1.x` (duplicate plugins for name whatif) #3460