Open asorie opened 5 years ago
@Asorie Can you please try running this script in your notebook and let me know if you are facing the same issue?
I tried the script, but in section 4 the line %tensorboard --logdir logs/hparam_tuning produces an error:
ERROR: Failed to launch TensorBoard (exited with 1).
Contents of stderr:
Traceback (most recent call last):
  File "/usr/local/bin/tensorboard", line 10, in <module>
    sys.exit(run_main())
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/main.py", line 64, in run_main
    app.run(tensorboard.main, flags_parser=tensorboard.configure)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 220, in main
    server = self._make_server()
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/program.py", line 299, in _make_server
    self.assets_zip_provider)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 160, in standard_tensorboard_wsgi
    flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 228, in TensorBoardWSGIApp
    return TensorBoardWSGI(tbplugins, flags.path_prefix)
  File "/usr/local/lib/python3.6/dist-packages/tensorboard/backend/application.py", line 279, in __init__
    raise ValueError('Duplicate plugins for name %s' % plugin.plugin_name)
ValueError: Duplicate plugins for name projector
That is probably because there are multiple versions of TensorBoard installed on your system. Please find my GitHub gist here
I am able to see all the hyperparameters in TensorBoard using TensorFlow 2.0. There might be an issue with your TensorBoard installation. Please try to run the same script on your system and see whether the hparams are displayed or not. Thanks!
The script works. I think the problem is that I tried to add a new HP and write the logs to a logdir that TensorBoard had already used.
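To check the "multiple versions" theory, one option is to list which installed distributions provide TensorBoard. This is only a minimal sketch: the set of distribution names below (tensorboard, tb-nightly, tensorflow-tensorboard) is an assumption about which packages ship a tensorboard module, and both helper functions are hypothetical names, not part of TensorBoard.

```python
from importlib import metadata

# Assumed list of distributions that each ship a `tensorboard` package.
TENSORBOARD_DISTS = {"tensorboard", "tb-nightly", "tensorflow-tensorboard"}


def tensorboard_providers(installed_names):
    """Return the sorted names from `installed_names` that provide TensorBoard."""
    normalized = {name.lower().replace("_", "-") for name in installed_names}
    return sorted(normalized & TENSORBOARD_DISTS)


def installed_distribution_names():
    """List the names of all distributions visible to this interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            names.append(name)
    return names


if __name__ == "__main__":
    found = tensorboard_providers(installed_distribution_names())
    if len(found) > 1:
        print("Multiple TensorBoard installs found:", ", ".join(found))
    else:
        print("TensorBoard providers found:", found)
```

If more than one name is reported, uninstalling all of them and reinstalling a single one is the usual way to get rid of the duplicate plugin error.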
Yes. So, I think the problem here is resolved?
Not really. I think TensorBoard should look at the HPs used and add new ones to the table whenever a new HP is found.
If this line:
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd']))
gets changed to:
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'RMSprop']))
and the model is then trained into the same logdir, TensorBoard doesn't add this new model to the hparams table.
So it isn't possible to dynamically change the possible hparams in the same logdir?
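Until this is resolved, a possible workaround (not confirmed anywhere in this thread) is to never reuse a logdir once the hparam definitions have changed. The sketch below derives the directory name from the definitions themselves; logdir_for_schema and the "schema-" prefix are hypothetical names, and whether a fresh directory actually avoids the missing rows is untested here.

```python
import hashlib
import json
import os


def logdir_for_schema(base_dir, schema):
    """Return a log directory whose name depends on the hparam schema.

    `schema` maps each hparam name to its list of allowed values.
    """
    # Sort keys and values so that equal schemas always hash the same way.
    canonical = json.dumps(
        {name: sorted(map(str, values)) for name, values in schema.items()},
        sort_keys=True,
    )
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:8]
    return os.path.join(base_dir, "schema-" + digest)


old = logdir_for_schema("logs/hparam_tuning", {"optimizer": ["adam", "sgd"]})
new = logdir_for_schema("logs/hparam_tuning", {"optimizer": ["adam", "sgd", "RMSprop"]})
print(old)
print(new)  # a different directory, because the domain changed
```

TensorBoard would then be pointed at one of these subdirectories instead of the shared parent.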
I think I'm facing the same issue. Any updates here?
This issue, in particular, https://github.com/tensorflow/tensorboard/issues/2743#issuecomment-542057891, very much reminds me of #3597. There, the problem is that mixed-type (string + float, meaning some models use a string value, others a numerical value) parameters are all cast to string, but the filter in list_session_groups.py doesn't take that casting into account - it looks for 2.0 and doesn't find "2.0". As a result, only models with string parameter values are found - the other ones just don't show up. I have never used hp.HParam myself, so I cannot say if the two HP_OPTIMIZERs are seen as different types, but it sure feels like a similar issue.
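The mismatch described above can be illustrated with a toy example. This is not the code from list_session_groups.py; the function names and the data are made up purely to show why a float filter misses values that were stored as strings.

```python
def matches_exact(stored_value, filter_value):
    """Strict comparison: the stored string "2.0" never equals the float 2.0."""
    return stored_value == filter_value


def matches_cast_aware(stored_value, filter_value):
    """Compare string forms so that a value cast to str can still be found."""
    return str(stored_value) == str(filter_value)


# Toy mixed-type column after everything has been cast to str.
stored = [str(v) for v in ["auto", 2.0, 0.5]]

print([v for v in stored if matches_exact(v, 2.0)])       # prints []
print([v for v in stored if matches_cast_aware(v, 2.0)])  # prints ['2.0']
```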
I'm having the same issue. I am using torch + PPO in rllib, and only half of my hyperparams show up in TensorBoard.
I've had the same issue with TensorboardX; the reason was that the metric name contained whitespace.
I hit the same issue: it appears when the number of hparams gets large.
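If whitespace in metric names is the cause in your case, the names can be cleaned before logging. A minimal sketch, assuming that replacing whitespace with underscores is acceptable for your tags; sanitize_tag is a hypothetical helper, not a TensorboardX function.

```python
import re


def sanitize_tag(name):
    """Replace each run of whitespace in a metric name with one underscore."""
    return re.sub(r"\s+", "_", name.strip())


print(sanitize_tag("val accuracy"))  # prints val_accuracy
```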
Still an issue for me. Really annoying. Anyone have a solution?
I guess I will try to write the hparams structure to a separate file and replace that file every time I change something. Not sure this works, though.
The original issue description here suggests the issue appears when "many hparams" are used. Then later it seems to be that users are trying to "add new HP and write the logs to an already used tensorboard".
So I'm not sure I'm understanding what the issue is. Are you logging more hparams data to the same log dir, and you want TB to read it? Does starting tensorboard again like tensorboard --logdir path/to/logs show everything you want to see? Do you have a small example to reproduce the issue?
@arcra it probably covers only one aspect of this issue, but https://github.com/tensorflow/tensorboard/issues/3597#issuecomment-1490793918 has very specific repro steps that I created "only" 7 months ago ("only" compared to the 4 years that this issue has been open).
Environment information (required)
Diagnostics output
```
--- check: autoidentify
INFO: diagnose_tensorboard.py version 4725c70c7ed724e2d1b9ba5618d7c30b957ee8a4
--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)
INFO: os.name: nt
INFO: os.uname(): N/A
INFO: sys.getwindowsversion(): sys.getwindowsversion(major=10, minor=0, build=14393, platform=2, service_pack='')
--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: 'C:\\tensorflow_anduin'
--- check: installed_packages
INFO: installed: tensorboard==2.0.0
INFO: installed: tensorflow-gpu==2.0.0
INFO: installed: tensorflow-estimator==2.0.0
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '2.0.0'
--- check: tensorflow_python_version
2019-10-08 14:40:42.620638: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
INFO: tensorflow.__version__: '2.0.0'
INFO: tensorflow.__git_version__: 'v2.0.0-rc2-26-g64c3d382ca'
--- check: tensorboard_binary_path
INFO: which tensorboard: b'C:\\tensorflow_anduin\\Scripts\\tensorboard.exe\r\n'
--- check: readable_fqdn
INFO: socket.getfqdn(): '...'
--- check: stat_tensorboardinfo
INFO: directory: C:\Users\halle\AppData\Local\Temp\.tensorboard-info
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=3096224744103339, st_dev=2217911477, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1570538160, st_mtime=1570538160, st_ctime=1562760637)
INFO: mode: 0o40777
--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['C:\\tensorflow_anduin\\lib\\site-packages']; bad_roots (0): []
--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==0.7.1
adal==1.2.2
asn1crypto==0.24.0
astor==0.8.0
astroid==2.2.5
avro-python3==1.9.1
azure-common==1.1.23
azure-graphrbac==0.53.0
azure-keyvault==1.1.0
azure-mgmt-authorization==0.51.1
azure-mgmt-containerregistry==2.7.0
azure-mgmt-keyvault==1.1.0
azure-mgmt-msi==0.2.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-resource==2.2.0
azure-mgmt-storage==3.1.1
azure-nspkg==3.0.2
azure-storage-blob==1.5.0
azure-storage-common==1.4.2
blinker==1.4
boto3==1.9.238
botocore==1.12.238
cachetools==3.1.1
certifi==2019.9.11
cffi==1.12.3
chardet==3.0.4
Click==7.0
click-completion==0.5.1
clipboard==0.0.4
colorama==0.3.9
cryptography==2.7
cycler==0.10.0
docker==3.7.3
docker-pycreds==0.4.0
docutils==0.15.2
Flask==1.1.1
flatten-json==0.1.7
gast==0.2.2
gitdb2==2.0.6
GitPython==2.1.14
google-api-core==1.14.2
google-auth==1.6.3
google-cloud-core==1.0.3
google-cloud-kms==1.2.1
google-cloud-storage==1.20.0
google-pasta==0.1.7
google-resumable-media==0.4.1
googleapis-common-protos==1.6.0
grpc-google-iam-v1==0.12.3
grpcio==1.22.0
h5py==2.9.0
httplib2==0.14.0
humanize==0.5.1
idna==2.8
imageio==2.5.0
isodate==0.6.0
isort==4.3.21
itsdangerous==1.1.0
Jinja2==2.10.1
jmespath==0.9.4
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
lazy-object-proxy==1.4.1
lockfile==0.12.2
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.1
mccabe==0.6.1
missinglink==19.9.26557
missinglink-kernel==19.9.26893
missinglink-sdk==19.9.26893
ml-core==19.9.3999
ml-crypto==0.7.811
ml-legit==19.9.8734
msgpack==0.6.2
msrest==0.6.10
msrestazure==0.6.2
mypy==0.711
mypy-extensions==0.4.1
natsort==6.0.0
netifaces==0.10.9
numpy==1.17.2
oauthlib==3.1.0
opt-einsum==2.3.2
pandas==0.25.1
patsy==0.5.1
pep8==1.7.1
Pillow==6.1.0
pip==19.2.3
ply==3.11
protobuf==3.8.0
psutil==5.6.3
puremagic==1.5
pyasn1==0.4.7
pyasn1-modules==0.2.6
pycparser==2.19
pycryptodome==3.6.6
Pygments==2.4.2
PyJWT==1.7.1
pylint==2.3.1
pyparsing==2.4.0
pyperclip==1.7.0
pypiwin32==223
python-dateutil==2.8.0
pytz==2019.2
pywin32==225
PyYAML==5.1.1
requests==2.22.0
requests-oauthlib==1.2.0
retrying==1.3.3
rope==0.14.0
rsa==4.0
s3transfer==0.2.1
scipy==1.3.0
sentry-sdk==0.11.2
setuptools==41.0.1
shellingham==1.3.1
six==1.12.0
smmap2==2.0.5
sseclient==0.0.24
statsmodels==0.10.1
tensorboard==2.0.0
tensorflow-estimator==2.0.0
tensorflow-gpu==2.0.0
termcolor==1.1.0
terminaltables==3.1.0
tqdm==4.32.2
typed-ast==1.4.0
urllib3==1.24.3
wcwidth==0.1.7
websocket-client==0.56.0
Werkzeug==0.16.0
wheel==0.33.4
wrapt==1.11.2
```
Issue description
If I use many hparams (e.g. 14) in TensorBoard, the table doesn't display any results, but the table head is displayed correctly.
But when I delete some of the rows in the HPARAMS section, the row in the hparams table and the accuracy are displayed correctly.