Open wchargin opened 3 years ago
I would like to share how I solved this error:
Option --load_fast=true not available: TensorBoard data server not supported on this platform.
My setup is Ubuntu 18.04.5 LTS, Python 3.10.14, and tensorboard 2.12.1.
The problem is caused by a missing tensorboard-data-server binary. At first, I used pip to install it:
pip install tensorboard-data-server
The install succeeded, but the following Python code printed None:
import tensorboard_data_server
res = tensorboard_data_server.server_binary()
print(res)
It seems that the tensorboard-data-server binary was not installed properly.
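The check above can be wrapped in a small helper that separates two cases: the package is missing entirely, or it is installed but reports no binary. This is only a sketch; `describe_data_server` is a hypothetical name, and the only library call it relies on is `tensorboard_data_server.server_binary()` from the snippet above.

```python
import importlib


def describe_data_server():
    """Return a one-line status for the tensorboard-data-server binary."""
    try:
        # Imported lazily so the helper still runs when the package is absent.
        module = importlib.import_module("tensorboard_data_server")
    except ImportError:
        return "tensorboard_data_server is not installed"

    # As observed above, server_binary() returns None when no binary is
    # available, and a filesystem path otherwise.
    binary = module.server_binary()
    if binary is None:
        return "package installed, but no server binary was found"
    return f"server binary found at {binary}"


print(describe_data_server())
```

Running this once after the pip install and once after the conda install should show which of the two cases applies in each environment.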
So I used conda to install it instead:
conda install tensorboard
conda install chardet
After installation, I ran the Python code above again, and it printed the path to the tensorboard-data-server binary, for example /home/<username>/miniconda3/envs/py310/lib/python3.10/site-packages/tensorboard_data_server/bin/server
It seems that, on this setup, pip cannot install the tensorboard-data-server binary, but conda can.
Finally, I can run tensorboard --logdir=<path/to/logdir> --load_fast=true, and it is much faster than before.
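Putting the two observations together, a launcher can choose the flag value based on whether the binary is present. This is a minimal sketch, assuming `tensorboard` is on `PATH`; `data_server_available` and `build_tensorboard_command` are hypothetical helper names, and the logdir is a placeholder.

```python
import importlib
import shlex


def data_server_available():
    """True if tensorboard_data_server is importable and reports a binary."""
    try:
        module = importlib.import_module("tensorboard_data_server")
    except ImportError:
        return False
    return module.server_binary() is not None


def build_tensorboard_command(logdir, fast=None):
    """Build the argv list for TensorBoard; fast=None means auto-detect."""
    if fast is None:
        fast = data_server_available()
    flag = "true" if fast else "false"
    return ["tensorboard", f"--logdir={logdir}", f"--load_fast={flag}"]


# Placeholder logdir; pass the list to subprocess.run(...) to actually launch.
command = build_tensorboard_command("path/to/logdir")
print(shlex.join(command))
```

The sketch only builds and prints the command, so it is safe to run in an environment where TensorBoard is not installed.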
I'm using Chrome on a MacBook, and I get these errors:
The localhost page either shows nothing or empty grids when --load_fast=false.
It shows the plots in the grids when --load_fast=false, but the error messages persist.
2024-04-16 13:41:07.757664: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2024-04-16 13:41:07.757723: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2024-04-16 13:41:07.801242: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-16 13:41:09.133758: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2024-04-16 13:41:09.133903: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2024-04-16 13:41:09.133924: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2024-04-16 13:41:11.219441: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.219624: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.219723: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223136: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223275: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223359: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223383: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1934] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
The fast data loading mode is not usable with Google gcsfuse. See https://github.com/tensorflow/tensorboard/issues/6790
This thread is for tracking feedback about TensorBoard’s experimental mode for fast data loading. Typical speedups range from 100× to 400×.
Who should try this: Anyone who’s found TensorBoard’s data loading to be slower than they’d like.
Who shouldn’t try this: Windows users (for now).
Feedback: Feedback form, or reply on this thread.
Try it out
To try this out, please uninstall all copies of TensorBoard and then install the latest version of tb-nightly.
Then, invoke TensorBoard with the --load_fast=true flag.
Use TensorBoard as you usually would. It should work the same way, just faster.
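The install and launch steps above might be scripted as follows. This is a dry-run sketch that only prints the commands. The distribution names passed to uninstall (`tensorboard` and `tb-nightly`), the use of `python -m pip`, the hypothetical helper name `try_it_out_commands`, and the placeholder logdir are assumptions, not the exact commands from the original post.

```python
import shlex
import sys


def try_it_out_commands(logdir):
    """Return the reinstall and launch commands without running them."""
    pip = [sys.executable, "-m", "pip"]
    return [
        # Remove both the stable and nightly distributions (assumed names).
        pip + ["uninstall", "-y", "tensorboard", "tb-nightly"],
        # Install the latest nightly build.
        pip + ["install", "--upgrade", "tb-nightly"],
        # Launch with the fast data loading mode enabled.
        ["tensorboard", f"--logdir={logdir}", "--load_fast=true"],
    ]


for command in try_it_out_commands("path/to/logdir"):
    print(shlex.join(command))
```

Printing instead of executing keeps the sketch side-effect free; each list can be passed to `subprocess.run(..., check=True)` once the commands look right for your environment.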
Feedback
You can respond to this anonymous Google Form, or reply on this thread, or open a new issue. Let us know: did it work? how much faster was it? any suggestions or requests?
Known issues
We know about these, but please let us know if they matter for you, so that we can prioritize working on them:
FAQ
What does “data loading” include?
It includes time spent reading files in your logdir. It does not include time spent painting charts on the frontend.
What is the --load_fast flag?
Pass --load_fast=true to tell TensorBoard to use a new data loading mechanism, which is generally hundreds of times faster.
Is --load_fast=true right for me?
Currently, this mode is supported on Linux and macOS. If you are interested in using it on other platforms, ping @wchargin and I’ll show you how to build it.
Most features of TensorBoard are expected to work with the new data loading mechanism. All standard TensorBoard dashboards (scalars, images, etc.) should work, and flags like --reload_interval should work, too. You can use logdirs on local disk or on GCS buckets (public or private).
Do I need to have TensorFlow installed?
No.
What’s happening under the hood?
Instead of crawling your logdir in a mixture of Python and C++ code with a lot of locking, cross-language marshalling, and slow data manipulation in Python, we read the data in a dedicated subprocess. This program is written in Rust and is optimized for concurrent reading and serving. More design details here.