tensorflow / tensorboard

TensorFlow's Visualization Toolkit
Apache License 2.0

Fast data loading feedback (`--load_fast=true`; “RustBoard”) #4784

Open wchargin opened 3 years ago

wchargin commented 3 years ago

This thread is for tracking feedback about TensorBoard’s experimental mode for fast data loading. Typical speedups range from 100× to 400×.

Who should try this: Anyone who’s found TensorBoard’s data loading to be slower than they’d like.

Who shouldn’t try this: Windows users (for now).

Feedback: Feedback form, or reply on this thread.

Try it out

To try this out, please uninstall all copies of TensorBoard and then install the latest version of tb-nightly:

pip uninstall -y tensorboard tb-nightly &&
pip install tb-nightly  # must have at least tb-nightly==2.5.0a20210316

Then, invoke TensorBoard with the --load_fast=true flag:

tensorboard --logdir /path/to/logs --load_fast=true

Use TensorBoard as you usually would. It should work the same way, just faster.
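If you start TensorBoard from a script rather than a shell, the same invocation can be sketched in Python. This is only an illustration: `build_tensorboard_command` and `launch_tensorboard` are hypothetical helper names, not part of TensorBoard, and the sketch assumes the `tensorboard` executable is on your PATH. The only flags used are ones mentioned in this thread (`--logdir`, `--load_fast`, `--reload_interval`).

```python
import shlex
import subprocess
from typing import List, Optional


def build_tensorboard_command(
    logdir: str,
    load_fast: str = "true",
    reload_interval: Optional[int] = None,
) -> List[str]:
    """Build the argv list for a TensorBoard run (hypothetical helper)."""
    command = ["tensorboard", "--logdir", logdir, f"--load_fast={load_fast}"]
    if reload_interval is not None:
        # Optional: forward a reload interval in seconds.
        command.append(f"--reload_interval={reload_interval}")
    return command


def launch_tensorboard(logdir: str) -> subprocess.Popen:
    """Start TensorBoard in the background; assumes `tensorboard` is on PATH."""
    return subprocess.Popen(build_tensorboard_command(logdir))


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_tensorboard_command("/path/to/logs")))
```

Keeping the command builder separate from the launcher makes it easy to check the exact flags before anything is started.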

Feedback

You can respond to this anonymous Google Form, or reply on this thread, or open a new issue. Let us know: did it work? how much faster was it? any suggestions or requests?

Known issues

We know about these, but please let us know if they matter for you, so that we can prioritize working on them:

FAQ

What does “data loading” include?

It includes time spent reading files in your logdir. It does not include time spent painting charts on the frontend.

What is the --load_fast flag?

Pass --load_fast=true to tell TensorBoard to use a new data loading mechanism, which is generally hundreds of times faster.

Is --load_fast=true right for me?

Currently, this mode is supported on Linux and macOS. If you are interested in using it on other platforms, ping @wchargin and I’ll show you how to build it.

Most features of TensorBoard are expected to work with the new data loading mechanism. All standard TensorBoard dashboards (scalars, images, etc.) should work, and flags like --reload_interval should work, too. You can use logdirs on local disk or on GCS buckets (public or private).

Do I need to have TensorFlow installed?

No.

What’s happening under the hood?

Instead of crawling your logdir in a mixture of Python and C++ code with a lot of locking, cross-language marshalling, and slow data manipulation in Python, we read the data in a dedicated subprocess. This program is written in Rust and is optimized for concurrent reading and serving. More design details here.

zerzerzerz commented 7 months ago

I would like to share how I solved the following error:

Option --load_fast=true not available: TensorBoard data server not supported on this platform.

My setup is Ubuntu 18.04.5 LTS, Python 3.10.14, and tensorboard 2.12.1. The problem is caused by a missing tensorboard-data-server binary. At first, I installed the package with pip:

pip install tensorboard-data-server

The install succeeds, but the following Python code prints None:

import tensorboard_data_server

res = tensorboard_data_server.server_binary()
print(res)

It seems that the tensorboard-data-server binary was not installed properly. So I installed it with conda instead:

conda install tensorboard
conda install chardet

After installation, I ran the above Python code again, and it printed the path to the tensorboard-data-server binary, for example /home/<username>/miniconda3/envs/py310/lib/python3.10/site-packages/tensorboard_data_server/bin/server. It seems that pip did not install the tensorboard-data-server binary, but conda did.

Finally, I can run tensorboard --logdir=<path/to/logdir> --load_fast=true, and it is much faster than before.
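The check described above can be wrapped into a small diagnostic. This is a minimal sketch, assuming only the `tensorboard_data_server.server_binary()` call shown earlier in this comment; `find_data_server_binary` and `describe_data_server` are hypothetical helper names, and the sketch treats a missing package the same way as a missing binary.

```python
import importlib
from typing import Optional


def find_data_server_binary() -> Optional[str]:
    """Return the data server binary path, or None if it is unavailable."""
    try:
        module = importlib.import_module("tensorboard_data_server")
    except ImportError:
        # The package itself is not installed in this environment.
        return None
    # server_binary() gives the path to the bundled binary, or None
    # when the installed package does not ship one.
    return module.server_binary()


def describe_data_server() -> str:
    """Summarize whether --load_fast=true is likely to be usable here."""
    path = find_data_server_binary()
    if path is None:
        return "no data server binary found; --load_fast=true is likely unavailable"
    return f"data server binary found at {path}"


if __name__ == "__main__":
    print(describe_data_server())
```

Running this in the same environment as TensorBoard before and after reinstalling should show whether the binary became available.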

valerie-lth commented 7 months ago

I'm using Chrome on a MacBook, and I get these errors:

[two screenshots]

The localhost page either shows nothing or empty grids when --load_fast=false.

It shows the plots in the grids when --load_fast=false, but the error messages persist.

rajnish159 commented 6 months ago

2024-04-16 13:41:07.757664: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2024-04-16 13:41:07.757723: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2024-04-16 13:41:07.801242: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-16 13:41:09.133758: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2024-04-16 13:41:09.133903: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2024-04-16 13:41:09.133924: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2024-04-16 13:41:11.219441: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.219624: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.219723: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223136: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223275: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223359: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2024-04-16 13:41:11.223383: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1934] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
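When triaging a log like the one above, it can help to reduce it to the list of shared libraries that failed to load. The following is a rough sketch, not an official tool: `missing_libraries` is a hypothetical helper, and the regular expression simply matches the "Could not load dynamic library '...'" wording that appears in these messages.

```python
import re
from typing import List

# Matches the library name quoted in each "Could not load dynamic library" warning.
MISSING_LIBRARY = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text: str) -> List[str]:
    """Return the distinct library names reported as missing, in first-seen order."""
    seen: List[str] = []
    for name in MISSING_LIBRARY.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    sample = (
        "W dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; "
        "dlerror: libcudart.so.11.0: cannot open shared object file\n"
        "W dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; "
        "dlerror: libcudnn.so.8: cannot open shared object file\n"
    )
    for name in missing_libraries(sample):
        print(name)
```

The messages themselves say the cudart warning can be ignored on a machine without a GPU set up, so a list like this mainly helps separate GPU library noise from any problem with data loading.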

bhack commented 2 months ago

It is not usable with Google gcsfuse. See https://github.com/tensorflow/tensorboard/issues/6790