tensorflow / tensorboard

TensorFlow's Visualization Toolkit
Apache License 2.0

Tensorboard has no content to display #6273

Closed FADEAWAY114 closed 1 year ago

FADEAWAY114 commented 1 year ago

When I run tensorboard, I get the following error.

^C(parking2) vmware@vmware-virtual-machine:~/epymarl/results/tb_logs/qmix_seed694755480_rware:rware-tiny-2ag-v1_2023-03-25 12:10:14.135647$ tensorboard --logdir="/home/vmware/epymarl/results/tb_logs/qmix_seed694iny-2ag-v1_2023-03-25 12:10:14.135647"
2023-03-26 12:17:20.887915: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784

Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.11.2 at http://localhost:6006/ (Press CTRL+C to quit)

yatbear commented 1 year ago

Hi @FADEAWAY114,

It looks like the log directory (passed to --logdir) doesn't contain any events files (named in the format events.out.tfevents.xxxx), or the events files are empty.

To troubleshoot further, please run the inspection command (more details):

tensorboard --inspect --logdir {your_logdir}

and paste the output in this issue. Thanks!
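
Before running the inspector, the check described above (are there any events files, and are they empty?) can be sketched in a few lines of Python. This is only a rough sketch and not part of TensorBoard: the helper name `find_event_files` is hypothetical, and it assumes events files can be recognized by `tfevents` appearing in the file name, as in the format mentioned above.

```python
import os


def find_event_files(logdir):
    """Return a sorted list of (path, size_in_bytes) for every file under
    logdir whose name contains 'tfevents'. Hypothetical helper, stdlib only."""
    found = []
    for root, _dirs, files in os.walk(logdir):
        for name in files:
            if "tfevents" in name:
                path = os.path.join(root, name)
                found.append((path, os.path.getsize(path)))
    return sorted(found)


def describe(logdir):
    """Print what was found so empty or missing events files stand out."""
    files = find_event_files(logdir)
    if not files:
        print(f"No events files found under {logdir!r}")
    for path, size in files:
        note = " (empty)" if size == 0 else ""
        print(f"{path}: {size} bytes{note}")

# Example: describe("path/to/your_logdir")
```

If this prints nothing but "No events files found", the path passed to --logdir is probably not the directory the training run wrote to.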

mohsenhrt commented 1 year ago

Hi, I have this problem too. After running the command, this is the result:

tensorboard --inspect --logdir train_output/vits_fa_female-April-03-2023_03+13PM-0000000
2023-04-03 18:11:49.095361: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1938] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 5.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
======================================================================
Processing event files... (this can take a few minutes)
======================================================================

Found event files in:
train_output/vits_fa_female-April-03-2023_03+13PM-0000000

These tags are in train_output/vits_fa_female-April-03-2023_03+13PM-0000000:
audio
   TrainAudios/train/audio
histograms -
images
   TrainFigures/alignment
   TrainFigures/trainspectrogram/diff
   TrainFigures/trainspectrogram/fake
   TrainFigures/trainspectrogram/real
   TrainFigures/trainspeech_comparison
scalars
   TrainIterStats/current_lr_0
   TrainIterStats/current_lr_1
   TrainIterStats/grad_norm_0
   TrainIterStats/grad_norm_1
   TrainIterStats/loader_time
   TrainIterStats/loss_0
   TrainIterStats/loss_1
   TrainIterStats/loss_disc
   TrainIterStats/loss_disc_real_0
   TrainIterStats/loss_disc_real_1
   TrainIterStats/loss_disc_real_2
   TrainIterStats/loss_disc_real_3
   TrainIterStats/loss_disc_real_4
   TrainIterStats/loss_disc_real_5
   TrainIterStats/loss_duration
   TrainIterStats/loss_feat
   TrainIterStats/loss_gen
   TrainIterStats/loss_kl
   TrainIterStats/loss_mel
   TrainIterStats/step_time
tensor
   model-config/text_summary
   training-script/text_summary
======================================================================

Event statistics for train_output/vits_fa_female-April-03-2023_03+13PM-0000000:
audio
   first_step           550
   last_step            2200
   max_step             2200
   min_step             550
   num_steps            4
   outoforder_steps     []
graph -
histograms -
images
   first_step           550
   last_step            2200
   max_step             2200
   min_step             550
   num_steps            4
   outoforder_steps     []
scalars
   first_step           0
   last_step            2300
   max_step             2300
   min_step             0
   num_steps            24
   outoforder_steps     []
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor
   first_step           0
   last_step            0
   max_step             0
   min_step             0
   num_steps            1
   outoforder_steps     []
======================================================================

And below the Toggle button in TensorBoard, my previous log is active! (Not the current log that I need to see.)


yatbear commented 1 year ago

Hi @mohsenhrt,

Can you paste the command you used to start TensorBoard including the arg passed to --logdir? It looks like TensorBoard is pointing to file train_output/vits_fa_female-April-02-2023_05+44AM-0000000. Can you try passing the entire train_output directory to --logdir arg instead?

mohsenhrt commented 1 year ago

Thank you @yatbear. I did that (image), but I still get the same error! It seems TensorBoard cannot reset itself. The first time I used TensorBoard this way I had no problems and everything worked well, but after running it on another log file I get this error.

yatbear commented 1 year ago

Hi @mohsenhrt,

Can you share the full tensorboard command? I don't have a Windows machine to reproduce your exact error, but I do see that in this issue it's recommended to cd into the directory to avoid passing prefixes like C:\ to the --logdir arg: https://github.com/tensorflow/tensorboard/issues/456#issuecomment-326696696. Could you please try this?

mohsenhrt commented 1 year ago

If you look at the previous image again, it shows that the log directory is C:\Users\core i7 2\PycharmProjects\vits\train_output\vits_fa_female-April-02-2023_05+44AM-0000000. Why doesn't TensorBoard reset the log dir? My command:

C:\Users\core i7 2\PycharmProjects\vits\train_output>tensorboard --logdir="vits_fa_female-April-04-2023_06+40AM-0000000"
2023-04-04 07:26:45.933620: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1938] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 5.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.6.0 at http://localhost:6006/ (Press CTRL+C to quit)

yatbear commented 1 year ago

Hi @mohsenhrt,

Could you try killing the current tensorboard process and starting a new one?

mohsenhrt commented 1 year ago

I killed it multiple times with Ctrl+C in the terminal, but I still get the error. Is there another way?

yatbear commented 1 year ago

Hi @mohsenhrt,

Could you try finding the PID of TensorBoard process and kill it as described in https://stackoverflow.com/questions/36896164/tensorflow-how-to-close-tensorboard-server?
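
One possible explanation for the browser still showing the old log directory is a leftover TensorBoard process that is still serving on the port. A minimal sketch for checking this from Python, assuming the default address http://localhost:6006/ shown in the output above; the helper name `port_in_use` is hypothetical and uses only the standard library:

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it only tells you that *some* process is
    listening, not that the process is TensorBoard.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0

# Example: run this after pressing CTRL+C in the TensorBoard terminal.
# print(port_in_use(6006))
```

If this still reports the port as in use after you have stopped TensorBoard in your terminal, another instance is likely still running (for example one started from an IDE), and that is the process to find and kill. Starting the new instance on a different port with `--port` is another way to make sure you are looking at the fresh one.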

mohsenhrt commented 1 year ago

I searched for all the TensorBoard PIDs and killed them manually in Anaconda and my IDE, and TensorBoard now works well :). Thank you, my friend. If you ever need something that I can help with, I will be glad to do it.