Closed by FADEAWAY114 1 year ago
Hi @FADEAWAY114,
It looks like the log directory (passed to --logdir) doesn't contain any events files (named like events.out.tfevents.xxxx), or the events files are empty.
To troubleshoot further, please run the inspection command:
tensorboard --inspect --logdir {your_logdir}
and paste the output in this issue. Thanks!
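Before running the inspection command, the two conditions above (no events files, or only empty ones) can be checked by hand. The following is a minimal sketch using only the Python standard library; the helper names `find_event_files` and `summarize` are hypothetical, and matching on "tfevents" in the file name is an assumption based on the naming pattern mentioned above, not TensorBoard's own lookup logic.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return a sorted list of (path, size_in_bytes) for event files under logdir."""
    root = Path(logdir)
    # Assumption: event files carry "tfevents" in their file name,
    # e.g. events.out.tfevents.xxxx
    return sorted(
        (str(p), p.stat().st_size)
        for p in root.rglob("*tfevents*")
        if p.is_file()
    )


def summarize(logdir):
    """Describe which of the two suspected conditions, if any, applies."""
    files = find_event_files(logdir)
    if not files:
        return "no event files found"
    if all(size == 0 for _, size in files):
        return "event files found, but all are empty"
    return f"{len(files)} event file(s) found"
```

Calling `summarize("{your_logdir}")` with the same path passed to --logdir gives a quick first answer; the inspection command remains the more reliable check.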
Hi, I have this problem too. After running the command, this is the result:
tensorboard --inspect --logdir train_output/vits_fa_female-April-03-2023_03+13PM-0000000
2023-04-03 18:11:49.095361: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1938] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 5.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
======================================================================
Processing event files... (this can take a few minutes)
======================================================================
Found event files in:
train_output/vits_fa_female-April-03-2023_03+13PM-0000000
These tags are in train_output/vits_fa_female-April-03-2023_03+13PM-0000000:
audio
TrainAudios/train/audio
histograms -
images
TrainFigures/alignment
TrainFigures/trainspectrogram/diff
TrainFigures/trainspectrogram/fake
TrainFigures/trainspectrogram/real
TrainFigures/trainspeech_comparison
scalars
TrainIterStats/current_lr_0
TrainIterStats/current_lr_1
TrainIterStats/grad_norm_0
TrainIterStats/grad_norm_1
TrainIterStats/loader_time
TrainIterStats/loss_0
TrainIterStats/loss_1
TrainIterStats/loss_disc
TrainIterStats/loss_disc_real_0
TrainIterStats/loss_disc_real_1
TrainIterStats/loss_disc_real_2
TrainIterStats/loss_disc_real_3
TrainIterStats/loss_disc_real_4
TrainIterStats/loss_disc_real_5
TrainIterStats/loss_duration
TrainIterStats/loss_feat
TrainIterStats/loss_gen
TrainIterStats/loss_kl
TrainIterStats/loss_mel
TrainIterStats/step_time
tensor
model-config/text_summary
training-script/text_summary
======================================================================
Event statistics for train_output/vits_fa_female-April-03-2023_03+13PM-0000000:
audio
first_step 550
last_step 2200
max_step 2200
min_step 550
num_steps 4
outoforder_steps []
graph -
histograms -
images
first_step 550
last_step 2200
max_step 2200
min_step 550
num_steps 4
outoforder_steps []
scalars
first_step 0
last_step 2300
max_step 2300
min_step 0
num_steps 24
outoforder_steps []
sessionlog:checkpoint -
sessionlog:start -
sessionlog:stop -
tensor
first_step 0
last_step 0
max_step 0
min_step 0
num_steps 1
outoforder_steps []
======================================================================
And below the toggle button in TensorBoard, my previous log is active (not the current log that I need to see)!
Hi @mohsenhrt,
Can you paste the command you used to start TensorBoard, including the arg passed to --logdir? It looks like TensorBoard is pointing to train_output/vits_fa_female-April-02-2023_05+44AM-0000000. Can you try passing the entire train_output directory to the --logdir arg instead?
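When the parent directory is passed to --logdir, each subdirectory that contains events files should appear as its own run. A rough way to preview which runs would be picked up is sketched below; the helper name `list_runs` is hypothetical, and this only approximates the behavior with a directory walk rather than reproducing TensorBoard's own discovery code.

```python
import os


def list_runs(parent_logdir):
    """Return subdirectories (relative to parent_logdir) that hold event files."""
    runs = []
    for dirpath, _dirnames, filenames in os.walk(parent_logdir):
        # Assumption: event files carry "tfevents" in their file name
        if any("tfevents" in name for name in filenames):
            runs.append(os.path.relpath(dirpath, parent_logdir))
    return sorted(runs)
```

For example, `list_runs("train_output")` should list one entry per training run directory if each of them contains events files.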
Thank you @yatbear, I did that, but I still get the same error! It seems TensorBoard cannot reset itself. The first time I used TensorBoard the same way, I had no problems and everything worked well, but after running it on another log file I get this error.
Hi @mohsenhrt,
Can you share the full tensorboard command? I don't have a Windows machine to reproduce your exact error, but I see that this issue recommends running cd into the directory to avoid passing prefixes like C:\ to the --logdir arg: https://github.com/tensorflow/tensorboard/issues/456#issuecomment-326696696. Could you please try this?
If you look at the previous image again, it shows that the log directory is
Log directory: C:\Users\core i7 2\PycharmProjects\vits\train_output\vits_fa_female-April-02-2023_05+44AM-0000000
Why doesn't TensorBoard reset the log dir?
My command:
C:\Users\core i7 2\PycharmProjects\vits\train_output>tensorboard --logdir="vits_fa_female-April-04-2023_06+40AM-0000000"
2023-04-04 07:26:45.933620: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1938] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 5.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.6.0 at http://localhost:6006/ (Press CTRL+C to quit)
Hi @mohsenhrt,
Could you try killing the current TensorBoard process and starting a new one?
I killed it multiple times with Ctrl+C in the terminal, but I still get the error. Is there another way?
Hi @mohsenhrt,
Could you try finding the PID of the TensorBoard process and killing it, as described in https://stackoverflow.com/questions/36896164/tensorflow-how-to-close-tensorboard-server?
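To confirm whether an earlier TensorBoard server is still running before hunting for its PID, one option is to check if something is still listening on the port from the startup message (6006 in this thread). This is only a sketch: the helper name `port_in_use` is hypothetical, and it assumes the server is reachable on the IPv4 loopback address.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(6006)` returns True after you have stopped TensorBoard in the terminal, an old process is probably still serving the previous log directory. Starting the new server on a different port with the --port flag is another way to sidestep the stale one.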
I searched for all TensorBoard PIDs and killed them manually in Anaconda and my IDE, and TensorBoard now works well :). Thank you, my friend. I hope that if you ever need something I can help with, I can return the favor.
When I run tensorboard, I get the following error.
`^C(parking2) vmware@vmware-virtual-machine:~/epymarl/results/tb_logs/qmix_seed694755480_rware:rware-tiny-2ag-v1_2023-03-25 12:10:14.135647$ tensorboard --logdir="/home/vmware/epymarl/results/tb_logs/qmix_seed694iny-2ag-v1_2023-03-25 12:10:14.135647"
2023-03-26 12:17:20.887915: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.11.2 at http://localhost:6006/ (Press CTRL+C to quit) `