tensorflow / tensorboard

TensorFlow's Visualization Toolkit
Apache License 2.0

Help interpreting distributions and histograms generated in tensorboard #6411

Closed pyjm closed 1 year ago

pyjm commented 1 year ago

Hello tensorflow team 😄

I am struggling to interpret the histogram and distribution graphs generated in TensorBoard (please see the attached file: Image_tensorflow.pptx).

I am working with microbiome data (16S rRNA gene sequencing) and performing a multinomial regression analysis to evaluate the effect of several variables on the microbiome composition of my tissue.

The output of this analysis is a file with the logarithm of the fold change in abundance of each taxon between two conditions.

I am using tensorboard to evaluate three models (please see figures attached):

1) A null model (blue in the figures), representing random chance. This model was built without any metadata.
2) A model with only one independent variable (red in the figures).
3) A complete model with several variables likely to affect the microbiome (orange in the figures).

It's my understanding that the Accuracy graph shows how well each model predicts the observed read counts in the samples. Please note that the full model performs better than the other two. I also understand the Loss graph, which represents how well each model fits the data. The y-axis is the negative log probability of the data under the model, so a lower value indicates a better fit. The full model (orange line) is closest to 0, indicating the best fit of the three.
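As a rough illustration of the loss described above, here is a minimal sketch of a multinomial negative log-likelihood. The read counts and both sets of predicted proportions are made-up numbers, and `multinomial_nll` is a hypothetical helper, not the loss function of any particular tool; it only shows why a model whose predicted proportions match the observed counts more closely gets a lower value.

```python
import math


def multinomial_nll(counts, probs):
    """Negative log-probability of observed counts under a multinomial
    distribution with the given category probabilities."""
    total = sum(counts)
    # log of the multinomial coefficient: total! / (c_1! * c_2! * ...)
    log_coef = math.lgamma(total + 1) - sum(math.lgamma(c + 1) for c in counts)
    # log of the product p_i ** c_i (skip zero counts to avoid log(0) * 0)
    log_terms = sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)
    return -(log_coef + log_terms)


# Hypothetical read counts for three taxa in one sample.
observed = [50, 30, 20]

# Hypothetical predictions: a "null" model that treats all taxa as equally
# abundant, and a "full" model whose proportions match the sample.
null_probs = [1 / 3, 1 / 3, 1 / 3]
full_probs = [0.5, 0.3, 0.2]

nll_null = multinomial_nll(observed, null_probs)
nll_full = multinomial_nll(observed, full_probs)

print(f"null model loss: {nll_null:.3f}")
print(f"full model loss: {nll_full:.3f}")
```

Both values are positive, and the model with the better-matching proportions gives the smaller one, which is the sense in which a curve closer to 0 indicates a better fit.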

However, I am having a hard time interpreting the Distribution and Histogram images.

I would really appreciate hearing your thoughts on how to interpret them.

Thank you very much!!

arcra commented 1 year ago

Hello @pyjm,

TensorBoard does not log/track any particular metric by itself, but rather, your training script specifies what data is logged, which can then be visualized on TensorBoard. You would need to take a look at what and how you're logging the data for those charts, in your training script.

For guidance on how the distribution and histogram charts are used, you can take a look at the descriptions for the charts, available in our README file and main GitHub page, and the histogram data type from the summary API:

https://github.com/tensorflow/tensorboard#histogram-dashboard
https://github.com/tensorflow/tensorboard#distribution-dashboard
https://www.tensorflow.org/api_docs/python/tf/summary/histogram
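To make the two chart types a bit more concrete, here is a minimal stdlib-only sketch of the kind of per-step summary each dashboard displays. It is an approximation under stated assumptions: the "tensor" is simulated random data, `bucket_counts` and `percentile` are hypothetical helpers, and the percentile levels are the ones listed in the README's distribution dashboard description as best I recall. TensorBoard itself derives its lines from the bucketed histogram data that was logged, so its exact numbers would differ.

```python
import random

# Percentile levels of the distribution dashboard lines (minimum up to
# maximum), as recalled from the README; treat these as an assumption.
PERCENTILES = [0, 7, 16, 31, 50, 69, 84, 93, 100]


def bucket_counts(values, lo, hi, n_buckets):
    """Count how many values fall into each of n_buckets equal-width bins."""
    width = (hi - lo) / n_buckets
    counts = [0] * n_buckets
    for v in values:
        # Clamp so that v == hi lands in the last bucket.
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return counts


def percentile(sorted_values, pct):
    """Percentile by linear interpolation between neighbouring values."""
    pos = (len(sorted_values) - 1) * pct / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


random.seed(0)
summaries = {}
for step in range(3):
    # Hypothetical logged tensor (e.g. model coefficients) whose spread
    # widens as training proceeds.
    values = [random.gauss(0.0, 1.0 + step) for _ in range(1000)]
    ordered = sorted(values)
    summaries[step] = {
        # Histogram dashboard: one slice of bucket counts per step.
        "histogram": bucket_counts(values, ordered[0], ordered[-1], 10),
        # Distribution dashboard: one point per percentile line per step.
        "distribution": [percentile(ordered, p) for p in PERCENTILES],
    }
    print(step, summaries[step]["histogram"])
```

In this reading, each histogram slice shows how many values of the logged tensor fall in each range at one step, while the distribution chart tracks how the percentiles of those same values move across steps. Which tensor is being summarized depends entirely on what the training script passes to the histogram summary.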

I hope those resources help you understand. Feel free to ask further questions if those concepts are not clear, but I'm afraid that beyond that, we don't really have the context to help interpret your data.