from tensorflow_metadata.proto.v0 import statistics_pb2

from model_card_toolkit import model_card
from model_card_toolkit.utils import tf_graphics

lorem = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed efficitur, enim sit amet ultrices malesuada, lorem augue rhoncus quam, sit amet ullamcorper dolor ligula quis est. Sed tempor blandit pharetra. Aenean facilisis eu lacus non molestie. Sed enim turpis, semper vel gravida sed, egestas at lacus. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Aliquam at libero posuere, dapibus tellus at, aliquet ipsum. Fusce quis ante nec neque interdum mollis mattis vitae ante. Curabitur aliquet enim enim, ac porttitor nibh lobortis nec. Nam id gravida ex. Donec mi magna, fermentum ac pulvinar vitae, cursus vel odio."

# Build a string feature whose rank histogram has 50 buckets with very long labels.
feature = statistics_pb2.FeatureNameStatistics()
# append(), not extend(): extend() on a str would add one path step per character.
feature.path.step.append("string_feature")
feature.type = statistics_pb2.FeatureNameStatistics.STRING
for i in range(50):
    bucket = feature.string_stats.rank_histogram.buckets.add()
    bucket.label = f"{lorem} {i}"
    bucket.sample_count = 1000 + i * 100

# Wrap the feature in a dataset and a dataset list.
feature_stats = statistics_pb2.DatasetFeatureStatistics()
feature_stats.features.add().CopyFrom(feature)
datasets = statistics_pb2.DatasetFeatureStatisticsList()
datasets.datasets.add().CopyFrom(feature_stats)

# Generate the histogram plots and render the model card.
mc = model_card.ModelCard()
tf_graphics.annotate_dataset_feature_statistics_plots(mc, [datasets])
mc.render(
    template_path="model_card_toolkit/template/html/default_template.html.jinja",
    output_path="sample/model_card.html",
)
What happened?
When I render a model card containing a histogram with very long labels and many buckets, the plot becomes hard to read.
For example, rendering string statistics with 50 numbered lorem ipsum buckets produces a model card like the one shown below.
The labels overlap and are difficult to read.
What is the expected behavior?
Clearer plots.
I think it would be beneficial for model-card-toolkit to limit both the number of words per label and the number of items shown when generating histogram plots.
https://github.com/tensorflow/model-card-toolkit/blob/74d7e6d8d3163b830711b226491ccd976a2d7018/model_card_toolkit/utils/graphics.py#L52-L91
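To make the suggestion concrete, here is a rough sketch of the kind of limiting I have in mind, written in plain Python so it does not depend on the toolkit. `shorten_label` and `limit_buckets` are hypothetical helpers, not existing model-card-toolkit functions, and the defaults (5 words, 40 characters, 10 items) are arbitrary assumptions.

```python
from typing import List, Tuple


def shorten_label(label: str, max_words: int = 5, max_chars: int = 40) -> str:
    """Keep at most max_words words and max_chars characters; append '...' if cut."""
    words = label.split()
    short = " ".join(words[:max_words])
    truncated = len(words) > max_words
    if len(short) > max_chars:
        short = short[:max_chars].rstrip()
        truncated = True
    return short + "..." if truncated else short


def limit_buckets(
    buckets: List[Tuple[str, float]], max_items: int = 10
) -> List[Tuple[str, float]]:
    """Keep the max_items largest buckets and fold the rest into one 'Other' bucket."""
    ranked = sorted(buckets, key=lambda b: b[1], reverse=True)
    kept = [(shorten_label(label), count) for label, count in ranked[:max_items]]
    rest = sum(count for _, count in ranked[max_items:])
    if rest:
        kept.append((f"Other ({len(ranked) - max_items} items)", rest))
    return kept


# Hypothetical usage with (label, sample_count) pairs taken from a rank histogram.
example = [(f"bucket {i}", 1000 + i * 100) for i in range(50)]
limited = limit_buckets(example, max_items=10)
```

One caveat: cutting from the end drops the trailing number in the labels from my repro, so all shortened labels would look identical. Keeping a few characters from both ends of the label may be a better choice.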
How can we reproduce the problem?
Run the code above to reproduce the rendered model card.
Model Card Toolkit Version
2.0.0
Python Version
3.8.10
Platforms
docker
Relevant log output
No response