janosh / pymatviz

A toolkit for visualizations in materials informatics.
https://janosh.github.io/pymatviz
MIT License
156 stars 13 forks source link

Bad font size breaks `ptable_heatmap` #188

Open CompRhys opened 1 month ago

CompRhys commented 1 month ago

image

Recent changes appear to have broken the matplotlib heatmaps. @DanielYang59

DanielYang59 commented 1 month ago

Thanks for reporting, and sorry for the trouble. Looks like bad font size to me (the default font size is too large in scientific notation mode, you might want to manually reduce the font size a bit for now, and certainly it's better for me to improve the default font size in such situation), can you provide the code to reproduce that plot? @CompRhys

CompRhys commented 3 weeks ago

Sorry for the slow reply, if you check out this branch (https://github.com/janosh/matbench-discovery/pull/125) and run the eda_wbm.py file you will also see similar graphical issues. This is a different example to the one above but shows it does crop up commonly

DanielYang59 commented 3 weeks ago

No problem I would look into this ASAP

DanielYang59 commented 3 weeks ago

I'm able to reproduce similar issues with the following code extracted from the eda_wbm.py (should be bad font size/total number of digits):

import pandas as pd
import pymatviz as pmv
from matplotlib.colors import SymLogNorm
from pymatviz.enums import Key

from matbench_discovery import PDF_FIGS, ROOT
from matbench_discovery import plots as plots
from matbench_discovery.data import DataFiles, df_wbm

DATA_PAGE = f"{ROOT}/site/src/routes/data"

# Load MP training set
df_mp = pd.read_csv(DataFiles.mp_energies.path, na_filter=False, na_values=[])
df_mp = df_mp.rename(columns={"formula_pretty": Key.formula})
df_mp.loc[
    df_mp[Key.mat_id].isin(["mp-1080032", "mp-1179882", "mp-1009221"]), Key.formula
] = "NaN"
df_mp[df_mp[Key.formula].isna() | (df_mp[Key.formula] == "")]

wbm_occu_counts = pmv.count_elements(df_wbm[Key.formula], count_mode="occurrence")
wbm_comp_counts = pmv.count_elements(df_wbm[Key.formula], count_mode="composition")
mp_occu_counts = pmv.count_elements(df_mp[Key.formula], count_mode="occurrence")
mp_comp_counts = pmv.count_elements(df_mp[Key.formula], count_mode="composition")

all_counts = (
    ("wbm", "occurrence", wbm_occu_counts),
    ("wbm", "composition", wbm_comp_counts),
    ("mp", "occurrence", mp_occu_counts),
    ("mp", "composition", mp_comp_counts),
)

for dataset, count_mode, elem_counts in all_counts:
    filename = f"{dataset}-element-counts-by-{count_mode}"
    elem_counts.to_json(f"{DATA_PAGE}/{filename}.json")

    ax_mp_cnt = pmv.ptable_heatmap(
        elem_counts,
        value_fmt=lambda x, _: pmv.utils.si_fmt(x, ".0f"),
        cbar_label_fmt=lambda x, _: pmv.utils.si_fmt(x, ".0f"),
        cbar_title=f"{dataset.upper()} Element Count",
        log=(log := SymLogNorm(linthresh=100)),
        return_type="figure",
    )
    if log:
        filename += "-symlog" if isinstance(log, SymLogNorm) else "-log"
    pmv.save_fig(ax_mp_cnt, f"{PDF_FIGS}/{filename}.pdf")
DanielYang59 commented 1 week ago

Closing as I believe it is resolved by https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2333118569, free feel to comment if it's not the case.

janosh commented 1 week ago

@DanielYang59 thanks for the advice in https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2333118569! i haven't tried it yet but if those custom kwargs fix the issue, we should make those values the new defaults.

DanielYang59 commented 1 week ago

if those custom kwargs fix the issue, we should make those values the new defaults.

I would reply in that PR :)

DanielYang59 commented 3 days ago

I believe this could be closed https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2338235109

janosh commented 3 days ago

the issue of shrinking element tiles does still happen for large heat value numbers or if you pass a custom fmt string like .5f. whenever there are too many digits to fit in the element tile, the tile shrinks. ideally, it should keep its size, no matter the content.

DanielYang59 commented 3 days ago

whenever there are too many digits to fit in the element tile, the tile shrinks

Yep I noticed this issue too (not just heat value, but any overly large font).

The way tile size is handled is a bit complicated and confusing, looks like when the font is overly large, the tile shrinks (like taking the max(figure_size, font_size)).

I assume any element that is larger than the tile size would cause the tile to shrink, perhaps it's best for us to somehow normalize the fontsize (and more).