Overly large font size (or other elements) causing `ptable_heatmap` tiles to shrink

CompRhys commented 4 months ago

Recent changes appear to have broken the matplotlib heatmaps. @DanielYang59

DanielYang59 commented 4 months ago

Thanks for reporting, and sorry for the trouble. Looks like bad font size to me (the default font size is too large in scientific notation mode, you might want to manually reduce the font size a bit for now, and certainly it's better for me to improve the default font size in such situation), can you provide the code to reproduce that plot? @CompRhys

CompRhys commented 3 months ago

Sorry for the slow reply, if you check out this branch (https://github.com/janosh/matbench-discovery/pull/125) and run the eda_wbm.py file you will also see similar graphical issues. This is a different example to the one above but shows it does crop up commonly

DanielYang59 commented 3 months ago

No problem I would look into this ASAP

DanielYang59 commented 3 months ago

I'm able to reproduce similar issues with the following code extracted from the eda_wbm.py (should be bad font size/total number of digits):

import pandas as pd
import pymatviz as pmv
from matplotlib.colors import SymLogNorm
from pymatviz.enums import Key

from matbench_discovery import PDF_FIGS, ROOT
from matbench_discovery import plots as plots
from matbench_discovery.data import DataFiles, df_wbm

DATA_PAGE = f"{ROOT}/site/src/routes/data"

# Load MP training set
df_mp = pd.read_csv(DataFiles.mp_energies.path, na_filter=False, na_values=[])
df_mp = df_mp.rename(columns={"formula_pretty": Key.formula})
df_mp.loc[
    df_mp[Key.mat_id].isin(["mp-1080032", "mp-1179882", "mp-1009221"]), Key.formula
] = "NaN"
df_mp[df_mp[Key.formula].isna() | (df_mp[Key.formula] == "")]

wbm_occu_counts = pmv.count_elements(df_wbm[Key.formula], count_mode="occurrence")
wbm_comp_counts = pmv.count_elements(df_wbm[Key.formula], count_mode="composition")
mp_occu_counts = pmv.count_elements(df_mp[Key.formula], count_mode="occurrence")
mp_comp_counts = pmv.count_elements(df_mp[Key.formula], count_mode="composition")

all_counts = (
    ("wbm", "occurrence", wbm_occu_counts),
    ("wbm", "composition", wbm_comp_counts),
    ("mp", "occurrence", mp_occu_counts),
    ("mp", "composition", mp_comp_counts),
)

for dataset, count_mode, elem_counts in all_counts:
    filename = f"{dataset}-element-counts-by-{count_mode}"
    elem_counts.to_json(f"{DATA_PAGE}/{filename}.json")

    ax_mp_cnt = pmv.ptable_heatmap(
        elem_counts,
        value_fmt=lambda x, _: pmv.utils.si_fmt(x, ".0f"),
        cbar_label_fmt=lambda x, _: pmv.utils.si_fmt(x, ".0f"),
        cbar_title=f"{dataset.upper()} Element Count",
        log=(log := SymLogNorm(linthresh=100)),
        return_type="figure",
    )
    if log:
        filename += "-symlog" if isinstance(log, SymLogNorm) else "-log"
    pmv.save_fig(ax_mp_cnt, f"{PDF_FIGS}/{filename}.pdf")

DanielYang59 commented 2 months ago

Closing as I believe it is resolved by https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2333118569, free feel to comment if it's not the case.

janosh commented 2 months ago

@DanielYang59 thanks for the advice in https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2333118569! i haven't tried it yet but if those custom kwargs fix the issue, we should make those values the new defaults.

DanielYang59 commented 2 months ago

if those custom kwargs fix the issue, we should make those values the new defaults.

I would reply in that PR :)

DanielYang59 commented 2 months ago

I believe this could be closed https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2338235109

janosh commented 2 months ago

the issue of shrinking element tiles does still happen for large heat value numbers or if you pass a custom fmt string like .5f. whenever there are too many digits to fit in the element tile, the tile shrinks. ideally, it should keep its size, no matter the content.

DanielYang59 commented 2 months ago

whenever there are too many digits to fit in the element tile, the tile shrinks

Yep I noticed this issue too (not just heat value, but any overly large font).

The way tile size is handled is a bit complicated and confusing, looks like when the font is overly large, the tile shrinks (like taking the max(figure_size, font_size)).

I assume any element that is larger than the tile size would cause the tile to shrink, perhaps it's best for us to somehow normalize the fontsize (and more).

janosh / pymatviz

Overly large font size (or other elements) causing `ptable_heatmap` tiles to shrink #188