Open CompRhys opened 4 months ago
Thanks for reporting, and sorry for the trouble. Looks like bad font size to me (the default font size is too large in scientific notation mode, you might want to manually reduce the font size a bit for now, and certainly it's better for me to improve the default font size in such situation), can you provide the code to reproduce that plot? @CompRhys
Sorry for the slow reply, if you check out this branch (https://github.com/janosh/matbench-discovery/pull/125) and run the eda_wbm.py
file you will also see similar graphical issues. This is a different example to the one above but shows it does crop up commonly
No problem I would look into this ASAP
I'm able to reproduce similar issues with the following code extracted from the eda_wbm.py
(should be bad font size/total number of digits):
import pandas as pd
import pymatviz as pmv
from matplotlib.colors import SymLogNorm
from pymatviz.enums import Key
from matbench_discovery import PDF_FIGS, ROOT
from matbench_discovery import plots as plots
from matbench_discovery.data import DataFiles, df_wbm
DATA_PAGE = f"{ROOT}/site/src/routes/data"
# Load MP training set
df_mp = pd.read_csv(DataFiles.mp_energies.path, na_filter=False, na_values=[])
df_mp = df_mp.rename(columns={"formula_pretty": Key.formula})
df_mp.loc[
df_mp[Key.mat_id].isin(["mp-1080032", "mp-1179882", "mp-1009221"]), Key.formula
] = "NaN"
df_mp[df_mp[Key.formula].isna() | (df_mp[Key.formula] == "")]
wbm_occu_counts = pmv.count_elements(df_wbm[Key.formula], count_mode="occurrence")
wbm_comp_counts = pmv.count_elements(df_wbm[Key.formula], count_mode="composition")
mp_occu_counts = pmv.count_elements(df_mp[Key.formula], count_mode="occurrence")
mp_comp_counts = pmv.count_elements(df_mp[Key.formula], count_mode="composition")
all_counts = (
("wbm", "occurrence", wbm_occu_counts),
("wbm", "composition", wbm_comp_counts),
("mp", "occurrence", mp_occu_counts),
("mp", "composition", mp_comp_counts),
)
for dataset, count_mode, elem_counts in all_counts:
filename = f"{dataset}-element-counts-by-{count_mode}"
elem_counts.to_json(f"{DATA_PAGE}/{filename}.json")
ax_mp_cnt = pmv.ptable_heatmap(
elem_counts,
value_fmt=lambda x, _: pmv.utils.si_fmt(x, ".0f"),
cbar_label_fmt=lambda x, _: pmv.utils.si_fmt(x, ".0f"),
cbar_title=f"{dataset.upper()} Element Count",
log=(log := SymLogNorm(linthresh=100)),
return_type="figure",
)
if log:
filename += "-symlog" if isinstance(log, SymLogNorm) else "-log"
pmv.save_fig(ax_mp_cnt, f"{PDF_FIGS}/{filename}.pdf")
Closing as I believe it is resolved by https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2333118569, free feel to comment if it's not the case.
@DanielYang59 thanks for the advice in https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2333118569! i haven't tried it yet but if those custom kwargs fix the issue, we should make those values the new defaults.
if those custom kwargs fix the issue, we should make those values the new defaults.
I would reply in that PR :)
I believe this could be closed https://github.com/janosh/matbench-discovery/pull/125#issuecomment-2338235109
the issue of shrinking element tiles does still happen for large heat value numbers or if you pass a custom fmt
string like .5f
. whenever there are too many digits to fit in the element tile, the tile shrinks. ideally, it should keep its size, no matter the content.
whenever there are too many digits to fit in the element tile, the tile shrinks
Yep I noticed this issue too (not just heat value, but any overly large font).
The way tile size is handled is a bit complicated and confusing, looks like when the font is overly large, the tile shrinks (like taking the max(figure_size, font_size)
).
I assume any element that is larger than the tile size would cause the tile to shrink, perhaps it's best for us to somehow normalize the fontsize (and more).
Recent changes appear to have broken the matplotlib heatmaps. @DanielYang59