LSYS / forestplot

A Python package to make publication-ready but customizable coefficient plots.
http://forestplot.rtfd.io
MIT License
112 stars 10 forks source link

Known Issue: Table headers don't work as expected with 6 (or fewer) rows of data #52

Closed LSYS closed 3 months ago

LSYS commented 1 year ago

See issue48-table-does-not-work-6rows-or-fewer.ipynb.

df = fp.load_data("sleep")

fp.forestplot(df.head(6),  # the dataframe with results data
              estimate="r",  # col containing estimated effect size 
              ll="ll", hl="hl",  # lower & higher limits of conf. int.
              varlabel="label",  # column containing the varlabels to be printed on far left
              capitalize="capitalize",  # Capitalize labels
              pval="p-val",  # column containing p-values to be formatted
              annote=["n", "power", "est_ci"],  # columns to report on left of plot
              annoteheaders=["N", "Power", "Est. (95% Conf. Int.)"],  # ^corresponding headers
              rightannote=["formatted_pval", "group"],  # columns to report on right of plot 
              right_annoteheaders=["P-value", "Variable group"],  # ^corresponding headers
              xlabel="Pearson correlation coefficient",  # x-label title
              table=True,  # Format as a table
              )

image

jeanbaptisteb commented 8 months ago

I've encountered the same exact issue today, so I've been investigating it a bit to find a temporary workaround, but without success so far.

However, here are some elements if it can help someone else investigating the problem too:

After generating a forestplot (filling the "annote" and "annoteheaders" parameters, and setting the "table" parameter to True), I stored it in a fp variable. From fp, I can access the content of the "invisible" column labels through this line of code:

fp.yaxis.get_majorticklabels()[-1]

So the column labels are not really deleted from the plot, they are just invisible for some reason.

I've ruled out things like incorrect font color (e.g. white on white) or incorrect font size (e.g. very small font sizes). I've also been trying to tweak parameters like zorder or set_visible, without success.

My hypotheses so far is that either the labels are hidden behind another matplotlib element that takes precedence over them, or they are plotted in a position outside the plot (even though fp.yaxis.get_majorticklabels()[-1].get_position() doesn't seem to indicate this, when we compare to a plot without this "invisible labels" problem).

Hopefully this might be helpful to fix the issue.

jeanbaptisteb commented 8 months ago

I just found a quick and (very dirty) workaround, not sure if it will work in all scenarios, but here it is:

Once you stored the forest plot if a fp variable, simply increase the upper limit of the y axis with the following line of code:

offset=0.3
fp.set_ylim((fp.get_ylim()[0], fp.get_ylim()[1]+offset))
plt.show()

You may have to try different values for offset, depending on the situation.

As for a longer-term fix, it looks like the issue is with the labels being cropped by a too short upper limit on the y axis, when the dataframe has too few rows. I have absolutely no idea where this problem occurs in the codebase, but I'll try to investigate it further if I have some time on my hands one of these days (not sure it will happen though :( ).

jeanbaptisteb commented 8 months ago

@LSYS Ah, I just saw the abandoned status on this bug. If in some hypothetical future I submit a pull request to fix the issue, will you accept it? (I don't want to spend too much time on it for nothing!)

jeanbaptisteb commented 8 months ago

Continuing some testing on my local machine, in https://github.com/LSYS/forestplot/blob/04b8858d62597bd1f1b5d4085dde875c890f7eb9/forestplot/plot.py#L537, setting negative_padding always to 0.5, that is replacing

if annoteheaders or right_annoteheaders:
     negative_padding = 1.0
else:
     negative_padding = 0.5

by negative_padding = 0.5

solves the issue in the example given in https://github.com/LSYS/forestplot/blob/main/docs/assets/issue52-table-does-not-work-6rows-or-fewer.ipynb

So that's surely negative_padding the culprit here. Fixing the issue would require to check possible side-effects of keeping negative_padding constant.

LSYS commented 8 months ago

@jeanbaptisteb give me some time to look at this. But absolutely, if you submit a PR to fix it with tests passing (plus other example figures all looking as expected), it should be accepted.

LSYS commented 3 months ago

@jeanbaptisteb i did a quick test and seems to work, do you want to make a PR for this?

jeanbaptisteb commented 3 months ago

@LSYS I'd prefer to make sure that the fix I suggested does not break something else in the process. Hopefully this week I should have some time on my hands to do a bit of extensive testing. I'll let you know that by the end of the week, but if you have no news from me by friday, it's likely that I won't have the time to look into it (in this case, feel free to add yourself my suggested fix).

LSYS commented 3 months ago

@jeanbaptisteb got it, thanks!