LSYS / forestplot

A Python package to make publication-ready but customizable coefficient plots.
http://forestplot.rtfd.io
MIT License

First rows of dataset not included, rows shifted, and left-hand annotations assigned to the wrong rows, whereas right-hand annotations are correct #81

Closed rmaarle closed 8 months ago

rmaarle commented 1 year ago

review_example.csv plot

import forestplot as fp
import pandas as pd

df = pd.read_csv("review_example.csv",sep=";")  # companion example data

fp.forestplot(df,  # the dataframe with results data
              estimate='PCSA_Men_mean',  # col containing estimated effect size 
              ll= 'PCSA_Men_Lower', hl='PCSA_Men_Upper',  # columns containing conf. int. lower and higher limits
              varlabel='Abbreviation',  # column containing variable label
              capitalize="capitalize",  # Capitalize labels
              annote=["Source", "Image modality", 'Sample_size',"Method", 'Position'],   # columns to report on left of plot
              annoteheaders=["Ref", "Modality", 'N',"PCSA", 'Pose'],  # ^corresponding headers
              rightannote=['Age', 'Height', 'Weight', 'Fiber_length', 'Pennation', "Info"],  # columns to report on right of plot 
              right_annoteheaders=['Age[y]', 'Height[cm]', 'Weight[kg]', 'Fiber_length[cm]', 'Pennation[Deg]', "Note"],  #corresponding headers

              groupvar= "Agegroup",  # column containing group labels
              group_order=["Reference","Young Adults","Adults"], 
              xlabel="PCSA Ratio",  # x-label title
              xticks=[0,30,60],  # x-ticks to be printed
              table=True,  # Format as a table
              color_alt_rows=True,  # Gray alternate rows
              # Additional kwargs for customizations
              **{"marker": "D",  # set marker symbol as diamond
                 "markersize": 35,  # adjust marker size
                 "xtick_size": 12,  # adjust x-ticker fontsize
                })
# plt.savefig("plot.jpg", bbox_inches="tight")  # needs: import matplotlib.pyplot as plt

rmaarle commented 1 year ago

The example code with the sleep dataset worked perfectly; however, when I used my own dataset, several problems arose. Does anyone have a solution for this?

LSYS commented 9 months ago

Hi @rmaarle, thanks for raising this. I wasn't aware that duplicated variable labels (the column passed as varlabel) would cause problems, but that is likely the source of the issue here. If you use a column whose labels are all unique instead, things should work as expected.
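
A quick way to check whether this applies to your data is to look for repeated values in the label column before plotting. The sketch below uses a small made-up DataFrame (only the column name Abbreviation comes from this issue; the values are invented for illustration) and relies on plain pandas calls:

```python
import pandas as pd

# Hypothetical stand-in for the real data: the values below are invented,
# only the column names mirror the ones used in this issue.
df = pd.DataFrame(
    {
        "Abbreviation": ["GMax", "GMed", "GMax", "GMin"],
        "PCSA_Men_mean": [30.0, 25.0, 32.0, 12.0],
    }
)

# keep=False flags every row whose label occurs more than once,
# not only the second and later occurrences.
dup_mask = df["Abbreviation"].duplicated(keep=False)
duplicated_labels = sorted(df.loc[dup_mask, "Abbreviation"].unique())

has_duplicates = bool(dup_mask.any())
print("Duplicated labels:", duplicated_labels)
```

If has_duplicates is True, the column is probably not safe to pass as varlabel as-is.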

Minimal example:

import forestplot as fp
import pandas as pd

df = pd.read_csv("review_example.csv",sep=";")  # companion example data
df = df.reset_index().astype({"index": str})  # expose the row index as a string column of unique labels

fp.forestplot(df,  # the dataframe with results data
              estimate='PCSA_Men_mean',  # col containing estimated effect size 
              ll= 'PCSA_Men_Lower', hl='PCSA_Men_Upper',  # columns containing conf. int. lower and higher limits
              varlabel="index",
)

Your case (the main change is varlabel='index'):

import forestplot as fp
import pandas as pd

df = pd.read_csv("review_example.csv",sep=";")  # companion example data
df = df.reset_index().astype({"index": str})  # expose the row index as a string column of unique labels

fp.forestplot(df,  # the dataframe with results data
              estimate='PCSA_Men_mean',  # col containing estimated effect size 
              ll= 'PCSA_Men_Lower', hl='PCSA_Men_Upper',  # columns containing conf. int. lower and higher limits
              varlabel='index',  # column containing variable label
              capitalize="capitalize",  # Capitalize labels
              annote=["Source", "Image modality", 'Sample_size',"Method", 'Position'],   # columns to report on left of plot
              annoteheaders=["Ref", "Modality", 'N',"PCSA", 'Pose'],  # ^corresponding headers
              rightannote=['Age', 'Height', 'Weight', 'Fiber_length', 'Pennation',],  # columns to report on right of plot 
              right_annoteheaders=['Age[y]', 'Height[cm]', 'Weight[kg]', 'Fiber_length[cm]', 'Pennation[Deg]'],  #corresponding headers

              groupvar= "Agegroup",  # column containing group labels
              group_order=["Reference","Young Adults","Adults"], 
              xlabel="PCSA Ratio",  # x-label title
              xticks=[0,30,60],  # x-ticks to be printed
              table=True,  # Format as a table
              color_alt_rows=True,  # Gray alternate rows
              # Additional kwargs for customizations
              **{"marker": "D",  # set marker symbol as diamond
                 "markersize": 35,  # adjust marker size
                 "xtick_size": 12,  # adjust x-ticker fontsize
                }
)

image
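
If you would rather keep the original abbreviations visible instead of a bare row index, one possible alternative is to append a running counter to repeated labels so that each one becomes unique. This is only a sketch under assumptions, not something the forestplot documentation prescribes: the column name label_unique and the sample values are made up for illustration.

```python
import pandas as pd

# Invented sample labels; "Abbreviation" is the column name from this issue.
df = pd.DataFrame({"Abbreviation": ["GMax", "GMed", "GMax", "GMin", "GMax"]})

# Number each occurrence within its label group: 0, 1, 2, ...
occurrence = df.groupby("Abbreviation").cumcount()

# Flag labels that appear more than once anywhere in the column.
is_dup = df["Abbreviation"].duplicated(keep=False)

# Keep unique labels untouched; give repeated ones a " (n)" suffix.
df["label_unique"] = df["Abbreviation"].where(
    ~is_dup,
    df["Abbreviation"] + " (" + (occurrence + 1).astype(str) + ")",
)

print(df["label_unique"].tolist())
```

The new column could then be passed as varlabel="label_unique" in place of varlabel='index'.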

LSYS commented 9 months ago

You may find the upcoming release (WIP) with grouped labels useful for your use case. The duplicated variable labels you were using were really groups. See #59 for an example.

LSYS commented 9 months ago

In the next release, the readme will also include a warning about duplicated labels.