tompollard / tableone

Create "Table 1" for research papers in Python
https://pypi.python.org/pypi/tableone/
MIT License

Sorting for pandas category dtype #83

Closed: gothmania closed this issue 4 years ago

gothmania commented 5 years ago

I'm creating a Table 1 for some ordered pandas categorical columns (pandas.Categorical). Could you provide an option to sort the results in the order of the levels rather than in alphabetical order? Ideally, this would apply to both the result rows and the (groupby) columns.

For example: levels disagree < neutral < agree

At the moment:

                     overall
variable  level
some_var  agree      XX (XX)
          disagree   XX (XX)
          neutral    XX (XX)

It should be:

                     overall
variable  level
some_var  disagree   XX (XX)
          neutral    XX (XX)
          agree      XX (XX)

Or when grouped:

                   disagree    neutral    agree
variable level
some_var           XX (XX)     XX (XX)    XX (XX)

It's just a matter of appearance, but I think the table would be closer to publication-ready. At the moment I have to rearrange the rows and columns manually.
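For reference, here is a minimal sketch of the kind of column described above. The data and the column name some_var are hypothetical, and the snippet uses only pandas. It illustrates the difference between alphabetical order and the declared level order; it does not show how tableone sorts internally.

```python
import pandas as pd

# Hypothetical survey responses; names and values are illustrative only.
levels = ["disagree", "neutral", "agree"]
df = pd.DataFrame({"some_var": ["agree", "disagree", "neutral", "agree", "neutral"]})

# Plain strings sort alphabetically.
alphabetical = sorted(df["some_var"].unique())

# An ordered categorical carries its own level order, which sort_values respects.
df["some_var"] = pd.Categorical(df["some_var"], categories=levels, ordered=True)
by_level = list(dict.fromkeys(df["some_var"].sort_values()))

print(alphabetical)  # ['agree', 'disagree', 'neutral']
print(by_level)      # ['disagree', 'neutral', 'agree']
```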

Thanks.

tompollard commented 5 years ago

Thanks, we'll look into adding this sorting functionality when we have the opportunity.

tompollard commented 4 years ago

It is now (from v0.6.6) possible to specify custom ordering for categorical variables using the order argument. Example below:

# import libraries
from tableone import TableOne
import pandas as pd

# load sample data into a pandas dataframe
url = "https://raw.githubusercontent.com/tompollard/tableone/master/data/pn2012_demo.csv"
data = pd.read_csv(url)

# columns to summarize
columns = ['SysABP', 'ICU', 'Height', 'Weight']

# columns containing categorical variables
categorical = ['ICU']

# custom display order for the levels of each categorical variable
order = {'ICU': ['SICU', 'MICU', 'CSRU', 'CCU']}

# create tableone with the input arguments
mytable = TableOne(data, columns=columns, categorical=categorical,
                   label_suffix=True, groupby='death', order=order)

# print the table in GitHub-flavored markdown
print(mytable.tabulate(tablefmt="github"))
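
|                   |      | Missing | 0            | 1            |
|-------------------|------|---------|--------------|--------------|
| n                 |      |         | 864          | 136          |
| SysABP, mean (SD) |      | 291     | 115.4 (38.3) | 107.6 (49.4) |
| ICU, n (%)        | SICU | 0       | 215 (24.9)   | 41 (30.1)    |
|                   | MICU |         | 318 (36.8)   | 62 (45.6)    |
|                   | CSRU |         | 194 (22.5)   | 8 (5.9)      |
|                   | CCU  |         | 137 (15.9)   | 25 (18.4)    |
| Height, mean (SD) |      | 475     | 170.3 (23.2) | 168.5 (11.3) |
| Weight, mean (SD) |      | 302     | 83.0 (23.6)  | 82.3 (25.4)  |
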
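
For columns that are already ordered pandas categoricals, as in the original request, the order dictionary could be derived from the declared levels instead of being typed by hand. The following is a sketch under stated assumptions: the helper order_from_categoricals and the sample data are hypothetical and not part of tableone, and it assumes the level labels are strings that match the values tableone displays.

```python
import pandas as pd


def order_from_categoricals(df):
    """Return {column: [levels in declared order]} for each categorical column.

    Hypothetical helper, not part of tableone.
    """
    return {
        col: list(df[col].cat.categories)
        for col in df.select_dtypes(include="category").columns
    }


# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame({
    "some_var": pd.Categorical(
        ["agree", "disagree", "neutral"],
        categories=["disagree", "neutral", "agree"],
        ordered=True,
    ),
    "age": [34, 51, 47],
})

order = order_from_categoricals(df)
print(order)  # {'some_var': ['disagree', 'neutral', 'agree']}
```

The resulting dictionary has the same shape as the order argument in the example above, so it could be passed as TableOne(..., order=order).
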
gothmania commented 4 years ago

This is extremely helpful to me. Thank you so much, Tom.

tompollard commented 4 years ago

thanks @gothmania, glad it helps