Option to display data for binary variables in only a single row

christian-geier commented 6 years ago

For binary/boolean variables such as smoker (Y/N) or ascites (Y/N), I currently enter them as categorical variables. This results in two rows per variable:

Variable	Group1 n=10	Group2 n=10
Smoker	0	5 (25%)
	1	15 (75%)
Ascites	0	17 (85%)
	1	3 (15%)

etc...

The desired output would only include the positive features:

Variable	Group1 n=10	Group2 n=10
Smoker	1	15 (75%)
Ascites	1	3 (15%)

The reason: when a variable has only two mutually exclusive categories, the second row is implied by the first, so a single row is just as easy (and possibly easier) for the reader to understand.

tompollard commented 6 years ago

Also requested by @theonesp - we'll try to implement soon!

tompollard commented 4 years ago

Sorry for the delay in getting to this. The limit and order arguments can now (from v0.6.6) be combined to display binary data in a single row:

Specify a limit of 1 in the limit argument (e.g. limit = {"Ascites": 1}), so only a single value is displayed for the categorical variable.
Ensure the desired value is displayed by specifying an order. As we are only showing the first value, there's no need to list more than one item for the order (e.g. order = {"Ascites": ["1"]}).

Example below:

# import libraries
from tableone import TableOne
import pandas as pd

# load sample data into a pandas dataframe
url = "https://raw.githubusercontent.com/tompollard/tableone/master/data/pn2012_demo.csv"
data = pd.read_csv(url)

# columns to summarize
columns = ['Age', 'SysABP', 'death']

# columns containing categorical variables
categorical = ['death']

# non-normal variables
nonnormal = ['Age']

# limit the binary variable "death" to a single row
limit = {"death": 1}

# set the order of the categorical variables
order = {"death": ["1"]}

# alternative labels
labels = {'death': 'Mortality'}

# set decimal places for age to 0
decimals = {"Age": 0}

# create tableone with the input arguments
mytable = TableOne(data, columns=columns, categorical=categorical,
                   nonnormal=nonnormal, rename=labels, label_suffix=True,
                   decimals=decimals, limit=limit, order=order)

# print the table in GitHub-flavoured markdown
print(mytable.tabulate(tablefmt="github"))

		Missing	Overall
n			1000
Age, median [Q1,Q3]		0	68 [53,79]
SysABP, mean (SD)		291	114.3 (40.2)
Mortality, n (%)	1	0	136 (13.6)
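
For anyone wondering why the two arguments are needed together, here is a rough plain-Python sketch of the idea: order decides which level comes first, and limit then cuts the list to that first row. This is not tableone's actual code; the helper name summarize_levels is made up for illustration, and the counts are the Ascites numbers from the original post.

```python
from collections import Counter


def summarize_levels(values, order=None, limit=None):
    """Return [(level, count, percent), ...] for one categorical variable.

    Hypothetical helper, not part of tableone. Levels listed in `order`
    come first, any remaining levels follow in sorted order, and `limit`
    keeps only the first N rows.
    """
    # Treat levels as strings so that order=["1"] matches the integer 1.
    levels = [str(v) for v in values]
    counts = Counter(levels)
    total = len(levels)

    first = [lvl for lvl in (order or []) if lvl in counts]
    rest = sorted(lvl for lvl in counts if lvl not in first)
    ranked = first + rest

    if limit is not None:
        ranked = ranked[:limit]

    return [(lvl, counts[lvl], round(100 * counts[lvl] / total, 1))
            for lvl in ranked]


# Ascites counts from the first post: 17 negative, 3 positive
ascites = [0] * 17 + [1] * 3

# Default: one row per level
print(summarize_levels(ascites))

# Positive level only: put "1" first, then keep a single row
print(summarize_levels(ascites, order=["1"], limit=1))
```

With limit alone, the row that survives depends on the default ordering of the levels, which is why the order argument is specified as well.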

christian-geier commented 4 years ago

Awesome, I'll try this out soon. Thanks for continuing to improve this!

tompollard / tableone

Option to display data for binary variables in only a single row #59