Open ghost opened 3 years ago
Hi @sgummidipundi, you've raised a good point, which is that there is no "continuous" argument. At the moment, tableone expects you to define the categorical variables using the "categorical" argument. Anything else is then treated as continuous. I can see how this is confusing, especially when (as in your case) there are no categorical variables.
If you don't specify which variables are categorical, then tableone attempts to guess (and, from your example, clearly doesn't do a great job!). In your example, you would need to provide an empty categorical argument. I've tried to recreate the example below:
# import packages
import pandas as pd
import tableone
# create sample dataframe
x = ([0.0] * 41639 +
[0.2] * 3 +
[0.25] * 1 +
[1] * 3 +
[10] * 806 +
[100] * 816 +
[1000] * 1488 +
[10000] * 57 +
[100000] * 3 +
[11000] * 2 +
[117000] * 7 +
[12] * 1 +
[1200] * 267 +
[12000] * 51)
data = pd.DataFrame(x, columns=["x"])
Based on the large number of observations and the limited number of unique values, tableone (incorrectly!) guesses that x is categorical:
t1 = tableone.TableOne(data)
print(t1.tabulate(tablefmt = "github"))
| | | Missing | Overall |
|---|---|---|---|
| n | | | 45144 |
| x, n (%) | 0.0 | 0 | 41639 (92.2) |
| | 0.2 | | 3 (0.0) |
| | 0.25 | | 1 (0.0) |
| | 1.0 | | 3 (0.0) |
| | 10.0 | | 806 (1.8) |
| | 100.0 | | 816 (1.8) |
| | 1000.0 | | 1488 (3.3) |
| | 10000.0 | | 57 (0.1) |
| | 100000.0 | | 3 (0.0) |
| | 11000.0 | | 2 (0.0) |
| | 117000.0 | | 7 (0.0) |
| | 12.0 | | 1 (0.0) |
| | 1200.0 | | 267 (0.6) |
| | 12000.0 | | 51 (0.1) |
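To see why a guess based on unique values can go wrong here, it helps to count them. The snippet below is only a diagnostic sketch in plain Python; it does not reproduce tableone's actual detection logic (which is not shown in this thread), and the `counts` dictionary is just a compact restatement of the sample data above.

```python
# Rebuild the sample values from the example above (value -> number of repeats).
counts = {0.0: 41639, 0.2: 3, 0.25: 1, 1: 3, 10: 806, 100: 816, 1000: 1488,
          10000: 57, 100000: 3, 11000: 2, 117000: 7, 12: 1, 1200: 267, 12000: 51}
x = [value for value, n in counts.items() for _ in range(n)]

n_obs = len(x)            # total number of observations
n_unique = len(set(x))    # number of distinct values

# Very few distinct values relative to the number of rows: the pattern that
# makes a numeric column look categorical to a simple heuristic.
print(n_obs, n_unique)    # prints: 45144 14
```

With only 14 distinct values across 45144 rows, the column looks a lot like a coded category even though it is stored as float64.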
With an empty categorical argument, x is treated as continuous:
t2 = tableone.TableOne(data, categorical=[])
print(t2.tabulate(tablefmt = "github"))
| | | Missing | Overall |
|---|---|---|---|
| n | | | 45144 |
| x, mean (SD) | | 0 | 93.5 (1764.8) |
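Since there is no "continuous" argument, one possible workaround is to start from the columns you know are continuous and derive the categorical list as everything else. The helper below is a hypothetical sketch (`categorical_from_continuous` is not part of tableone); it only builds the list, which you would then pass as the categorical argument.

```python
def categorical_from_continuous(columns, continuous):
    """Return the columns NOT listed as continuous, preserving their order.

    Hypothetical helper for illustration; not part of tableone.
    """
    continuous = set(continuous)
    unknown = continuous - set(columns)
    if unknown:
        raise ValueError(f"continuous columns not in data: {sorted(unknown)}")
    return [c for c in columns if c not in continuous]


columns = ["x"]  # in practice, something like list(data.columns)
categorical = categorical_from_continuous(columns, continuous=["x"])
print(categorical)  # prints: []

# The result could then be passed along, e.g.:
# t2 = tableone.TableOne(data, categorical=categorical)
```

For the single-column example this yields an empty list, which matches the empty categorical argument used above.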
Hello! Would just like to say fantastic package and great syntax for the function.
I seem to be having an issue with creating a table with continuous values. I'm sure I am probably doing something incorrectly on my end since it is basic functionality. When I try to do an easy example with a single continuous variable I get an output like below:
It is odd because it is clearly reading the variable as non-normal, as I specified (as indicated by the 'median [Q1, Q3]' label), but it seems to give only counts and frequencies, essentially treating it as categorical. I have also verified that the variable is of type float64. Are there any suggestions on how I can proceed and have it treated as a continuous measure?
Thanks in advance