tompollard / tableone

Create "Table 1" for research papers in Python
https://pypi.python.org/pypi/tableone/
MIT License

"Buffer dtype mismatch, expected 'Python object' but got 'unsigned long'" #85

Closed: ramseyjt closed this issue 4 years ago

ramseyjt commented 4 years ago

Hello,

I'm very new to Python. Tableone was working perfectly for me until I changed some of my columns to categorical. I looked around but was unable to find a solution on my own. Any help would be much appreciated.

Inputs:

data = df_03

columns = ['sex', 'race', 'eth', 'h_trans_stat','death_stat', 'LVAD_stat', 'diab_stat','afib_stat','smok_icd_stat','closest_bp_sys','closest_bp_dia','closest_bmi_val','closest_creat_val','closest_bnp_val','COPD_stat','closest_nyha_val','age_at_anchor','age_remove','age_diagnosis','hf_time','remove_stat','remove_binary','new_sex'] 

categorical = ['sex', 'race', 'eth', 'h_trans_stat','death_stat', 'LVAD_stat', 'diab_stat','afib_stat','smok_icd_stat','COPD_stat','closest_nyha_val','remove_stat','remove_binary','new_sex'] 

groupby = ['death_stat']

nonnormal = ['age_at_anchor','age_remove', 'age_diagnosis','hf_time']

labels={'race':'Race','em_hf_stat':'Heart Failure','closest_creat_val':'Serum Creatinine','closest_bnp_val':'Serum BNP','hf_time':'Years With HF','h_trans_stat':'Heart Transplant','closest_nyha_val':'NYHA','LVAD_stat':'LVAD','closest_bp_dia':'Diastolic BP','closest_bp_sys':'Systolic BP','death_stat': 'Mortality','sex':'Sex','eth':'Ethnicity','closest_bmi_val':'BMI','diab_stat':'Diabetic','afib_stat':'Afib','COPD_stat':'COPD'}

mytable = TableOne(data, columns, categorical, groupby, nonnormal, labels=labels, pval=False)

mytable

**output:**

ValueError                                Traceback (most recent call last)
<ipython-input-17-ac2a61275e0c> in <module>()
     16 labels={'race':'Race','em_hf_stat':'Heart Failure','closest_creat_val':'Serum Creatinine','closest_bnp_val':'Serum BNP','hf_time':'Years With HF','h_trans_stat':'Heart Transplant','closest_nyha_val':'NYHA','LVAD_stat':'LVAD','closest_bp_dia':'Diastolic BP','closest_bp_sys':'Systolic BP','death_stat': 'Mortality','sex':'Sex','eth':'Ethnicity','closest_bmi_val':'BMI','diab_stat':'Diabetic','afib_stat':'Afib','COPD_stat':'COPD'}
     17 
---> 18 mytable = TableOne(data, columns, categorical, groupby, nonnormal, labels=labels, pval=False)
     19 
     20 mytable

/opt/anaconda/anaconda3/lib/python3.6/site-packages/tableone.py in __init__(self, data, columns, categorical, groupby, nonnormal, pval, pval_adjust, isnull, ddof, labels, sort, limit, remarks, label_suffix)
    160         if self._continuous:
    161             self.cont_describe = self._create_cont_describe(data)
--> 162             self.cont_table = self._create_cont_table(data)
    163 
    164         # combine continuous variables and categorical variables into table 1

/opt/anaconda/anaconda3/lib/python3.6/site-packages/tableone.py in _create_cont_table(self, data)
    603         nulltable = data[self._continuous].isnull().sum().to_frame(name='isnull')
    604         try:
--> 605             table = table.join(nulltable)
    606         except TypeError: # if columns form a CategoricalIndex, need to convert to string first
    607             table.columns = table.columns.astype(str)

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in join(self, other, on, how, lsuffix, rsuffix, sort)
   6334         # For SparseDataFrame's benefit
   6335         return self._join_compat(other, on=on, how=how, lsuffix=lsuffix,
-> 6336                                  rsuffix=rsuffix, sort=sort)
   6337 
   6338     def _join_compat(self, other, on=None, how='left', lsuffix='', rsuffix='',

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in _join_compat(self, other, on, how, lsuffix, rsuffix, sort)
   6349             return merge(self, other, left_on=on, how=how,
   6350                          left_index=on is None, right_index=True,
-> 6351                          suffixes=(lsuffix, rsuffix), sort=sort)
   6352         else:
   6353             if on is not None:

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     60                          copy=copy, indicator=indicator,
     61                          validate=validate)
---> 62     return op.get_result()
     63 
     64 

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in get_result(self)
    579         result_data = concatenate_block_managers(
    580             [(ldata, lindexers), (rdata, rindexers)],
--> 581             axes=[llabels.append(rlabels), join_index],
    582             concat_axis=0, copy=self.copy)
    583 

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in append(self, other)
   2139         name = None if len(names) > 1 else self.name
   2140 
-> 2141         return self._concat(to_concat, name)
   2142 
   2143     def _concat(self, to_concat, name):

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/category.py in _concat(self, to_concat, name)
    774     def _concat(self, to_concat, name):
    775         # if calling index is category, don't check dtype of others
--> 776         return CategoricalIndex._concat_same_dtype(self, to_concat, name)
    777 
    778     def _concat_same_dtype(self, to_concat, name):

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/category.py in _concat_same_dtype(self, to_concat, name)
    781         ValueError if other is not in the categories
    782         """
--> 783         to_concat = [self._is_dtype_compat(c) for c in to_concat]
    784         codes = np.concatenate([c.codes for c in to_concat])
    785         result = self._create_from_codes(codes, name=name)

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/category.py in <listcomp>(.0)
    781         ValueError if other is not in the categories
    782         """
--> 783         to_concat = [self._is_dtype_compat(c) for c in to_concat]
    784         codes = np.concatenate([c.codes for c in to_concat])
    785         result = self._create_from_codes(codes, name=name)

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/category.py in _is_dtype_compat(self, other)
    238                 values = [values]
    239             other = CategoricalIndex(self._create_categorical(
--> 240                 self, other, categories=self.categories, ordered=self.ordered))
    241             if not other.isin(values).all():
    242                 raise TypeError("cannot append a non-category item to a "

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/category.py in _create_categorical(self, data, categories, ordered, dtype)
    165             from pandas.core.arrays import Categorical
    166             data = Categorical(data, categories=categories, ordered=ordered,
--> 167                                dtype=dtype)
    168         else:
    169             if categories is not None:

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    368 
    369         else:
--> 370             codes = _get_codes_for_values(values, dtype.categories)
    371 
    372         if null_mask.any():

/opt/anaconda/anaconda3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in _get_codes_for_values(values, categories)
   2431     (_, _), cats = _get_data_algo(categories, _hashtables)
   2432     t = hash_klass(len(cats))
-> 2433     t.map_locations(cats)
   2434     return coerce_indexer_dtype(t.lookup(vals), cats)
   2435 

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.StringHashTable.map_locations()

**ValueError: Buffer dtype mismatch, expected 'Python object' but got 'unsigned long'**

tompollard commented 4 years ago

@ramseyjt thanks for raising this issue. If possible, please could you post a sample of data that generates the error?

ramseyjt commented 4 years ago

[two images attached]

tompollard commented 4 years ago

Thanks, we can work it out this way. If possible, though, it would save us time if you could post a sample dataset that reproduces the issue, for example:

import pandas as pd

df = pd.DataFrame({'sex': [0, 1, 0, 0],
                   'age': [2, 3, 4, 5],
                   'age_remove': ["y", "n", "y", "n"]})

tompollard commented 4 years ago

I had a quick look at this issue, but I wasn't able to reproduce the error. Please could you post the tableone version number that you are using?

import tableone
print(tableone.__version__)

If you are not using version 0.6.1 or higher, then please run conda update tableone to update, then try again.
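
The same check can be scripted. This is only a minimal sketch, assuming Python 3.8 or later (for importlib.metadata) and plain numeric version strings such as 0.5.13. The helper names parse_version, needs_update and installed_version are made up for illustration and are not part of tableone:

```python
import re
from importlib import metadata

# Minimum version suggested in this thread.
MINIMUM = "0.6.1"


def parse_version(version):
    """Turn a string like '0.5.13' into the tuple (0, 5, 13)."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each piece; treat anything else as 0.
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def needs_update(installed, minimum=MINIMUM):
    """Return True if the installed version is older than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_version(package="tableone"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("tableone is not installed")
    elif needs_update(found):
        print(f"tableone {found} is older than {MINIMUM}; please update")
    else:
        print(f"tableone {found} is recent enough")
```

Comparing tuples of integers rather than raw strings matters here: as text, "0.10.0" sorts before "0.6.1", which would wrongly flag a newer release as outdated.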

ramseyjt commented 4 years ago

Tom,

I was running version 0.5.13 without realizing it. Updating solved the problem. I appreciate your help!

Best,

Jake

tompollard commented 4 years ago

Great, thanks for letting me know. Easy fix then :)