Open charlesbluca opened 3 years ago
**Is your feature request related to a problem? Please describe.**
Categorical column indexes exist in a weird place of quasi-support in cuDF: while it is possible to set a dataframe's column index to a `pd.CategoricalIndex` without any error or warning, the index cannot actually be recreated with `df.columns`, which contrasts with the behavior of Pandas.

This means that while there are user-facing issues that result from using cuDF's "categorical" column indexes (such as #7365), our ability to test for them is limited, because we cannot do the standard comparison to Pandas dataframes.
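The snippets originally attached to this description are not reproduced here. As a stand-in, the following minimal sketch shows only the Pandas side of the comparison: a `pd.CategoricalIndex` assigned to `df.columns` is returned as a `pd.CategoricalIndex`. The frame contents and column names are made up for illustration, and the cuDF side is deliberately left out.

```python
import pandas as pd

# Illustrative data; the column names "a" and "b" are arbitrary.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Replace the column index with a categorical one.
df.columns = pd.CategoricalIndex(["a", "b"])

# Pandas hands back the same kind of index that was assigned.
columns = df.columns
print(type(columns).__name__)  # CategoricalIndex
print(columns.dtype)           # category
```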
**Describe the solution you'd like**
After chatting with @shwina, it seems the ideal solution would be to use the individual categorical scalars, instead of their string names, as data when constructing the `ColumnAccessor` in the columns setter method. However, this isn't possible, as neither Pandas nor cuDF offers categorical scalars.

An alternative would be to have a boolean attribute, on either the dataframe or the `ColumnAccessor`, saying whether or not the column index is categorical; this could then be used by `ColumnAccessor.to_pandas_index()` to properly reconstruct the index with categories if needed. This would come with its own consequences, specifically either [...] `ColumnAccessor`
that is only used for dataframes.

**Describe alternatives you've considered**
A possible alternative that @shwina and I explored, but were unable to get working, is to pass specific kwargs to `assert_eq` such that it would only check the column index names, but not the index type. Despite passing different combinations of `check_categorical=False`, `check_column_type=False`, etc., we were unable to get a passing test when comparing these indexes.

**Additional context**
This issue came up while working on #8560, where added test cases would require this feature and needed to be xfailed.
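
To make the boolean-attribute alternative from the proposed solution more concrete, here is a rough sketch using a toy class. `ToyColumnAccessor` and its `from_pandas_index` helper are hypothetical and are not cuDF's actual `ColumnAccessor`; only the `to_pandas_index()` name is taken from the discussion above. The sketch also assumes the categories and ordering are stored alongside the flag, which goes slightly beyond a bare boolean, since a flag alone could not restore unused categories.

```python
import pandas as pd


class ToyColumnAccessor:
    """Hypothetical stand-in for illustration; not cuDF's ColumnAccessor."""

    def __init__(self, names, categories=None, ordered=False):
        self.names = tuple(names)
        # The flag proposed in the issue: is the column index categorical?
        self.is_categorical = categories is not None
        # Assumption: keep categories/order too, so the round trip is lossless.
        self._categories = categories
        self._ordered = ordered

    @classmethod
    def from_pandas_index(cls, index):
        # Record categorical metadata when the incoming index carries it.
        if isinstance(index, pd.CategoricalIndex):
            return cls(
                list(index),
                categories=list(index.categories),
                ordered=bool(index.ordered),
            )
        return cls(list(index))

    def to_pandas_index(self):
        # Rebuild a CategoricalIndex only when the flag says so.
        if self.is_categorical:
            return pd.CategoricalIndex(
                list(self.names),
                categories=self._categories,
                ordered=self._ordered,
            )
        return pd.Index(list(self.names))


original = pd.CategoricalIndex(["a", "b"], categories=["b", "a", "c"])
restored = ToyColumnAccessor.from_pandas_index(original).to_pandas_index()
print(restored.equals(original))
```

Whether such a flag would live on the dataframe or on the accessor is the open trade-off mentioned above; this sketch places it on the accessor purely for brevity.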