Open miguelusque opened 2 years ago
Thanks @miguelusque for raising this issue. ~I know that we've had trouble supporting the pandas MultiIndex behavior. I believe there was a proposal to drop MultiIndex support - how big of an impact would that be for the users you've worked with?~ To my surprise, it is cudf that is generating the MultiIndex - we should just return a simple Index instead!
Hi @GregoryKimball, thank you!
Please find below the original code that I was porting from Pandas to cuDF. Unfortunately, the .add_prefix() and .add_suffix() methods do not work with MultiIndex.
Original code:
df = df.to_pandas()
# Which departments has each user ordered products from?
df_ = pd.crosstab(df.user_id, df.department_id).add_prefix('user_department_').add_suffix('_freq')
feature_list.append(cudf.from_pandas(df_))
Workaround:
# Which departments has each user ordered products from?
df_ = cudf.crosstab(df.user_id, df.department_id)
df_.columns = ['user_department_' + str(c[0]) + '_freq' for c in df_.columns]
feature_list.append(df_)
Hope it helps!
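The workaround above indexes each label with c[0], which relies on the columns being a MultiIndex whose labels iterate as tuples. If cudf.crosstab() later returns a simple Index, as suggested in this thread, that indexing would no longer apply. A minimal sketch of a renaming helper that accepts both shapes is below. The name flatten_labels and its parameters are hypothetical and not part of cuDF or pandas, and it assumes the column labels iterate as tuples (MultiIndex) or scalars (simple Index), as they do in the workaround.

```python
def flatten_labels(columns, prefix="", suffix=""):
    """Build flat string names from column labels.

    Hypothetical helper, not a cuDF or pandas API.
    Tuple labels (as a MultiIndex yields) are joined with "_";
    scalar labels (as a simple Index yields) are used as-is.
    """
    names = []
    for label in columns:
        # Normalize scalars to 1-tuples so both cases share one code path.
        parts = label if isinstance(label, tuple) else (label,)
        names.append(prefix + "_".join(str(p) for p in parts) + suffix)
    return names


# Stand-ins for the labels of a one-level MultiIndex and of a simple Index.
multi_labels = [(1,), (4,), (7,)]
flat_labels = [1, 4, 7]

print(flatten_labels(multi_labels, "user_department_", "_freq"))
print(flatten_labels(flat_labels, "user_department_", "_freq"))
```

In the workaround this would be used as df_.columns = flatten_labels(df_.columns, 'user_department_', '_freq'), so the renaming would not depend on which index type crosstab returns.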
Describe the bug
Hi,
While porting some code from Pandas, I have noticed that the column types after cudf.crosstab() do not match the Pandas result. Please see a reproducer below:
Expected behavior
I would like the cuDF and Pandas results to match.