rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[FEA] Support "on" parameter in cudf.DataFrame.join #9512

Open krlng opened 2 years ago

krlng commented 2 years ago

Version used: cuml==21.8.2

Steps/Code to reproduce bug

import cudf

left = cudf.DataFrame([100, 101], columns=["item_id"])
right = cudf.DataFrame(["a", "b"], index=[100, 101], columns=["item_name"])

# This results in item_name containing only <NA>
left.join(right, on="item_id")

# This works as expected
left.to_pandas().join(right.to_pandas(), on="item_id")

# Workaround
left.merge(right, left_on="item_id", right_index=True)

beckernick commented 2 years ago

We don't currently support the on keyword for join, as noted in the docstring. I'm going to convert this issue to a feature request to support on.

Are you able to instead use merge for your use case per your workaround?
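
For anyone hitting the same limitation, the merge workaround can be wrapped in a small helper. This is only a sketch: `join_on` is a hypothetical name and not part of cuDF. It assumes the frame's `merge` accepts `left_on`, `right_index`, and `how`, as pandas does and as the workaround above suggests for cuDF. The default `how="left"` mirrors the pandas `join` default, whereas the bare `merge` call in the workaround uses an inner join.

```python
def join_on(left, right, on, how="left"):
    """Hypothetical helper (not a cuDF API): emulate pandas-style
    left.join(right, on=...) by matching a column of `left`
    against the index of `right` through merge."""
    return left.merge(right, left_on=on, right_index=True, how=how)


# Intended usage with the frames from the report (requires cuDF and a GPU):
#   joined = join_on(left, right, on="item_id")
```

The helper only forwards to `merge`, so it should work unchanged on pandas DataFrames as well, which makes it easy to compare results between the two libraries.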

krlng commented 2 years ago

Yes, sorry, I did not see that comment in the docstring.
