Closed Make42 closed 5 years ago
Try creating a single new key column which is a combination of the key columns, then join on this new key column.
Does this work?
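A minimal sketch of that workaround in plain pandas, assuming the two frames from the report. The helper column name `key` is made up for illustration, and the frames are built with the `pd.DataFrame` dict constructor instead of `from_items`:

```python
import pandas as pd

# Frames mirroring the ones in the report.
a_df = pd.DataFrame({"one": [1, 2, 3], "two": ["a", "b", "c"]})
b_df = pd.DataFrame({"three": [1, 2, 3], "four": ["d", "e", "f"]})

# Copy each side's key column into a shared, identically named column.
# "key" is a hypothetical name chosen for this sketch.
a_keyed = a_df.assign(key=a_df["one"])
b_keyed = b_df.assign(key=b_df["three"])

# Join on the single shared key column.
joined = a_keyed.merge(b_keyed, on="key", how="inner")
print(joined)
```

If the key spans several columns on each side, the same idea should apply by building `key` from all of them (for example by concatenating their string forms) before joining.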
On Tue, 31 Jul 2018 16:04 Make42, notifications@github.com wrote:
I think joining on different columns does not work. By that I mean
a_df = pd.DataFrame.from_items([('one', [1,2,3]),('two',['a','b','c'])])
b_df = pd.DataFrame.from_items([('three', [1,2,3]),('four',['d','e','f'])])
a_df >> inner_join(b_df,by=['one','three'])
gives the error
File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'one'
and
a_df >> inner_join(b_df,by=[['one'],['three']])
gives
IndexError: list index out of range
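For comparison, here is a sketch of what the join is presumably meant to return, written with plain pandas `merge` and its `left_on`/`right_on` parameters. It assumes the goal is to match `one` in the left frame against `three` in the right frame, and it says nothing about how dfply handles `by` internally:

```python
import pandas as pd

# Same data as in the report, built with the dict constructor.
a_df = pd.DataFrame({"one": [1, 2, 3], "two": ["a", "b", "c"]})
b_df = pd.DataFrame({"three": [1, 2, 3], "four": ["d", "e", "f"]})

# Inner join matching a_df["one"] against b_df["three"].
expected = a_df.merge(b_df, how="inner", left_on="one", right_on="three")
print(expected)
```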
This was indeed a bug. It should be fixed now; pull down the master branch and check it out, and let me know if you have additional issues.
Thank you! Please push to Anaconda if possible.
I can confirm that the issue still occurs in 0.3.3.