javascriptdata / danfojs

Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
https://danfo.jsdata.org/
MIT License
4.81k stars, 209 forks

How to add, multiply, etc. dataframes considering multi column index? #496

Open stefaneidelloth opened 2 years ago

stefaneidelloth commented 2 years ago

Let's say I have two data frames that use two id columns to identify the rows:

A:

| id_foo | id_baa | value |
| --- | --- | --- |
| 1 | 1 | 10 |
| 1 | 2 | 20 |
| 2 | 1 | 30 |
| 2 | 2 | 40 |

B:

| id_foo | id_baa | value |
| --- | --- | --- |
| 1 | 1 | 100 |
| 1 | 2 | 200 |
| 2 | 1 | 300 |
| 2 | 2 | 400 |

=> What is the recommended way to add those two dataframes, so that the rows are matched by id_foo and id_baa and the entries of the value columns are added?

Expected result:

C:

| id_foo | id_baa | value |
| --- | --- | --- |
| 1 | 1 | 110 |
| 1 | 2 | 220 |
| 2 | 1 | 330 |
| 2 | 2 | 440 |

I tried to create a multi-column index with:

a.setIndex({columns: ['id_foo', 'id_baa'], inplace: true, drop: true});

However, setIndex seems to support only a single column (?): https://danfo.jsdata.org/api-reference/dataframe/dataframe.set_index

Related: https://github.com/javascriptdata/danfojs/issues/101

risenW commented 2 years ago

Multi-column indexes are not supported at the moment.
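Since a multi-column index is not available, one possible workaround is to match the rows by a composite key outside of the DataFrame. The following is only a minimal sketch in plain JavaScript, assuming the rows of A and B are available as arrays of plain objects. The helpers `compositeKey` and `addByKeys` are hypothetical names introduced for illustration and are not part of the Danfo.js API.

```javascript
// Hypothetical helper (not part of Danfo.js): builds one string key
// from the values of several id columns of a row.
function compositeKey(row, keys) {
  return JSON.stringify(keys.map((k) => row[k]));
}

// Hypothetical helper (not part of Danfo.js): adds `valueColumn` for rows of
// `left` and `right` that agree on all `keys`. Rows of `left` without a
// match in `right` are skipped (inner-join semantics).
function addByKeys(left, right, keys, valueColumn) {
  const rightByKey = new Map();
  for (const row of right) {
    rightByKey.set(compositeKey(row, keys), row);
  }

  const result = [];
  for (const row of left) {
    const match = rightByKey.get(compositeKey(row, keys));
    if (match === undefined) continue;
    result.push({ ...row, [valueColumn]: row[valueColumn] + match[valueColumn] });
  }
  return result;
}

// Example data taken from the tables in the question.
const a = [
  { id_foo: 1, id_baa: 1, value: 10 },
  { id_foo: 1, id_baa: 2, value: 20 },
  { id_foo: 2, id_baa: 1, value: 30 },
  { id_foo: 2, id_baa: 2, value: 40 },
];
const b = [
  { id_foo: 2, id_baa: 2, value: 400 }, // row order does not need to match
  { id_foo: 1, id_baa: 1, value: 100 },
  { id_foo: 1, id_baa: 2, value: 200 },
  { id_foo: 2, id_baa: 1, value: 300 },
];

const c = addByKeys(a, b, ["id_foo", "id_baa"], "value");
console.log(c);
```

The resulting array of row objects keeps the row order of `a`. If Danfo.js is loaded as `dfd`, it should be possible to turn the result back into a frame with `new dfd.DataFrame(c)`.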

kitfit-dave commented 1 year ago

Question: does this even work with a single column index at the moment? I.e. is the pandas DataFrame.add behaviour where it will add values from rows by matching on their row index (where there can be holes in one frame or the other) implemented?

edit: oh sorry, I did just notice #101 - so that is explained there.
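For reference, the index-aligned addition described above (matching rows by key even when one frame has holes) can be illustrated in plain JavaScript. This is only a sketch of the general idea, not of how pandas or Danfo.js implement it. The function `alignedAdd` and its `fillValue` parameter are hypothetical names chosen for this example.

```javascript
// Hypothetical helper (not part of Danfo.js): adds `valueColumn` of two row
// arrays, aligned on a single `key` column. Keys present in only one input
// use `fillValue` for the missing side (NaN by default, so the sum is NaN).
function alignedAdd(left, right, key, valueColumn, fillValue = NaN) {
  const leftByKey = new Map(left.map((r) => [r[key], r[valueColumn]]));
  const rightByKey = new Map(right.map((r) => [r[key], r[valueColumn]]));

  // Union of keys, in order of first appearance (left first, then right).
  const allKeys = new Set([...leftByKey.keys(), ...rightByKey.keys()]);

  return [...allKeys].map((k) => {
    const l = leftByKey.has(k) ? leftByKey.get(k) : fillValue;
    const r = rightByKey.has(k) ? rightByKey.get(k) : fillValue;
    return { [key]: k, [valueColumn]: l + r };
  });
}

// Made-up example data: id 1 is missing on the right, id 4 on the left.
const left = [
  { id: 1, value: 10 },
  { id: 2, value: 20 },
  { id: 3, value: 30 },
];
const right = [
  { id: 2, value: 200 },
  { id: 3, value: 300 },
  { id: 4, value: 400 },
];

const withNaN = alignedAdd(left, right, "id", "value");
const withZero = alignedAdd(left, right, "id", "value", 0);
console.log(withNaN);
console.log(withZero);
```

With the default `fillValue`, the unmatched ids 1 and 4 end up with `NaN`. With a `fillValue` of 0 they keep the value from the frame that contains them.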