javascriptdata / danfojs

Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
https://danfo.jsdata.org/
MIT License

Dataframes Merge Issue #45

Closed Pl-Mrcy closed 4 years ago

Pl-Mrcy commented 4 years ago

Joining two dataframes with dfd.merge fails whenever the left dataframe has more rows than the right dataframe (regardless of whether the join is left, right, inner or outer).

Similarly, the output of a right join (with the longer dataframe on the right) is wrong: it contains only as many rows as the left dataframe.

A basic example that ends up failing on my machine:

const dfd = require("danfojs-node");

const arrayShort = [
    {
        label: "ABC",
        value: 1
    },
    {
        label: "DEF",
        value: 2
    },
    {
        label: "GHI",
        value: 3
    }
];
const dfShort = new dfd.DataFrame(arrayShort);

const arrayLong = [
    {
        label: "ABC",
        value2: 4
    },
    {
        label: "DEF",
        value2: 5
    },
    {
        label: "JKL",
        value2: 6
    },
    {
        label: "MNO",
        value2: 7
    }
];
const dfLong = new dfd.DataFrame(arrayLong);

console.log("Short: ");
dfShort.print();

console.log("Long: ");
dfLong.print();

const howStr = "left";
console.log("Long as right dataframe");
try {
    dfd.merge({
        left: dfShort,
        right:dfLong,
        on: ["label"],
        how: howStr
    }).print()
    console.log("Merge Succeeded");
} catch (err) {
    console.log("Merge failed");
    console.log(err);
}

console.log("Long as left dataframe");
try {
    dfd.merge({
        left: dfLong,
        right:dfShort,
        on: ["label"],
        how: howStr
    }).print()
    console.log("Merge Succeeded");
} catch (err) {
    console.log("Merge failed");
    console.log(err);
}

The second attempt to join fails.

Merge failed
TypeError: Cannot read property '0' of undefined

steveoni commented 4 years ago

I was not able to reproduce your error on Dnotebook.

What version of the library are you using?

Pl-Mrcy commented 4 years ago

I was able to reproduce it using Dnotebook.

const arrayShort = [
    {
        label: "ABC",
        value: 1
    },
    {
        label: "DEF",
        value: 2
    },
    {
        label: "GHI",
        value: 3
    }
];
const dfShort = new dfd.DataFrame(arrayShort);
const arrayLong = [
    {
        label: "ABC",
        value2: 4
    },
    {
        label: "DEF",
        value2: 5
    },
    {
        label: "JKL",
        value2: 6
    },
    {
        label: "MNO",
        value2: 7
    }
];
const dfLong = new dfd.DataFrame(arrayLong);

// 1
let howStr = "left"
let tmp = dfd.merge({
  left: dfLong,
  right:dfShort,
  on: ["label"],
  how: howStr
})
table(tmp.head());

// 2
howStr = "right"
tmp = dfd.merge({
  left: dfShort,
  right:dfLong,
  on: ["label"],
  how: howStr
})
table(tmp.head());

These two attempts give an error (TypeError: i is undefined) and a wrong output, respectively.

I use danfojs-node@0.1.5
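
For reference, the results I would expect from the two calls above can be sketched in plain JavaScript, without danfojs. `referenceJoin` is a hypothetical helper written only for this illustration; it is not part of the library, and it assumes the key values are unique on each side. Missing values are filled with `null` here, which may differ from what danfojs uses.

```javascript
// Hypothetical reference helper (NOT part of danfo.js): joins two arrays of
// row objects on a single key. Assumes key values are unique on each side.
function referenceJoin(left, right, key, how) {
  const leftByKey = new Map(left.map((row) => [row[key], row]));
  const rightByKey = new Map(right.map((row) => [row[key], row]));

  // Pick which key values appear in the result, depending on the join type.
  let keys;
  if (how === "left") keys = [...leftByKey.keys()];
  else if (how === "right") keys = [...rightByKey.keys()];
  else if (how === "inner") keys = [...leftByKey.keys()].filter((k) => rightByKey.has(k));
  else if (how === "outer") keys = [...new Set([...leftByKey.keys(), ...rightByKey.keys()])];
  else throw new Error(`Unsupported join type: ${how}`);

  // Every column seen on either side; cells with no match become null.
  const columns = [...new Set([...left, ...right].flatMap((row) => Object.keys(row)))];
  const blank = Object.fromEntries(columns.map((c) => [c, null]));

  return keys.map((k) => ({ ...blank, ...leftByKey.get(k), ...rightByKey.get(k) }));
}

const arrayShort = [
  { label: "ABC", value: 1 },
  { label: "DEF", value: 2 },
  { label: "GHI", value: 3 }
];
const arrayLong = [
  { label: "ABC", value2: 4 },
  { label: "DEF", value2: 5 },
  { label: "JKL", value2: 6 },
  { label: "MNO", value2: 7 }
];

// Case 1: left join with the longer array on the left.
const leftJoin = referenceJoin(arrayLong, arrayShort, "label", "left");
// Case 2: right join with the longer array on the right.
const rightJoin = referenceJoin(arrayShort, arrayLong, "label", "right");

console.log(leftJoin.length, rightJoin.length); // both should have 4 rows
console.log(leftJoin);
```

In both cases every row of the longer array should survive, so the result should have four rows, with `value` left empty for "JKL" and "MNO".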

steveoni commented 4 years ago

Thanks, I was able to reproduce the error. It's now fixed.

kickbox commented 4 years ago

May I know when this fix will be released on npm for the Node package?