data-forge / data-forge-ts

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
http://www.data-forge-js.com/
MIT License

Columns are badly formatted #162

Closed triposat closed 1 year ago

triposat commented 1 year ago

When I run the code below, the columns in the output are badly formatted and the titles don't line up with the columns. Why is that?

Code:

// Import modules
const dataForge = require("data-forge");

// Load the data into a data frame
const dataFrame = new dataForge.DataFrame({
  columnNames: ["date", "product", "price", "quantity"],
  rows: [
    ["01-03-2023", "Apples", 3, 10],
    ["02-03-2023", "Bananas", 2, 20],
    ["03-03-2023", "Oranges", 4, 5],
    ["05-03-2023", "Strawberries", 8, 8],
    ["06-03-2023", "Blueberries", 7, 12],
    ["09-03-2023", "Grapes", 9, 5],
    ["10-03-2023", "Kiwis", 8, 15],
    ["15-03-2023", "Pineapples", 7, 15],
    ["03-04-2023", "Papayas", 6, 20],
  ],
});

// Group the data by column 'product' and count the values
const groupedData = dataFrame.groupBy((row) => row.product);

// Display grouped data
console.log(groupedData.toString());

Output:

The columns are badly formatted, with titles not matching the columns:

[screenshot: 7iNyCVi]

ashleydavis commented 1 year ago

Thanks for reporting the issue. I'm accepting PRs if you'd like to try correcting the problem.

triposat commented 1 year ago

I tried but was unable to find a solution.

ashleydavis commented 1 year ago

I figured out what the problem is!

The thing is that after you call groupBy the result is a Series of DataFrames (one DataFrame per group), not just a DataFrame, so calling toString on it is bound to look wacky.

There are two ways to fix this. The first is that if you don't do the groupBy, the table looks fine:

// Import modules
const dataForge = require("data-forge");

// Load the data into a data frame
const dataFrame = new dataForge.DataFrame({
  columnNames: ["date", "product", "price", "quantity"],
  rows: [
    ["01-03-2023", "Apples", 3, 10],
    ["02-03-2023", "Bananas", 2, 20],
    ["03-03-2023", "Oranges", 4, 5],
    ["05-03-2023", "Strawberries", 8, 8],
    ["06-03-2023", "Blueberries", 7, 12],
    ["09-03-2023", "Grapes", 9, 5],
    ["10-03-2023", "Kiwis", 8, 15],
    ["15-03-2023", "Pineapples", 7, 15],
    ["03-04-2023", "Papayas", 6, 20],
  ],
});

// Display the data
console.log(dataFrame.toString());

Output:

__index__  date        product       price  quantity
---------  ----------  ------------  -----  --------
0          01-03-2023  Apples        3      10
1          02-03-2023  Bananas       2      20
2          03-03-2023  Oranges       4      5
3          05-03-2023  Strawberries  8      8
4          06-03-2023  Blueberries   7      12
5          09-03-2023  Grapes        9      5
6          10-03-2023  Kiwis         8      15
7          15-03-2023  Pineapples    7      15
8          03-04-2023  Papayas       6      20

ashleydavis commented 1 year ago

The other thing you can do is to summarise or aggregate your groups then convert them back to a DataFrame:

// Import modules
const dataForge = require("data-forge");

// Load the data into a data frame
const dataFrame = new dataForge.DataFrame({
  columnNames: ["date", "product", "price", "quantity"],
  rows: [
    ["01-03-2023", "Apples", 3, 10],
    ["02-03-2023", "Bananas", 2, 20],
    ["03-03-2023", "Oranges", 4, 5],
    ["05-03-2023", "Strawberries", 8, 8],
    ["06-03-2023", "Blueberries", 7, 12],
    ["09-03-2023", "Grapes", 9, 5],
    ["10-03-2023", "Kiwis", 8, 15],
    ["15-03-2023", "Pineapples", 7, 15],
    ["03-04-2023", "Papayas", 6, 20],
  ],
});

// Group the data by the 'product' column
const groupedData = dataFrame.groupBy((row) => row.product);

// Aggregate each group into a single summary row, then inflate
// the summaries back into a DataFrame.
const aggregatedData = groupedData.inflate((group) => {
  return {
    product: group.first().product,
    count: group.count(),
    averagePrice: group.deflate((row) => row.price).average(),
  };
});

console.log(aggregatedData.toString());

You'll find the table looks OK then:

__index__  product       count  averagePrice
---------  ------------  -----  ------------
0          Apples        1      3
1          Bananas       1      2
2          Oranges       1      4
3          Strawberries  1      8
4          Blueberries   1      7
5          Grapes        1      9
6          Kiwis         1      8
7          Pineapples    1      7
8          Papayas       1      6

ashleydavis commented 1 year ago

Thanks for logging the issue.