javascriptdata / danfojs

Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
https://danfo.jsdata.org/
MIT License

Concat behaviour differs significantly from pandas (column order dependent) #641

Open bml1g12 opened 7 months ago

Describe the bug

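DataFrame creation from an array of objects depends on the key order of each object, unlike pandas. If one row lists its keys in a different order from the others, its values end up under the wrong columns.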

Definition of an Object from ECMAScript Third Edition:

4.3.3 Object
An object is a member of the type Object. It is an unordered collection of properties each of which contains a primitive value, object, or function. A function stored in a property of an object is called a method.
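
To make the link to the bug concrete, here is a minimal sketch in plain JavaScript (no Danfo.js required; the variable names are illustrative only). Current engines enumerate non-integer string keys in insertion order, so two objects with identical properties can still enumerate differently. Whether Danfo.js reads row values positionally in exactly this way is an assumption based on the output in this report, not something confirmed from its source.

```javascript
// Two rows with the same keys, written in a different order.
const rowA = { agentId: "ben", metricValue: 3, metricId: "callCount" };
const rowB = { agentId: "karl", metricId: "callCount", metricValue: 4 };

// Keys come back in insertion order, so a positional read of the
// values sees a different layout for each row.
const keysA = Object.keys(rowA);
const keysB = Object.keys(rowB);
console.log(keysA.join(","), "|", keysB.join(","));

// Looking values up by key, using the first row's keys as the column
// list, gives the same layout regardless of how rowB was written.
const alignedB = keysA.map((key) => rowB[key]);
console.log(alignedB); // [ 'karl', 4, 'callCount' ]
```

Aligning by key rather than by position is what makes the result independent of how each object literal was typed.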

To Reproduce

Danfo.js v1.1.2

const json_data_3d = [
  { campaignId: "toyota", agentId: "bob", metricValue: 1, metricId: "callCount" },
  { campaignId: "toyota", agentId: "jim", metricValue: 2, metricId: "callCount" },
  { campaignId: "sony", agentId: "ben", metricValue: 3, metricId: "callCount" },
  // Same keys as the rows above, but metricId is listed before metricValue.
  { campaignId: "sony", agentId: "karl", metricId: "callCount", metricValue: 4 },
];
const dfone = new dfd.DataFrame(json_data_3d);
dfone.print()
╔════════════╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
║            │ campaignId        │ agentId           │ metricValue       │ metricId          ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 0          │ toyota            │ bob               │ 1                 │ callCount         ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 1          │ toyota            │ jim               │ 2                 │ callCount         ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 2          │ sony              │ ben               │ 3                 │ callCount         ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 3          │ sony              │ karl              │ callCount         │ 4                 ║
╚════════════╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╝
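
Until this is addressed in the library, one possible workaround is to rebuild every row with a fixed key order before constructing the DataFrame. The sketch below is an illustration only: `normalizeKeyOrder` is a hypothetical helper, not part of Danfo.js, and it assumes the first row carries the full set of column names. A key missing from a row becomes `undefined`.

```javascript
// Hypothetical helper (not a Danfo.js API): rebuild each row so that its
// keys follow one shared order, taken from the first row by default.
function normalizeKeyOrder(rows, columns = Object.keys(rows[0] ?? {})) {
  return rows.map((row) =>
    Object.fromEntries(columns.map((key) => [key, row[key]]))
  );
}

const rows = [
  { campaignId: "sony", agentId: "ben", metricValue: 3, metricId: "callCount" },
  { campaignId: "sony", agentId: "karl", metricId: "callCount", metricValue: 4 },
];

const normalized = normalizeKeyOrder(rows);
console.log(Object.keys(normalized[1]));

// Assuming `dfd` is loaded as in the reproduction above, the normalized
// rows can then be passed to the constructor:
// const dfFixed = new dfd.DataFrame(normalized);
```

This keeps the fix on the caller's side and leaves the values untouched; only the order in which each row's keys are enumerated changes.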

Expected behavior

import pandas as pd

json_data_3d = [
    {"campaignId": "toyota", "agentId": "bob", "metricValue": 1, "metricId": "callCount"},
    {"campaignId": "toyota", "agentId": "jim", "metricValue": 2, "metricId": "callCount"},
    {"campaignId": "sony", "agentId": "ben", "metricValue": 3, "metricId": "callCount"},
    # Keys swapped as in the Danfo.js repro; pandas aligns values by key, not by position.
    {"campaignId": "sony", "agentId": "karl", "metricId": "callCount", "metricValue": 4},
]
dfone = pd.DataFrame(json_data_3d)
print(dfone)
  campaignId agentId  metricValue   metricId
0     toyota     bob            1  callCount
1     toyota     jim            2  callCount
2       sony     ben            3  callCount
3       sony    karl            4  callCount