zemirco / json2csv

Convert json to csv with column titles
http://zemirco.github.io/json2csv
MIT License

Headers as column #296

Closed dzmitryafanasenka closed 6 years ago

dzmitryafanasenka commented 6 years ago

Hello. I'm using json2csv v4.1.2 as a JS module with Node.js v10.0.0. Can I output the headers as a column instead of a row?

Now:

| Car   | Color |
| Audi  | Red   |
| Mazda | White |

What I want:

| Car   | Audi | Mazda |
| Color | Red  | White |

My code:

const Json2csvParser = require('json2csv').Parser;
const json2csvParser = new Json2csvParser({ fields, unwind: ['car', 'color'] });
const csv = json2csvParser.parse(json);

knownasilya commented 6 years ago

What's the use case for this format?

dzmitryafanasenka commented 6 years ago

@knownasilya It's a customer requirement. He wants a CSV laid out as Field - Values, Field - Values. Never mind, I ended up doing it without json2csv.

juanjoDiaz commented 6 years ago

Hi @dzmitryafanasenka,

That's not possible out of the box, since JSON stores data by row, not by column; i.e. each object in the array naturally represents a row, not a column.

Also, imagine a dataset with 100k records. Transposed as you suggest, each CSV row would be far too long to be readable.

However, everything is possible with a bit of preprocessing...

const Json2csvParser = require('json2csv').Parser;

const data = [
  { car: 'Audi', color: 'Red' },
  { car: 'Mazda', color: 'White' }
];

const fields = ['car', 'color'];
// You can also infer them automatically
// const fields = Object.keys(data[0]);

// Transpose the data: one output row per field, with the field
// name in column 0 and its values in columns 1..n.
const processedData = fields
  .map(key => data.reduce((acc, elem, i) => {
    acc[i + 1] = elem[key];
    return acc;
  }, { 0: key }));

const json2csvParser = new Json2csvParser({ header: false });
const csv = json2csvParser.parse(processedData);
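
With the sample data above, processedData holds two row objects keyed by column index, so the resulting CSV should come out as follows (json2csv wraps string values in double quotes by default):

"car","Audi","Mazda"
"color","Red","White"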

Hope that helps! Feel free to ask if anything is unclear.

I'll close this but please feel free to reopen if you think that something else could be done.

jacobq commented 5 years ago

I have seen this before too, usually described as "column-oriented" / "orientation = column". It's a pain in my opinion because it doesn't lend itself nicely to streaming, and, if I understand the spec properly, it also isn't RFC 4180 compliant. Perhaps it is preferred for human-readability reasons when the number of records is much smaller than the number of fields?

My usual approach in these cases is to treat everything as row-oriented until the very end (e.g. just before writing to disk). At that point, if needed, the data can be transposed, e.g. with simple for loops, or perhaps with csv-transpose if the CSV has already been flushed; a rough sketch follows below. While this is less efficient, it keeps the code fairly neat, and in my case the performance hit is nearly always imperceptible because the datasets I work with are relatively small.
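
For illustration, a minimal version of that transpose step; this assumes the data is an array of arrays that already includes the header row, and the transpose helper is just a sketch, not part of json2csv:

// Output row `col` collects element `col` from every input row.
function transpose(rows) {
  return rows[0].map((_, col) => rows.map(row => row[col]));
}

const rows = [
  ['car', 'color'],
  ['Audi', 'Red'],
  ['Mazda', 'White']
];

const columns = transpose(rows);
// => [['car', 'Audi', 'Mazda'], ['color', 'Red', 'White']]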