zemirco / json2csv

Convert json to csv with column titles
http://zemirco.github.io/json2csv
MIT License
2.72k stars 363 forks source link

PowerShell skips first and last double quote of each line #362

Closed schematis closed 5 years ago

schematis commented 5 years ago

Package Version: 4.3.5 Node Version: 10.15.0 (Windows 10 and Ubuntu 18.04 running in WSL)

I created a command line tool to generate fake randomized person/address data and output to a CSV. Excel in Windows opens the file correctly when I run the command from Linux, but when I run the command from PowerShell on Windows 10, it drops the first double quote in each row and everything is shoved into a single column.

I've tried using various combinations of setting the withBom and excelStrings parameters, but neither has had any effect.

Code: https://github.com/schematis/data-randomizer/blob/master/lib/generate.js

juanjoDiaz commented 5 years ago

Quick googling tells me that it's not a problem with this library but a problem with PowerShell (https://stackoverflow.com/a/6714242).

I'm not a PowerShell user and I don't' expect many people to be using this with PowerShell since no one has noticed this until now. (I'd be happy to be proven wrong and understand why someone would use PowerShell 🙂)

Maybe you can try detecting if the caller is PowerShell (no clue of how) and escape the quotes by setting the doubleQuote option to ""? But maybe that would affect the double quotes that are already escaped? Or you could try some of the other proposed solutions on StackOverflow.

Sorry I can't be much help but I don't think that json2csv should include patches specific to a concrete system (like PowerShell). I'm happy to hear suggestions or receive PRs if you can thing of a better solution.

schematis commented 5 years ago

I solved it by making it a call to fs.writeFile instead of piping stdout to the csv directly.

My tests indicate that this is probably related to the utf-8 encoding issue that others have mentioned in past issues (specifically, opening the file in VSCode and re-saving as utf-8 fixes the issue, also fs.writeFile saves as utf-8 by default which supports the theory).

A suggestion would be to add a note to the readme regarding this behavior and maybe tweaking the examples to use fs.writeFile instead of console.log for writing to file.

juanjoDiaz commented 5 years ago

Great that you solved it!

The problem that has been reported in the past has to do with excel not supporting UTF-8 at all. Similarly, I think that the problem here is that PowerShell is doing the wrong escaping when printing to console (as reported in many online discussions).

But I agree that it would be good to document this PowerShell behaviour. I'll do so!

I'll close this then! Feel free to re-open if you think that there is something still to be done here.