mafintosh / csv-parser

Streaming csv parser inspired by binary-csv that aims to be faster than everyone else
MIT License

Leading zero-width no-break space in keys #140

Closed James-Quigley closed 4 years ago

James-Quigley commented 4 years ago

Expected Behavior

With a CSV looking like:

name,start_date,end_date
some - name,2019-08-25,2019-12-31
another - name,2019-08-05,2019-12-31

After parsing, accessing row.name should return a string.

Actual Behavior

The row doesn't actually have a name property. Its key is name preceded by the Unicode character 65279 (U+FEFF), a zero-width no-break space. I created the CSV by hand, so I have no idea why that character would be in there.
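A minimal sketch of why the lookup fails and how to make the hidden character visible. The `row` object is a hand-written stand-in for the reported output, and `showHidden` is a hypothetical helper, not part of csv-parser:

```javascript
// Hypothetical helper: render every non-printable or non-ASCII character
// as a \uXXXX escape so it shows up in logs.
function showHidden(text) {
  return text.replace(/[^\x20-\x7e]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

// Assumed shape of a parsed row when the file starts with U+FEFF.
const row = { '\uFEFFname': 'some - name' };
const firstKey = Object.keys(row)[0];

console.log(row.name);               // undefined
console.log(firstKey.charCodeAt(0)); // 65279
console.log(showHidden(firstKey));   // \ufeffname
```

The key looks like `name` when printed, but it is one character longer, so `row.name` finds nothing.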

How Do We Reproduce?

const fs = require('fs');
const parse = require('csv-parser'); // csv-parser, imported as `parse`

fs.createReadStream(input_path)
  .pipe(parse())
  .on('data', (data) => console.log(data.name)) // undefined :(
  .on('end', () => {});

shellscape commented 4 years ago

Please use JSON.stringify on each row and paste the result here. Chances are your editor is inserting a Byte Order Mark (BOM), in which case you should strip that character before sending data to csv-parser, but let's have a look at that JSON first.
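One way to strip the character before the data reaches csv-parser is a small transform stream placed in front of it. This is only a sketch: it assumes the file is UTF-8, so the BOM is the byte sequence EF BB BF, and `dropLeadingBom` and `createBomStripper` are hypothetical names rather than anything csv-parser provides.

```javascript
const { Transform } = require('stream');

const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Return the buffer without a leading UTF-8 BOM, if one is present.
function dropLeadingBom(buffer) {
  const hasBom = buffer.subarray(0, UTF8_BOM.length).equals(UTF8_BOM);
  return hasBom ? buffer.subarray(UTF8_BOM.length) : buffer;
}

// Transform stream that inspects only the very start of the input.
function createBomStripper() {
  let head = Buffer.alloc(0); // bytes held back until 3 are available
  let done = false;
  return new Transform({
    transform(chunk, _encoding, callback) {
      if (done) return callback(null, chunk);
      head = Buffer.concat([head, chunk]);
      if (head.length < UTF8_BOM.length) return callback(); // wait for more bytes
      done = true;
      callback(null, dropLeadingBom(head));
    },
    flush(callback) {
      // Input ended before 3 bytes arrived, so it cannot contain a BOM.
      if (!done && head.length > 0) this.push(head);
      callback();
    }
  });
}

// Intended use, with csv-parser imported as `parse`:
// fs.createReadStream(input_path).pipe(createBomStripper()).pipe(parse())
```

Only the first three bytes are ever examined, so the rest of the file passes through untouched.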

James-Quigley commented 4 years ago

I believe that is the case. I'm using VS Code. Is there a way to not save the BOM? Or do I just need to strip that leading character?

shellscape commented 4 years ago

I'm not a VS Code user, so I'm afraid you'd have to ask those folks. But you can use intermediate tools like https://github.com/sindresorhus/strip-bom and https://github.com/sindresorhus/strip-bom-cli to do that as part of the process.

fedpettinella commented 4 years ago

I ran into this issue as well. The CSVs were created using Excel: first saved as xlsx, then as CSV. Printing the JSON.stringify() output showed the header as ' USER_NAME' instead of the expected 'USER_NAME'. I wasn't able to get around this by modifying the CSV file, but I could instead trim the headers using the included mapHeaders option:

.pipe(csv({
  mapHeaders: ({ header, index }) => header.trim()
}))
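The trim approach works because JavaScript's `String.prototype.trim()` treats U+FEFF as white space. If other whitespace in headers should be left alone, a narrower variant could remove only a BOM on the first header. The sketch below uses a hypothetical helper name, `stripBomFromHeader`, and takes the same `{ header, index }` argument shape shown above:

```javascript
// Hypothetical helper (not part of csv-parser): remove a leading U+FEFF
// from the first header only and leave every other header unchanged.
function stripBomFromHeader({ header, index }) {
  if (index === 0 && header.charCodeAt(0) === 0xfeff) {
    return header.slice(1);
  }
  return header;
}

// trim() also removes U+FEFF, which is why the workaround above succeeds.
console.log('\uFEFFUSER_NAME'.trim() === 'USER_NAME'); // true
console.log(stripBomFromHeader({ header: '\uFEFFUSER_NAME', index: 0 })); // USER_NAME
```

It would be passed in place of the arrow function, as `mapHeaders: stripBomFromHeader`.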

emmanuel-a-g commented 3 years ago

this is exactly what I needed, thank you! @fedpettinella