szwacz / fs-jetpack

Better file system API for Node.js
MIT License
777 stars 41 forks source link

read() fails on files with BOM when returnAs is set to json #100

Open erikmhauck opened 3 years ago

erikmhauck commented 3 years ago

code to repro (test.json must be a file with BOM at beginning of file)

const jetpack = require('fs-jetpack');
data = jetpack.read("test.json", "json")

This happens quite easily if you generate a file using "Out-File" in powershell 5 (the default in Windows)

szwacz commented 3 years ago

Does standard readFile https://nodejs.org/api/fs.html#fs_fs_readfile_path_options_callback behave differently? Because fs-jetpack with this method is just a sugar syntax.

erikmhauck commented 3 years ago

This doesn't happen in standard readFile from node because node supports utf16le as an encoding option, and doesn't have 'json' that assumes utf8 encoding.

Here is a more complete example that demonstrates this:

const jetpack = require('fs-jetpack');
const fs = require('fs');

//read a file with a BOM at beginning as utf8
let fsdata = fs.readFileSync('with_bom.json', {encoding: 'utf8'});
console.log(fsdata); //prints ��{"test": true}

//read a file with a BOM at beginning as utf16le
fsdata = fs.readFileSync('with_bom.json', {encoding: 'utf16le'});
console.log(fsdata); //prints {"test": true}

//read a file with a BOM using the 'json' option of fs-jetpack
let data = jetpack.read('with_bom.json', 'json');
console.log(data); //throws exception because BOM is not valid json

You can make "with_bom.json" like this:

echo {"test":true} >> without_bom.json
iconv -f iso-8859-1 -t utf-16le < without_bom.json > with_bom.json

Just encoding the file as utf-16le will add the BOM to the beginning

szwacz commented 3 years ago

Now I see what you mean. This "return as json" is ofcorse just a sugar syntax and it needs to assume some encoding of the file to be able to read it, so this is not a bug, just lack of feature.

Regarding introducing utf16 support I'd need to think more about it, because none of the fs-jetpack methods right now support it natively, although you still can use this encoding perfectly fine, just need to read and write buffers to file system, and encode and decode those buffers as utf16 in your code.