gavinr / github-csv-tools

Import and export GitHub issues via CSV
https://npmjs.com/github-csv-tools
MIT License
650 stars, 116 forks

fix: strip the UTF8 BOM #85

Closed pgerlach closed 1 year ago

pgerlach commented 2 years ago

The input file is read as UTF-8, and the csv-parse documentation for the bom option states: "It is recommended to always activate this option when working with UTF-8 files." (https://csv.js.org/parse/options/bom/).

This fixes the case where the file starts with a BOM. Previously the first column was not detected, because the BOM character was kept as the first character of the first column name.

If the file has no BOM, then the option does nothing.
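To illustrate the behavior described above, here is a minimal sketch of what stripping a UTF-8 BOM from already-decoded text amounts to. The `stripBom` helper is hypothetical and written only for this example; it is not the code in this PR and not how csv-parse implements its `bom` option.

```javascript
// Hypothetical helper for illustration only; not part of github-csv-tools or csv-parse.
// After UTF-8 decoding, the three BOM bytes (EF BB BF) appear as the single
// code point U+FEFF at the start of the string.
const BOM = '\uFEFF';

function stripBom(text) {
  // Remove the BOM only when it is the very first character;
  // text without a BOM is returned unchanged.
  return text.startsWith(BOM) ? text.slice(1) : text;
}

const withBom = '\uFEFFtitle,body';
const withoutBom = 'title,body';

console.log(JSON.stringify(stripBom(withBom)));    // header line with the BOM removed
console.log(JSON.stringify(stripBom(withoutBom))); // same text as the input
```

Both calls yield the same header line, which matches the point that the option has no effect on files without a BOM.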

gavinr commented 1 year ago

Thanks for this. Can you please provide an example CSV file that is currently breaking that this PR fixes?

pgerlach commented 1 year ago

Sure! This is an export from Excel, using the "CSV UTF-8" format.

csv-file-with-utf8-bom.csv

hexdump shows that it begins with the UTF-8 BOM (bytes ef bb bf).

$ hexdump -C csv-file-with-utf8-bom.csv
00000000  ef bb bf 74 69 74 6c 65  2c 62 6f 64 79 0d 0a 55  |...title,body..U|
00000010  54 46 2d 38 20 42 4f 4d  2c 68 61 6e 64 6c 65 20  |TF-8 BOM,handle |
00000020  55 54 46 2d 38 20 66 69  6c 65 73 20 77 69 74 68  |UTF-8 files with|
00000030  20 42 4f 4d                                       | BOM|
00000034

githubCsvTools can't parse it, but it can parse the same file with the BOM removed.

csv-file-without-utf8-bom.csv
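The failure can be reproduced in isolation with the header bytes from the hexdump above. This is a rough sketch: the naive `split(',')` stands in for a real CSV parser, and the variable names are made up for the example. It only shows why a lookup for a column named "title" finds nothing when the BOM is left in place.

```javascript
// Header line of the sample file, copied from the hexdump:
// ef bb bf (BOM) followed by "title,body".
const headerBytes = Buffer.from('efbbbf7469746c652c626f6479', 'hex');

// Buffer#toString('utf8') keeps the BOM as the code point U+FEFF.
const headerLine = headerBytes.toString('utf8');

// Naive column split; an assumption standing in for the real CSV parser.
const columns = headerLine.split(',');

// The first column name carries an invisible leading character,
// so it is one character longer than "title" and does not match it.
const firstColumn = columns[0];
const hasTitle = columns.includes('title');

console.log(firstColumn.length);                     // length of the first column name
console.log(firstColumn.charCodeAt(0).toString(16)); // code of its first character
console.log(hasTitle);                               // whether "title" was found
```

Because U+FEFF is invisible in most terminals and editors, the column name looks correct when printed, which makes this kind of mismatch easy to miss.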

gavinr commented 1 year ago

Thanks!

github-actions[bot] commented 1 year ago

:tada: This PR is included in version 3.1.7 :tada:

The release is available on:

Your semantic-release bot :package::rocket: