Closed lkingsford closed 4 years ago
If you can advise whether you think the BOM is likely to be useful (for instance, if the CSV was made on a different processor), then I can probably make a pull request with a fix when I get some time later this week.
This has to be solved, and it looks like there is a simple solution in Python: change the encoding used when opening the file from utf-8 to utf-8-sig.
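As a rough illustration of that suggestion (not the tool's actual code), the sketch below reads the same bytes with both codecs. The sample data and the `read_header` helper are made up for the example; the only assumption about the tool is that it reads the CSV header through Python's `csv` module or something similar.

```python
import csv
import io

# Hypothetical sample: the 3-byte UTF-8 BOM followed by a small CSV.
RAW_WITH_BOM = b"\xef\xbb\xbfName,Title\r\nAnne,Nurse\r\n"
RAW_WITHOUT_BOM = b"Name,Title\r\nAnne,Nurse\r\n"


def read_header(data, encoding):
    """Decode `data` with `encoding` and return the first CSV row."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return next(csv.reader(text))


# With plain utf-8 the BOM survives as U+FEFF inside the first header name.
header_plain = read_header(RAW_WITH_BOM, "utf-8")

# With utf-8-sig a leading BOM is dropped; files without one are unaffected.
header_sig = read_header(RAW_WITH_BOM, "utf-8-sig")
header_no_bom = read_header(RAW_WITHOUT_BOM, "utf-8-sig")

print(repr(header_plain[0]), repr(header_sig[0]), repr(header_no_bom[0]))
```

Because `utf-8-sig` also decodes files that have no BOM, switching the codec should not break CSVs written by other editors.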
A BOM in UTF-8 is of course ridiculous though, and only a cause of problems (like this one...). I have had issues in the past, in this tool or some other code I worked on, because some Microsoft application put a BOM in UTF-8 XML documents. Really, Microsoft should read the Wikipedia article on the subject, because even that knows better than they do. https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
Current Behaviour
When you create a UTF-8 CSV file with Excel, it writes a UTF-8 byte order mark (BOM) as the first 3 bytes: EF BB BF. When a BOM is present, the script reads the first header item with the BOM included in the string, and can't find the rectangle. The error is listed below. When you manually remove the BOM (for instance, by changing the encoding in VSCode), the file loads correctly, but UTF-8 characters are not read correctly ("Nurſe" in the CSV becomes "NurÅ¿e" on the card, but this is a different issue).
Expected Behaviour
The BOM is read and the byte-order of the file set accordingly. Alternatively, if other architectures are not a concern, the BOM is ignored.
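For the "ignored" option, a minimal byte-level sketch could look like the following. UTF-8 has no byte-order variants, so here the BOM only acts as a signature and can be dropped safely. The function name `strip_utf8_bom` is hypothetical and is not part of the tool.

```python
import codecs


def strip_utf8_bom(data):
    """Return `data` without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical sample bytes, as Excel might write them.
excel_bytes = b"\xef\xbb\xbfName,Title\r\n"
plain_bytes = b"Name,Title\r\n"

cleaned_excel = strip_utf8_bom(excel_bytes)
cleaned_plain = strip_utf8_bom(plain_bytes)

print(cleaned_excel == cleaned_plain)
```

This is only needed if the file is handled as raw bytes; when it is opened in text mode, choosing a BOM-aware codec achieves the same thing.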
Justification
The BOM is not uncommon, at least in the Microsoft space: it is produced by (at least) Visual Studio, Excel and Notepad, and also by Google Docs. I use Excel to edit the CSVs that hold the data for my game.
Additional information
Example error
Example CSV file
Attached
Data example.zip