Closed jbvsmo closed 6 years ago
Hey @jbvsmo! Thank you for your issue.
The problem was resolved with issue #6, and a new version of the parser should be up really soon.
I think this can be closed now - my previous commit now handles BOM.
Never mind - I see Nick commented on this already.
It sure does :) I just published a new version @jbvsmo https://pypi.org/project/python-gedcom/ ✌
I don't know what the GEDCOM 5.5 format says about this, but for the sake of simplicity, and because most text editors nowadays add it by default, this code should detect and ignore a UTF-8 BOM at the start of the file.
It is super complicated to understand why the loading failed because it only says:
Line 1 of document violates GEDCOM format 5.5
and nothing more. Because these bytes are meant to be ignored, you can't see the issue on line 1 unless you load the file in Python and print a representation of said line.
One option is to use the utf-8-sig codec instead. https://docs.python.org/3/library/codecs.html#module-encodings.utf_8_sig
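For anyone landing here later, a minimal sketch of both points above: printing the repr of the first line makes the BOM visible, and decoding with utf-8-sig drops it. The sample bytes and variable names are made up for illustration; this is not the parser's actual code, which may handle the BOM differently.

```python
import codecs

# Hypothetical sample data: a GEDCOM header preceded by a UTF-8 BOM.
raw = codecs.BOM_UTF8 + b"0 HEAD\r\n1 CHAR UTF-8\r\n"

# Decoded as plain UTF-8, the BOM survives as an invisible U+FEFF character
# at the start of line 1. repr() is what makes it visible.
plain_first = raw.decode("utf-8").splitlines()[0]
print(repr(plain_first))

# The utf-8-sig codec strips a leading BOM when one is present.
sig_first = raw.decode("utf-8-sig").splitlines()[0]
print(repr(sig_first))

# Input without a BOM decodes the same way as with plain UTF-8.
no_bom = b"0 HEAD\r\n".decode("utf-8-sig")
```

When reading from disk, the same codec name can be passed to the built-in open(), e.g. open(path, encoding="utf-8-sig"), so files with and without a BOM go through one code path.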