Returns the following unexpected errors when a document starts with a UTF BOM (signature):
1:1: ERROR: Expected a doctype token
<!DOCTYPE html>
^
1:2: ERROR: This is not a legal doctype
<!DOCTYPE html>
^
Expected behaviour: Check the first bytes of the document for a BOM byte sequence. If one is found, set the document encoding to the encoding it indicates (e.g. UTF-8 or UTF-16 LE), strip the BOM sequence, and proceed with parsing the document as normal.
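To illustrate the expected behaviour, here is a minimal sketch in Python. It is not the project's code: the names `sniff_bom` and `decode_document` are hypothetical, and the fallback to UTF-8 when no BOM is present is an assumption made only for this example.

```python
import codecs

# BOM byte sequences and the encodings they indicate.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
)


def sniff_bom(data: bytes, fallback: str = "utf-8"):
    """Return (encoding, data without BOM).

    `fallback` is an assumption for this sketch; a real parser would
    continue with its usual encoding detection instead.
    """
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return fallback, data


def decode_document(data: bytes, fallback: str = "utf-8") -> str:
    """Detect the BOM, strip it, and decode the rest of the document."""
    encoding, body = sniff_bom(data, fallback)
    return body.decode(encoding, errors="replace")


if __name__ == "__main__":
    doctype = "<!DOCTYPE html>"
    samples = {
        "UTF-8": codecs.BOM_UTF8 + doctype.encode("utf-8"),
        "UTF-16 BE": codecs.BOM_UTF16_BE + doctype.encode("utf-16-be"),
        "UTF-16 LE": codecs.BOM_UTF16_LE + doctype.encode("utf-16-le"),
    }
    for name, raw in samples.items():
        # After stripping, the first character seen by the tokenizer is "<".
        print(name, repr(decode_document(raw)))
```

With the BOM stripped before tokenization, the first token the parser sees is the doctype itself, so neither of the errors above should be raised.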
https://encoding.spec.whatwg.org/#decode https://html.spec.whatwg.org/#writing
Some test cases:
UTF-8 signature mark:
UTF-16 (BE) byte-order-mark:
UTF-16 (LE) byte-order-mark: