Text loader doesn't remove byte order mark (BOM)

floyd-may commented 1 day ago

When loading files using the text loader, the loader doesn't strip byte order marks from the beginning of the file. For HTML files, for instance, this can turn into awkward problems like having an HTML entity like  inserted into the DOM inadvertently. Example here:

https://esbuild.github.io/try/#YgAwLjI0LjAALS1idW5kbGUKLS1mb3JtYXQ9ZXNtCi0tb3V0ZmlsZT1vdXQuanMKLS1zb3VyY2VtYXAKLS1kcm9wLWxhYmVsczpERUJVRwotLW1pbmlmeS1pZGVudGlmaWVycwotLWxvYWRlcjouaHRtbD10ZXh0AGUAZW50cnkudHMAaW1wb3J0IGZpbGVUZXh0IGZyb20gIi4vZXhhbXBsZS5odG1sIjsKCmNvbnNvbGUubG9nKGZpbGVUZXh0KTsAAGV4YW1wbGUuaHRtbAD+u788ZGl2PmhlbGxvIHdvcmxkPC9kaXY+

Bear in mind the example shows the text content of the HTML file as:

Whereas loading an HTML file with a BOM at the beginning in any reasonable text editor won't show that leading BOM.

I can work around it by ensuring that no text loader-loaded files have BOMs, but it does seem reasonable for the text loader to strip a leading BOM.

floyd-may commented 1 day ago

And I'm also glad to make an attempt at a PR if the maintainer(s) agree that BOMs should be stripped by the text loader.

SinnerAir commented 1 day ago

I fixed it by simply converting the html file from UTF-8 BOM to UTF-8 (without BOM)

evanw / esbuild

Text loader doesn't remove byte order mark (BOM) #3935