evanw / esbuild

An extremely fast bundler for the web
https://esbuild.github.io/
MIT License

Text loader doesn't remove byte order mark (BOM) #3935

Open floyd-may opened 1 day ago

floyd-may commented 1 day ago

When loading files with the text loader, the loader doesn't strip a byte order mark from the beginning of the file. For HTML files, for instance, this can cause awkward problems, such as an HTML entity for the BOM character (U+FEFF) being inserted into the DOM inadvertently. Example here:

https://esbuild.github.io/try/#YgAwLjI0LjAALS1idW5kbGUKLS1mb3JtYXQ9ZXNtCi0tb3V0ZmlsZT1vdXQuanMKLS1zb3VyY2VtYXAKLS1kcm9wLWxhYmVsczpERUJVRwotLW1pbmlmeS1pZGVudGlmaWVycwotLWxvYWRlcjouaHRtbD10ZXh0AGUAZW50cnkudHMAaW1wb3J0IGZpbGVUZXh0IGZyb20gIi4vZXhhbXBsZS5odG1sIjsKCmNvbnNvbGUubG9nKGZpbGVUZXh0KTsAAGV4YW1wbGUuaHRtbAD+u788ZGl2PmhlbGxvIHdvcmxkPC9kaXY+

Bear in mind that the example shows the text content of the HTML file as: [image]

By contrast, opening an HTML file that starts with a BOM in any reasonable text editor won't show that leading BOM.

I can work around it by ensuring that no text loader-loaded files have BOMs, but it does seem reasonable for the text loader to strip a leading BOM.
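Until the loader's behavior is settled, one way to do the stripping at build time is a small plugin, sketched below under some assumptions. The `stripBom` helper, the `stripBomTextPlugin` factory, and the plugin name are all hypothetical. The interfaces are simplified local stand-ins for the parts of esbuild's plugin API that the sketch touches (`onLoad` with a `filter`, returning `contents` and `loader`), not esbuild's real typings. The file reader is passed in so the block has no dependencies.

```typescript
// Simplified stand-ins for the parts of esbuild's plugin API used here.
// Declared locally (an assumption of this sketch) so nothing is imported.
interface OnLoadArgs {
  path: string;
}
interface OnLoadResult {
  contents: string;
  loader: "text";
}
interface PluginBuild {
  onLoad(
    options: { filter: RegExp },
    callback: (args: OnLoadArgs) => Promise<OnLoadResult>,
  ): void;
}

// Remove a single leading U+FEFF, which is what a UTF-8 BOM becomes once
// the file has been decoded to a string. Other text is returned unchanged.
function stripBom(text: string): string {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Hypothetical plugin factory. `readText` is injected by the caller and is
// expected to return the decoded contents of the file at `path`.
function stripBomTextPlugin(readText: (path: string) => Promise<string>) {
  return {
    name: "strip-bom-text",
    setup(build: PluginBuild): void {
      // Intercept .html files and hand them to the text loader BOM-free.
      build.onLoad({ filter: /\.html$/ }, async (args) => ({
        contents: stripBom(await readText(args.path)),
        loader: "text",
      }));
    },
  };
}
```

In a Node build script, `readText` could be something like `(p) => fs.promises.readFile(p, "utf8")`, with the resulting object added to the `plugins` array of the build options. This only affects files matched by the filter, so other loaders are left alone.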

floyd-may commented 1 day ago

And I'm also glad to make an attempt at a PR if the maintainer(s) agree that BOMs should be stripped by the text loader.

SinnerAir commented 1 day ago

I fixed it by simply converting the HTML file from UTF-8 with BOM to UTF-8 (without BOM).
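For anyone who would rather script that conversion than do it by hand in an editor, here is a rough sketch of the byte-level check. The function names are made up for illustration. The only fact relied on is that a UTF-8 BOM is the three bytes `EF BB BF` at the very start of the file.

```typescript
// The UTF-8 encoding of U+FEFF, i.e. the byte order mark.
const UTF8_BOM = [0xef, 0xbb, 0xbf];

// True when the buffer starts with the three BOM bytes.
function hasUtf8Bom(bytes: Uint8Array): boolean {
  return bytes.length >= 3 && UTF8_BOM.every((b, i) => bytes[i] === b);
}

// Return a view of the same bytes without the leading BOM.
// Buffers that have no BOM are returned as-is.
function withoutUtf8Bom(bytes: Uint8Array): Uint8Array {
  return hasUtf8Bom(bytes) ? bytes.subarray(3) : bytes;
}
```

In Node, the `Buffer` returned by `fs.readFileSync(path)` is a `Uint8Array`, so it can be passed straight to `withoutUtf8Bom` and the result written back with `fs.writeFileSync`.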