redoPop / SublimeGremlins

Reveal odd and invisible whitespace characters in Sublime Text.
Byte Order Mark, UTF-8 #14

Open AndydeCleyre opened 7 months ago

AndydeCleyre commented 7 months ago

Hello, and thanks for this!

I don't think the byte order mark (BOM) is currently handled, at least in its UTF-8 form.

From Wikipedia:

The UTF-8 representation of the BOM is the (hexadecimal) byte sequence EF BB BF.

The file I encountered this with is on GitHub at swlaschin/pipeline_oriented_programming_talk:fsharpdemo/CalculationExample.fsx

$ head -1 CalculationExample.fsx | cat -v
M-oM-;M-?// using a pipe when functions have exactly one parameter
$ file CalculationExample.fsx
CalculationExample.fsx: Unicode text, UTF-8 (with BOM) text
$ hd CalculationExample.fsx | head -1
00000000  ef bb bf 2f 2f 20 75 73  69 6e 67 20 61 20 70 69  |...// using a pi|
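
For reference, the check described above can be sketched in Python. This is only a minimal illustration, not the plugin's actual code: the helper names `starts_with_utf8_bom` and `find_bom_chars` are hypothetical, and the sample bytes are copied from the start of the hex dump above. Whether a plugin ever sees the character in the buffer depends on how the editor decodes the file, which this sketch does not cover.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the raw bytes begin with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def find_bom_chars(text: str) -> list:
    """Return the indices of every U+FEFF character in already-decoded text."""
    return [i for i, ch in enumerate(text) if ch == "\ufeff"]


# First bytes of the hex dump above: ef bb bf 2f 2f 20 75 73 69 6e 67
sample = bytes.fromhex("efbbbf2f2f207573696e67")

print(starts_with_utf8_bom(sample))  # True

# Decoding as plain "utf-8" keeps the BOM as a leading U+FEFF character.
decoded = sample.decode("utf-8")
print(find_bom_chars(decoded))  # [0]
```

Decoding with the `utf-8-sig` codec instead of `utf-8` strips a leading BOM, so `find_bom_chars` would return an empty list for the same bytes.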