…saving the C++ code from receiving a Node-style encoding name that it doesn’t recognize.
Our TextBuffer object acts as a bridge between TextEditor and lower-level APIs like pathwatcher and superstring. And superstring, in turn, presents a JavaScript API but implements lots of its internals in C++ for speed.
But that’s where a bug can slip in, because JavaScript and C++ use different conventional names for common text encodings… including UTF-8, the commonest of all (in our particular domain). Node likes to call it utf8, but the superstring C++ code insists upon the more formal UTF-8. There’s a utility function called normalizeEncoding within the superstring JavaScript that converts Node-style encoding names to C++-style encoding names.
I’m not going to be able to demonstrate exactly how this bug happens, but take a look at superstring/index.js and search for the string encoding. You’ll see that, aside from the aforementioned normalizeEncoding utility, three different methods take an encoding parameter in some form: TextBuffer#load, TextBuffer#save, and TextBuffer#baseTextMatchesFile.
You might also notice that the first two of those methods pass their encoding parameter through the normalizeEncoding function almost immediately — but baseTextMatchesFile doesn’t. After lots of analysis and reflection, I can’t come up with a single reason why it shouldn’t normalize the encoding; it would seem to save me from this exception I kept getting today while working on a new package:
I got to this method from text-buffer code — specifically subscribeToFile, which my new package apparently called in the course of adding an onDidSave callback to a TextEditor.
It feels like this code path should be hit quite often, and I’m sure I’ve gotten this exact error before, but not for ages; so I definitely don’t have reproduction steps here. But I can assert that hot-fixing this in a debugger (via a conditional breakpoint that simply does encoding = normalizeEncoding(encoding) on the appropriate line) prevents this error, and it makes logical sense why.
…saving the C++ code from receiving a Node-style encoding name that it doesn’t recognize.
Our
TextBuffer
object acts as a bridge betweenTextEditor
and lower-level APIs likepathwatcher
andsuperstring
. Andsuperstring
, in turn, presents a JavaScript API but implements lots of its internals in C++ for speed.But that’s where a bug can slip in, because JavaScript and C++ use different conventional names for common text encodings… including UTF-8, the commonest of all (in our particular domain). Node likes to call it
utf8
, but thesuperstring
C++ code insists upon the more formalUTF-8
. There’s a utility function callednormalizeEncoding
within thesuperstring
JavaScript that converts Node-style encoding names to C++-style encoding names.I’m not going to be able to demonstrate exactly how this bug happens, but take a look at
superstring/index.js
and search for the stringencoding
. You’ll see that, aside from the aforementionednormalizeEncoding
utility, three different methods take anencoding
parameter in some form:TextBuffer#load
,TextBuffer#save
, andTextBuffer#baseTextMatchesFile
.You might also notice that the first two of those methods pass their
encoding
parameter through thenormalizeEncoding
function almost immediately — butbaseTextMatchesFile
doesn’t. After lots of analysis and reflection, I can’t come up with a single reason why it shouldn’t normalize the encoding; it would seem to save me from this exception I kept getting today while working on a new package:I got to this method from
text-buffer
code — specificallysubscribeToFile
, which my new package apparently called in the course of adding anonDidSave
callback to aTextEditor
.It feels like this code path should be hit quite often, and I’m sure I’ve gotten this exact error before, but not for ages; so I definitely don’t have reproduction steps here. But I can assert that hot-fixing this in a debugger (via a conditional breakpoint that simply does
encoding = normalizeEncoding(encoding)
on the appropriate line) prevents this error, and it makes logical sense why.