pulsar-edit / superstring

Native core components for Pulsar
MIT License
1 stars 5 forks source link

Fix an un-normalized encoding… #9

Open savetheclocktower opened 5 months ago

savetheclocktower commented 5 months ago

…saving the C++ code from receiving a Node-style encoding name that it doesn’t recognize.


Our TextBuffer object acts as a bridge between TextEditor and lower-level APIs like pathwatcher and superstring. And superstring, in turn, presents a JavaScript API but implements lots of its internals in C++ for speed.

But that’s where a bug can slip in, because JavaScript and C++ use different conventional names for common text encodings… including UTF-8, the commonest of all (in our particular domain). Node likes to call it utf8, but the superstring C++ code insists upon the more formal UTF-8. There’s a utility function called normalizeEncoding within the superstring JavaScript that converts Node-style encoding names to C++-style encoding names.

I’m not going to be able to demonstrate exactly how this bug happens, but take a look at superstring/index.js and search for the string encoding. You’ll see that, aside from the aforementioned normalizeEncoding utility, three different methods take an encoding parameter in some form: TextBuffer#load, TextBuffer#save, and TextBuffer#baseTextMatchesFile.

You might also notice that the first two of those methods pass their encoding parameter through the normalizeEncoding function almost immediately — but baseTextMatchesFile doesn’t. After lots of analysis and reflection, I can’t come up with a single reason why it shouldn’t normalize the encoding; it would seem to save me from this exception I kept getting today while working on a new package:

Screenshot 2024-05-21 at 7 21 06 PM

I got to this method from text-buffer code — specifically subscribeToFile, which my new package apparently called in the course of adding an onDidSave callback to a TextEditor.

It feels like this code path should be hit quite often, and I’m sure I’ve gotten this exact error before, but not for ages; so I definitely don’t have reproduction steps here. But I can assert that hot-fixing this in a debugger (via a conditional breakpoint that simply does encoding = normalizeEncoding(encoding) on the appropriate line) prevents this error, and it makes logical sense why.

savetheclocktower commented 5 months ago

I am… not surprised that CI feels dusty here.