Closed dobesv closed 4 months ago
Based on the docs, passing the encoding
option changes the output from buffer to string. Can you better describe what has changed in node 20.12 ?
To be honest I don't know what exactly they changed in node v20.12. I couldn't figure it out from their release notes.
However, by switching versions and running in the debugger I found that in the earlier version, the encoding
parameter to this function would be set to utf-8
but in node v20.12 and later the encoding
was undefined
.
I suppose that seems right, in a way, since the stream is in object mode there's no reason for it to have an encoding
.
If you look a few lines up from my change you can see that content
is expected to be a string
here, e.g.
content = file.contents.toString('utf8')
When it is not a string, I get errors in the typescript parser:
s.codePointAt is not a function
TypeError: s.codePointAt is not a function
at codePointAt (xxx/.yarn/cache/typescript-patch-32ada147aa-5659316360.zip/node_modules/typescript/lib/typescript.js:12414:81)
at Object.scan (xxx/.yarn/cache/typescript-patch-32ada147aa-5659316360.zip/node_modules/typescript/lib/typescript.js:11509:26)
That's because s
was passed as a Buffer
instead of a string
, and Buffer
doesn't have a codePointAt
method on it. By ensuring that we always provide an encoding, we ensure that s
is a string
and thus it will implement the interface expected by the TypeScript parser.
As I mentioned in the description I suspect the change maybe should be:
content = fs.readFileSync(file.path, 'utf8')
I'm not sure why exactly the incoming encoding is relevant to the source file encoding. But maybe there's some use case for that, like in gulp or something. I don't know.
This is now part of 9.0.1
. Thanks!
The encoding set on the transform is technically supposed to be used for the chunks going through the transform itself, and it's not necessarily set for a Transform, especially in object mode. This falls back on
utf-8
as the encoding when loading a file from disk.I suspect this should just always use
utf-8
always but I can't say for sure there isn't a use case for using theencoding
parameter.