Support for Browser's File Object (i.e. ArrayBuffers/Uint8Array)

danielberndt commented 1 year ago

Hi there, I'm working in a browser-only context and would like to check whether an uploaded file is a text file or not. It seems, though, that the Buffer argument doesn't work with the browser-based ArrayBuffer or Uint8Array (probably because buffer.toString is called with additional parameters). I could add a Buffer polyfill to my client code, but it would be nice if the native browser types were supported as well.

danielberndt commented 1 year ago

I just experimented with what needs to be done to use getEncoding with a Uint8Array instead of Node's Buffer. The only change I needed to make was this:

- const contentChunkUTF8 = buffer.toString(textEncoding, chunkBegin, chunkEnd)
+ const contentChunkUTF8 = new TextDecoder().decode(buffer.subarray(chunkBegin, chunkEnd));

Once this change was made, a browser could read a file's content like this:

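const handleChange = (e) => {
  // Take the first file selected in the <input type="file"> element.
  const file = e.target.files[0];
  const reader = new FileReader();
  reader.onload = () => {
    // reader.result is an ArrayBuffer; wrap it in a Uint8Array for getEncoding.
    console.log(getEncoding(new Uint8Array(reader.result)));
  };
  reader.readAsArrayBuffer(file);
};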

return <input type="file" onChange={handleChange}/>

I'm a bit hesitant to turn this into a PR as it's a fairly big architectural decision on how to support those two types in parallel.
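
One way the two types might be handled by a single code path is sketched below. It relies on the fact that Node's Buffer is a subclass of Uint8Array, so `subarray` and `TextDecoder` accept both. The helper name `decodeChunk` and its `encoding` parameter are hypothetical and not part of istextorbinary; encoding labels accepted by `TextDecoder` may also differ from those accepted by `buffer.toString`, so treat this as a sketch rather than a drop-in replacement.

```javascript
// Hypothetical helper (not part of istextorbinary): decode a byte range as text.
// Accepts a Uint8Array; a Node.js Buffer also works because Buffer extends Uint8Array.
function decodeChunk(bytes, chunkBegin, chunkEnd, encoding = 'utf-8') {
  // subarray() returns a view over the same memory, so no bytes are copied.
  const view = bytes.subarray(chunkBegin, chunkEnd);
  // TextDecoder is a global in browsers and in recent Node.js versions.
  return new TextDecoder(encoding).decode(view);
}

// Browser-style input: a plain Uint8Array holding the bytes of "hello".
const bytes = new Uint8Array([104, 101, 108, 108, 111]);
console.log(decodeChunk(bytes, 1, 4)); // "ell"
```

In Node.js the same call could be made with a Buffer, for example `decodeChunk(Buffer.from('hello world'), 6, 11)`, without a separate branch for each type.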

balupton commented 11 months ago

It seems TextDecoder has been available in Node.js since v8 (as util.TextDecoder): https://nodejs.org/api/util.html#class-utiltextdecoder

https://github.com/bevry/boundation does give us the ability to specify, say, a browser or worker entry, in case we need to load a different file for the package in that environment. This is used by: https://github.com/bevry/start-of-week/tree/master/source

bevry / istextorbinary

Support for Browser's File Object (i.e. ArrayBuffers/Uint8Array) #292