w3c / FileAPI

File API
https://w3c.github.io/FileAPI/

Why are Blob() and File() constructors at Chromium 81 throwing RangeError? #147

Closed guest271314 closed 4 years ago

guest271314 commented 4 years ago

The specification does not mention RangeError once. Yet while testing video.requestAnimationFrame() I found that a RangeError is consistently thrown when images are converted to Blobs and then to Uint8Arrays (with createImageBitmap() and OffscreenCanvas.convertToBlob()), stored in an Array, and that Array is passed to the Blob() or File() constructor:

RangeError: Invalid string length
            Array.join (<anonymous>)
            Array.toString (<anonymous>)

plnkr to demonstrate https://plnkr.co/edit/NVTsnAjSTf0h1qYy?preview, Chromium bug https://bugs.chromium.org/p/chromium/issues/detail?id=1063681.

This is the first time I have observed such behaviour when using the File API. What is the cause of the RangeError being thrown in this case?

inexorabletash commented 4 years ago

A quick glance at your code in the crbug - are you accidentally passing an array of arrays in to the constructor, rather than just an array? That might be triggering an implicit string conversion in the bindings code.

(On my phone so can't try it out right now. Apologies if I misread.)

guest271314 commented 4 years ago

An Array containing Uint8Arrays is being passed to the constructor, e.g.,

const arr = [];
arr.push(new Uint8Array([1]), new Uint8Array([2]));
const blob =  new Blob([arr]);
const file = new File([arr], 'file');
blob.stream().getReader().read().then(({value, done}) => {
  console.assert(value.length === 3, value);
  console.log(new TextDecoder().decode(value)); // "1,2"
});
file.stream().getReader().read().then(({value, done}) => {
  console.assert(value.length === 3, value);
  console.log(value); // [49, 44, 50]
});

However, in the actual code the input Uint8Arrays are far longer than 3 bytes in total. The values are derived from OffscreenCanvas.convertToBlob({type:"image/webp"}); Blob.arrayBuffer() is then called and the result is passed to Uint8Array(). Those individual Uint8Arrays are push()ed into a plain JavaScript Array, which is then passed to the Blob() and File() constructors, in each case wrapped in a further array, i.e. as [result<Array>]. Have not observed that particular behaviour in the past when using the File or Blob constructors. The result should be consistent with the MediaRecorder.ondataavailable event being fired multiple times (or any other concatenation of Array, TypedArray, Blob or File), with each event.data<Blob> push()ed to an Array and that array passed to the Blob constructor.
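
A minimal sketch of the difference, assuming an environment with a global Blob (a browser, or a recent Node.js). The names chunks, flat and nested are made up for illustration. Passing the array of Uint8Arrays directly concatenates the bytes, while wrapping it in another array turns the inner Array into a single part that is converted to a string:

```javascript
// Hypothetical stand-ins for the Uint8Arrays produced from convertToBlob().
const chunks = [new Uint8Array([1, 2, 3]), new Uint8Array([4, 5])];

// Each Uint8Array is a BufferSource part: the bytes are concatenated.
const flat = new Blob(chunks);

// The inner Array is neither a BufferSource nor a Blob, so it is
// stringified as a whole: "1,2,3,4,5".
const nested = new Blob([chunks]);

console.log(flat.size);   // 5 bytes
console.log(nested.size); // 9 bytes, the length of "1,2,3,4,5"
```

With real image data the stringified form is several times larger than the raw bytes, which is where a string length limit could come into play.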

Tested at plnkr, jsfiddle and at file: protocol before filing the issue to make sure the error was not based on the code that implements the online editor.

Have not previously encountered this particular bug re File API.

guest271314 commented 4 years ago

Randomly picked Uint8Array lengths from the input Array having length 660:

570404 361196

Interestingly, when running the code below with the .then() that returns new Uint8Array(<ArrayBuffer>) commented out (to check whether the length of the resulting Uint8Arrays is the cause, given how they are merged in the code in the previous comment), Promise.all() does not appear to ever settle.

      .then(blob => {
        return blob.arrayBuffer();
      })
      /*
      .then(ab => {
        return new Uint8Array(ab);
      })
      */

guest271314 commented 4 years ago

A quick glance at your code in the crbug - are you accidentally passing an array of arrays in to the constructor, rather than just an array? That might be triggering an implicit string conversion in the bindings code.

Your observation appears to be correct for Blob(), where an Array of Promises is passed to Promise.all()

const arr = [];
function doStuff() {
  arr.push(new Promise(resolve => resolve('value')));
}
Promise.all(arr).then(result => {
  let blob, file;
  try {
    blob = new Blob(result); // passed without the wrapping `[]`
    console.log({blob}); // blob is logged
  } catch (e) {
    console.error(e);
  }
});

Initially loaded the plnkr with automatic refresh enabled. To rule that out as the reason for Promise.all() not settling, tried again. The result, with the Array containing ArrayBuffers rather than Uint8Arrays and using File, is that {file} is not logged at all

const arr = [];
function doStuff() {
  arr.push(new Promise(resolve => resolve('value')));
}
// never settled?
Promise.all(arr).then(result => {
  let blob, file;
  try {
    // never logged
    file = new File([result], 'file'); // File requires `[]` within constructor
    console.log({file}); // file is not logged
  } catch (e) {
    console.error(e);
  }
});

guest271314 commented 4 years ago

@inexorabletash

A quick glance at your code in the crbug - are you accidentally passing an array of arrays in to the constructor, rather than just an array? That might be triggering an implicit string conversion in the bindings code.

That is it regarding the File API issue.

Did the File() constructor not once mandate that [] be explicitly passed?

It is now apparently possible to do

var arr = [1];
var file = new File(arr, 'fileName');

which, if I recollect correctly, was not always possible?

Still, if the File or Blob constructor has the potential to throw a RangeError, should that be mentioned in the specification?

Or, should this issue be closed?

inexorabletash commented 4 years ago

Passing something convertible to a sequence (like an Array) was always required. Whether you do that with a literal array constructor or a variable is a language issue, outside the domain of the spec.

It looks like the issue was the browser's JS implementation rejecting a conversion from array to string due to the length of the string. You'd need to check the ECMAScript spec or WebIDL spec to see if throwing there is defined (or not forbidden). It happens "before" the spec algorithms themselves are reached, so similarly is not something the spec should mention.
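
To get a feel for why that conversion can fail on large inputs, here is a rough sketch that computes the length of the string that joining a Uint8Array with commas would produce, without building the string. The helper name joinedLength is hypothetical, and the real maximum string length is engine-specific (ECMAScript only caps it at 2^53 - 1; engines use far smaller limits). The lower bound assumes, purely for illustration, that all 660 arrays are as long as the smaller of the two sampled lengths in this thread.

```javascript
// Length of `bytes.join(',')`, computed without allocating the string.
function joinedLength(bytes) {
  let length = Math.max(bytes.length - 1, 0); // one comma between elements
  for (const b of bytes) {
    length += b < 10 ? 1 : b < 100 ? 2 : 3;   // decimal digits of 0..255
  }
  return length;
}

const sample = new Uint8Array([0, 9, 10, 99, 100, 255]);
console.log(joinedLength(sample) === sample.join(',').length); // true

// Assumption: 660 arrays of 361196 bytes each, and roughly two characters
// (one digit plus one comma) per byte as a lower bound.
const lowerBound = 660 * 361196 * 2;
console.log(lowerBound); // 476778720
```

If the real figure exceeds the engine's limit, the join would be expected to throw RangeError: Invalid string length before any File API algorithm runs.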

guest271314 commented 4 years ago

Some observations when a Blob is passed as the innermost element of an array of arrays:

var MY_JSON_FILE = [new Blob([1])];

var blob = new Blob(MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob);

logs the string

1

var MY_JSON_FILE = [[new  Blob([1])]];

var blob = new Blob(MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob);

and

var MY_JSON_FILE = [[new  Blob([1])]];

var blob = new Blob(MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(new TextDecoder().decode(new Uint8Array(e.target.result)))
});

fr.readAsArrayBuffer(blob);

logs string

[object Blob]

var MY_JSON_FILE = [[[`{
  "hello": "world"
}`]]];

var blob = new Blob([[MY_JSON_FILE]]);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob);

and

var MY_JSON_FILE = [`{
  "hello": "world"
}`];

var blob = new Blob(MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob);

logs

{
  "hello": "world"
}

for an array of arrays

var blob = new Blob(..."1");

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(new TextDecoder().decode(new Uint8Array(e.target.result)))
});

fr.readAsArrayBuffer(blob);

throws

Uncaught TypeError: Failed to construct 'Blob': The provided value cannot be converted to a sequence.

var MY_JSON_FILE = [new  Blob([1])];

var blob = new Blob(...MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(new TextDecoder().decode(new Uint8Array(e.target.result)))
});

fr.readAsArrayBuffer(blob);

throws

Uncaught TypeError: Failed to construct 'Blob': The object must have a callable @@iterator property.
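
Both TypeErrors follow from what spread does at the call site: new Blob(..."1") is the same call as new Blob("1"), and new Blob(...[blob]) is the same call as new Blob(blob), so the first argument is no longer an array. A small sketch, assuming a global Blob; the helper tryBlob is made up for illustration:

```javascript
// Returns the name of the error thrown by `new Blob(arg)`, or "ok".
function tryBlob(arg) {
  try {
    new Blob(arg);
    return 'ok';
  } catch (e) {
    return e.name;
  }
}

const inner = new Blob(['1']);

console.log(tryBlob('1'));     // a string primitive is not a sequence
console.log(tryBlob(inner));   // a Blob has no @@iterator
console.log(tryBlob([inner])); // an array of parts is accepted
```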

var MY_JSON_FILE = [[new  Blob([1]), new Blob([2])]];

var blob = new Blob(MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob);

logs

[object Blob],[object Blob]

In this case, for an array of arrays, one solution is spread syntax, per https://stackoverflow.com/a/37152508

fun(a, b, ...c): This construct doesn't actually have a name in the spec. But it works very similar as spread elements do: It expands an iterable into the list of arguments. It would be equivalent to func.apply(null, [a, b].concat(c)).

where

var MY_JSON_FILE = [[new  Blob([1]), new Blob([2])]];

var blob = new Blob(...MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob)

logs

12

If arrays nested more than two levels deep are not flattened first, for example with a helper such as

var flatten  = arr => {
  arr = arr.flat();
  return arr.find(a => Array.isArray(a)) ? flatten(arr) : arr
}

var MY_JSON_FILE = [[[new  Blob([1]), new Blob([2])]], 3, ["abc"]];

var blob = new Blob(flatten(MY_JSON_FILE));

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(new TextDecoder().decode(new Uint8Array(e.target.result)))
});

fr.readAsArrayBuffer(blob);

var MY_JSON_FILE = [[[new  Blob([1]), new Blob([2])]]];

var blob = new Blob(...MY_JSON_FILE);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(new TextDecoder().decode(new Uint8Array(e.target.result)))
});

fr.readAsArrayBuffer(blob);

the output is

[object Blob],[object Blob]

again.
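
For arbitrary nesting depth, Array.prototype.flat(Infinity) does the same job as the recursive flatten helper above. A sketch, assuming a global Blob; the variable names are illustrative only:

```javascript
const nestedParts = [[[new Blob(['1']), new Blob(['2'])]], 3, ['abc']];

// Flatten every level so each Blob, number and string is its own part.
const flatParts = nestedParts.flat(Infinity);

const fromFlat = new Blob(flatParts);
// Without flattening, each inner array is stringified as one part.
const fromNested = new Blob(nestedParts);

console.log(flatParts.length); // 4 parts
console.log(fromFlat.size);    // 6 bytes: "1" + "2" + "3" + "abc"
console.log(fromNested.size > fromFlat.size); // true
```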

guest271314 commented 4 years ago

@inexorabletash

Another example where using [] or not has a substantial impact on the result, even where the input is a TypedArray

var floats = new Float32Array([0.00005549501292989589, 0.00006459458381868899, 0.000058644378441385925, 0.00006201512587722391]);
var file = new Blob(floats); // pass TypedArray
file.arrayBuffer().then(b =>  console.log(new Float32Array(b))).catch(console.error);
// Chromium error message
RangeError: byte length of Float32Array should be a multiple of 4
    at new Float32Array (<anonymous>)
// Nightly error message
RangeError: "attempting to construct out-of-bounds TypedArray on ArrayBuffer"
// why?
file.text().then(console.log).catch(console.error);
Promise {<pending>}
0.000055495012929895890.000064594583818688990.0000586443784413859250.00006201512587722391

compare

var floats = new Float32Array([0.00005549501292989589, 0.00006459458381868899, 0.000058644378441385925, 0.00006201512587722391]);
var file = new Blob([floats]); // pass TypedArray within []
file.arrayBuffer().then(b =>  console.log(new Float32Array(b))).catch(console.error);
Promise {<pending>}
Float32Array(4) [0.00005549501292989589, 0.00006459458381868899, 0.000058644378441385925, 0.00006201512587722391]
// why?
file.text().then(console.log).catch(console.error);
Promise {<pending>}
Q�h8�v�8��u8�8

In this case the File API specification refers to

4.2. BufferSource

typedef (ArrayBufferView or ArrayBuffer) BufferSource;

The BufferSource typedef is used to represent objects that are either themselves an ArrayBuffer or which provide a view on to an ArrayBuffer.

Emphasis added above beginning at "or".

Is a TypedArray already a sequence? Why is a TypedArray being converted to a string when [] is omitted from Blob constructor at new Blob(TypedArray)?
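
One way to look at this: the constructor's first argument is a sequence of parts, and a TypedArray is iterable, so new Blob(floats) treats each number as a separate part and converts it to a string, whereas new Blob([floats]) has a single BufferSource part. A sketch, assuming a global Blob and reusing the values from the example above:

```javascript
const floats = new Float32Array([
  0.00005549501292989589, 0.00006459458381868899,
  0.000058644378441385925, 0.00006201512587722391,
]);

// A TypedArray is iterable, so it can itself serve as the sequence of parts.
console.log(typeof floats[Symbol.iterator]); // "function"

// Each element becomes its own string part, as if by String(element).
const asElements = new Blob(floats);
const expectedText = Array.from(floats, String).join('');
console.log(asElements.size === expectedText.length); // true

// Wrapped in an array, the TypedArray is one BufferSource part: raw bytes.
const asBytes = new Blob([floats]);
console.log(asBytes.size === floats.byteLength); // true, 16 bytes
```

Since the length of the concatenated text need not be a multiple of 4, constructing a Float32Array over the first Blob's buffer can throw the RangeError shown above.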