Write output to Blob

BenLubar commented 6 years ago

It'd be nice if there was a way to have the output written to a Blob instead of returning a Uint8Array, because browsers are able to put a Blob on disk if it gets too big for memory.

Writing to the end of a blob is straightforward:

myBlob = new Blob([myBlob, addedBytes], {type: 'video/mp4'});

Writing to the middle is a bit more complicated, but still doable:

myBlob = new Blob([myBlob.slice(0, offset), addedBytes, myBlob.slice(offset + addedBytes.length)], {type: 'video/mp4'});
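Both cases can be folded into one helper: keep everything before the offset, add the new bytes, then keep whatever follows the overwritten range. This is only a sketch; the name `writeToBlob` and the zero padding for writes past the end are assumptions, not part of ffmpeg.js.

```javascript
// Hypothetical helper (not part of ffmpeg.js): returns a NEW Blob with
// `bytes` (a Uint8Array) written at `offset`. Blobs are immutable, so the
// original Blob is left untouched.
function writeToBlob(blob, offset, bytes, type = 'video/mp4') {
  // Everything before the write position (slice clamps to the Blob size).
  const parts = [blob.slice(0, offset)];
  if (offset > blob.size) {
    // Assumption: writing past the end pads the gap with zero bytes.
    parts.push(new Uint8Array(offset - blob.size));
  }
  // The new bytes, then whatever follows the overwritten range
  // (an empty slice when the write reaches or passes the end).
  parts.push(bytes, blob.slice(offset + bytes.length));
  return new Blob(parts, { type });
}
```

Appending is then just `writeToBlob(myBlob, myBlob.size, addedBytes)`.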

CodeFetch commented 6 years ago

Unfortunately, Blobs don't work this way. I added a write function to WORKERFS to test it and found that a Blob only references other objects such as Uint8Arrays, so data doesn't get cached to disk just by using Blobs (personally, I think browsers should actually do that).

EDIT: I just read about the 500 MB Chrome restriction for Blobs, which should have been lifted after version 57 ( https://bugs.chromium.org/p/chromium/issues/detail?id=375297 ). I tested with Chrome 66 and something still seems to be broken. Memory usage is stable (so it seems it really does cache to disk now), but I didn't manage to return a big Blob to the user: either nothing happens, the window freezes, or Chrome dies.

The only option I see at the moment for supporting bigger files is to create a new IDBFS derivative that splits files internally into chunks and synchronizes these chunks with IndexedDB, so that references to unused chunks can be freed. I think I'll give it a try. For downloading, one could use e.g. StreamSaver and reassemble the chunks into a stream after processing.
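A rough sketch of the chunking idea, for illustration only: it splits a buffer into fixed-size chunks and later replays them as a stream. The names `splitIntoChunks` and `chunksToStream` and the chunk size are hypothetical, and the IndexedDB synchronization itself is left out.

```javascript
// Hypothetical chunk size; a real implementation would pick something
// far larger (this value only keeps the example easy to follow).
const CHUNK_SIZE = 4;

// Split a Uint8Array into independent chunks. slice() copies the bytes,
// so each chunk can be stored and released on its own.
function splitIntoChunks(bytes, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let start = 0; start < bytes.length; start += chunkSize) {
    chunks.push(bytes.slice(start, start + chunkSize));
  }
  return chunks;
}

// Replay the chunks as a ReadableStream, one chunk per pull, so the
// consumer never needs the whole file in a single buffer.
function chunksToStream(chunks) {
  let next = 0;
  return new ReadableStream({
    pull(controller) {
      if (next < chunks.length) {
        controller.enqueue(chunks[next++]);
      } else {
        controller.close();
      }
    },
  });
}
```

The resulting stream could then be piped to whatever download sink the app uses, for example a writable stream from a library like StreamSaver.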

jfizz commented 6 years ago

Hi @CodeFetch, I am looking into the IDBFS solution as well. Have you made any progress?

CodeFetch commented 5 years ago

@jfizz No. The problem is that IndexedDB only has an asynchronous API, but the read/write functions need to be synchronous. It won't work with Asyncify, since as far as I know you can't return anything from an asynchronous library function. So you need to use the Emterpreter, and the whole call stack of the read/write functions needs to be emterpreted (added to the emterpretify whitelist), or ffmpeg will be unusably slow. At the moment I don't know how to properly implement asynchronous library functions. We have these pull requests as examples: https://github.com/kripken/emscripten/pull/6114 https://github.com/kripken/emscripten/pull/3714
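The mismatch can be shown in plain JavaScript. This is only an illustration: `asyncStore` is a hypothetical stand-in for an asynchronous store like IndexedDB, and `readSync` stands in for a file system read callback that must return its bytes immediately. Neither is a real Emscripten or IndexedDB API.

```javascript
// Hypothetical async store standing in for IndexedDB (not a real API).
const asyncStore = {
  chunks: new Map([[0, new Uint8Array([1, 2, 3])]]),
  get(index) {
    // Like IndexedDB, the result only becomes available later.
    return Promise.resolve(this.chunks.get(index));
  },
};

// A synchronous read callback has to return the bytes right away...
function readSync(index) {
  const result = asyncStore.get(index);
  // ...but at this point it only holds a pending Promise, not the data.
  return result;
}

const value = readSync(0);
const gotBytesSynchronously = value instanceof Uint8Array;
const gotPromise = value instanceof Promise;
console.log({ gotBytesSynchronously, gotPromise });
```

The caller ends up with a Promise instead of bytes, which is why the synchronous read/write path needs some way to pause and resume execution.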

goatandsheep commented 4 years ago

This should be handled by the app. What if you don't want it written to blob?

CodeFetch commented 4 years ago

@goatandsheep Blobs are the "correct" way to store big binary data in JavaScript.

The whole reason for this issue was the idea:

"because browsers are able to put a Blob on disk if it gets too big for memory."

On some devices the memory for JavaScript is very limited and ffmpeg.js easily runs out of memory. Ben thought blobs would swap the data to disk, but they don't. Happy new year!

goatandsheep commented 4 years ago

For sure! I'm just wondering whether it needs to be done in this library specifically. That's a really good point re: memory! I was actually considering trying the library in Node, which doesn't have Blob support.

Kagami commented 4 years ago

MEMFS stores all its contents in memory. I don't think you can somehow swap it to disk. If your output file is too big to be stored in memory, that indeed might be a problem. See related issue #68 in that case.

Kagami / ffmpeg.js

Write output to Blob #54