gpac / mp4box.js

JavaScript version of GPAC's MP4Box tool
https://gpac.github.io/mp4box.js/
BSD 3-Clause "New" or "Revised" License

Modification of MP4Box.js development policy #304

Closed 1kilogram closed 1 year ago

1kilogram commented 1 year ago

Modification of MP4Box.js development policy

The MP4Box.js library should be able to work with binary media data (e.g. after segmentation as described in the README) without using the following time-consuming browser APIs: the WebCodecs API, the MediaRecorder API, and the Media Source Extensions API (never MSE). This development policy is necessary because of a commercial mistake by Apple Inc. and the cretinism (this is not a nickname, but a diagnosis) of Apple Inc. executives.

The main problem is that Apple, to protect its App Store application market, programmatically blocks some features of various APIs, deliberately breaking W3C specifications.

Take the great example that led me to ffmpeg.js, and later to MP4Box.js:

So we have the Web Audio API, which is available from iOS 6+, Chrome 14+, and Firefox 25+, that is, on devices from 2011-2013 onwards (with updates). The idea is to be able to send voice messages without using the getUserMedia API (available only from iOS 11+, Chrome 53+, and Firefox 36+, that is, starting with 2017 releases/updates).

How to do it:

  1. record/load video from the device (using input type=file [capture=user {for mobile cameras}])
  2. separate audio from video
  3. send

According to the Web Audio API specification this is quite a simple procedure, done with the function audioContext.decodeAudioData(). This works everywhere except iOS 16+, and never will, because of the cretinism of Apple Inc. management. I'm not saying Apple's management is all bad, as there are things the Chrome and Firefox teams could learn from it: for example, unlike Google, Apple follows backward compatibility principles.

For example, Application Cache (iOS 3.2+) is still available on modern iOS 16+ devices, and not only over HTTPS but also over plain HTTP! This is very convenient, for example, for artists, who have nothing to hide and no time for programming (add the file "map-files.cache" and everything works).

So, audioContext.decodeAudioData() on iOS is programmatically blocked, and only for video files. Safari handles audio files normally, just like Chrome or Firefox.
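The decodeAudioData() call discussed above can be sketched roughly as follows. This is a minimal illustration, not code from the original tests: the helper names readAsArrayBuffer and decodeAudio are hypothetical, and the callback form of decodeAudioData() is used on the assumption that older Safari versions need it.

```javascript
// Read a File/Blob (e.g. from <input type="file">) into an ArrayBuffer.
// FileReader is used because the post targets older browsers.
function readAsArrayBuffer(file) {
  return new Promise(function (resolve, reject) {
    var reader = new FileReader();
    reader.onload = function () { resolve(reader.result); };
    reader.onerror = function () { reject(reader.error); };
    reader.readAsArrayBuffer(file);
  });
}

// Wrap the callback form of decodeAudioData() in a Promise.
// `audioContext` is any object exposing decodeAudioData(buffer, ok, fail).
function decodeAudio(audioContext, arrayBuffer) {
  return new Promise(function (resolve, reject) {
    var maybePromise = audioContext.decodeAudioData(arrayBuffer, resolve, reject);
    // Newer browsers also return a Promise; the callbacks already handle
    // the failure, so swallow the duplicate rejection.
    if (maybePromise && typeof maybePromise.catch === "function") {
      maybePromise.catch(function () {});
    }
  });
}

// Browser usage (assumption: `input` is an <input type="file"> element):
//   var Ctx = window.AudioContext || window.webkitAudioContext;
//   readAsArrayBuffer(input.files[0])
//     .then(function (buf) { return decodeAudio(new Ctx(), buf); })
//     .then(function (audioBuffer) { /* PCM data is in audioBuffer */ })
//     .catch(function (err) { /* the path taken for video files on iOS */ });
```

If the decode fails, as described for video files on iOS, the returned Promise rejects and the caller can fall back to another approach.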

We look for a workaround and find one: audioContext.createScriptProcessor(), which immediately proved itself in the first tests by recording an oscillator (audioContext.createOscillator()) via an OfflineAudioContext. And it all worked on iOS! But when trying to record the audio from a video, nothing happens.

I'm sure there's a trigger on iOS that blocks this kind of operation on video. Otherwise it would have opened up a huge opportunity: a lot of apps would not have needed the App Store at all, as they would have been freely available on the web for all devices long ago.

Brilliant solution: FFmpeg.js and MP4Box.js

The Web Audio API study took place between October 22 and October 31 of this year. Then, on November 6, I came across ffmpeg.js. Imagine my surprise when I experimented with ffmpeg.js and found that separating the audio track from a 10-minute Full HD video takes less than 2 seconds!
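A timing like this is what one would expect when the audio stream is copied rather than re-encoded. As a hedged sketch, the ffmpeg arguments for such a copy might be built like this; the helper name audioCopyArgs and the file names are hypothetical, and how the array is handed to ffmpeg depends on the particular ffmpeg.js build, which the post does not specify.

```javascript
// Build ffmpeg arguments that drop the video stream and copy the audio
// stream unchanged (no re-encoding) into the output container.
function audioCopyArgs(inputName, outputName) {
  return [
    "-i", inputName, // input media file
    "-vn",           // discard the video stream
    "-c:a", "copy",  // copy the audio stream as-is
    outputName       // container is chosen from the file extension
  ];
}

// Example: arguments for extracting audio from a camera recording.
var args = audioCopyArgs("clip.mov", "voice.mp4");
```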

Yes, we take a detour by saving the audio in an .mp4 container, but this approach has advantages:

  1. The audio in the .mp4 container is already compressed, so it needs no transcoding to MP3, and the size difference between the two is negligible.
  2. Any browser, even on iOS, which does not read every kind of audio, can read the .mp4 container (iOS 4+, Chrome 4+, Firefox 3.5+).

This is the solution. A brilliant solution, but there is one catch: the ffmpeg.js library is 26 MB and takes a very, very long time to download over a 3G connection. It is a powerful tool that needs a lot of memory, and although it loads even on iOS 9+, it cannot do its job there, as Safari simply crashes on the 26 MB file! On iOS 11+, however, everything works fine. With a Web Worker you can terminate the worker, which frees the memory once the user no longer needs the ffmpeg.js features.
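The load-on-demand and unload pattern mentioned above could look roughly like this. It is only a sketch: createDisposableWorker and the script name are hypothetical, and the optional workerFactory parameter exists purely so the logic can be exercised outside a browser.

```javascript
// Create a worker lazily and allow it to be terminated to release memory.
// By default the browser's Worker constructor is used.
function createDisposableWorker(scriptUrl, workerFactory) {
  var make = workerFactory || function (url) { return new Worker(url); };
  var worker = null;
  return {
    // Start the worker on first use and reuse it afterwards.
    get: function () {
      if (!worker) { worker = make(scriptUrl); }
      return worker;
    },
    // Terminate the worker so the browser can free its memory.
    dispose: function () {
      if (worker) {
        worker.terminate();
        worker = null;
      }
    },
    isLoaded: function () { return worker !== null; }
  };
}

// Browser usage (assumption: "ffmpeg-worker.js" wraps the ffmpeg.js build):
//   var ffmpegWorker = createDisposableWorker("ffmpeg-worker.js");
//   ffmpegWorker.get().postMessage({ /* job description */ });
//   ffmpegWorker.dispose(); // when the user is done
```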

Let's look at MSE (the Media Source Extensions API). Yes, where Google's policy is followed, everything works fine, but the cretins (a diagnosis, not name-calling!) at Apple have distinguished themselves again. Quote:

"Fully supported only in iPadOS, 13 and later."

The whole range of elementary tasks that iOS handles in ffmpeg.js in minutes, or literally seconds if the solution is designed intelligently, is simply not available through the MSE API on iOS! How is that possible? Obviously, this is a commercial mistake by Apple Inc., but one that can be bypassed after all...

And the best option is the MP4Box.js library, which I came across by accident only on November 10! MP4Box.js has huge advantages over ffmpeg.js:

  1. the small size of the full library, and...
  2. ...the high speed of preprocessing by the system

Regarding iOS specifics: translating a .MOV container to an .MP4 container is a lightweight procedure that takes less than 2 KB of code and needs only the FileReader API, Blob, and Uint8Array (iOS 6+). This step is specific to iOS and does not apply to other systems: Android, for example, usually records camera video to an .mp4 file, which needs no translation. In this case we assume that the media file from which the video is to be removed and the audio kept is in a .mov or .mp4 container.
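The translation code itself is not shown in this post. As an illustration of why Uint8Array is enough for this kind of work, here is a minimal sketch that only inspects a file: it lists the top-level boxes (called atoms in QuickTime) and reads the major brand from the ftyp box, which is "qt  " for QuickTime files. The function names are hypothetical and this is not the conversion procedure described above.

```javascript
// Read a big-endian unsigned 32-bit integer from a Uint8Array.
function readUint32(bytes, offset) {
  return bytes[offset] * 16777216 +
    ((bytes[offset + 1] << 16) | (bytes[offset + 2] << 8) | bytes[offset + 3]);
}

// Read a four-character code such as "ftyp" or "moov".
function readFourCC(bytes, offset) {
  return String.fromCharCode(bytes[offset], bytes[offset + 1],
    bytes[offset + 2], bytes[offset + 3]);
}

// List the top-level boxes of an MP4/MOV file held in a Uint8Array.
function listTopLevelBoxes(bytes) {
  var boxes = [];
  var offset = 0;
  while (offset + 8 <= bytes.length) {
    var size = readUint32(bytes, offset);
    var type = readFourCC(bytes, offset + 4);
    var headerSize = 8;
    if (size === 1) {
      // A size of 1 means a 64-bit size follows the type field.
      if (offset + 16 > bytes.length) { break; }
      size = readUint32(bytes, offset + 8) * 4294967296 + readUint32(bytes, offset + 12);
      headerSize = 16;
    } else if (size === 0) {
      // A size of 0 means the box extends to the end of the file.
      size = bytes.length - offset;
    }
    if (size < headerSize) { break; } // malformed box, stop scanning
    boxes.push({ type: type, offset: offset, size: size, headerSize: headerSize });
    offset += size;
  }
  return boxes;
}

// Return the major brand from a leading ftyp box, or null if there is none.
function majorBrand(bytes) {
  var first = listTopLevelBoxes(bytes)[0];
  if (!first || first.type !== "ftyp" || first.size < first.headerSize + 4) {
    return null;
  }
  return readFourCC(bytes, first.offset + first.headerSize);
}
```

In a browser the Uint8Array would come from a FileReader result, e.g. new Uint8Array(reader.result).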

Now I hope I've convinced the MP4Box.js community and the people involved in its development that we need to make the MP4Box.js development policy progressive:

1kilogram commented 1 year ago

?

cconcolato commented 1 year ago

"the MP4Box.js library should work out of the box"

That's always a goal, but different people have different use cases and the library may not work out of the box for all use cases. On top of that, bugs can make it harder to use.

"the processing of .mp4 files within the MP4Box concept must work without dependence on modern API which in some cases instead of possibilities add limitations"

MP4Box.js is independent of the APIs you listed. It can be used with them or not.

I'm closing this issue as I don't see any concrete actions from it. If you have specific bugs or suggestions to improve the code, feel free to submit them and submit patches (this is open source software).