Yahweasel / libav.js

This is a compilation of the libraries associated with handling audio and video in ffmpeg—libavformat, libavcodec, libavfilter, libavutil, libswresample, and libswscale—for emscripten, and thus the web.

Problems of importing libav.js with bundler #20

Closed martenrichter closed 1 year ago

martenrichter commented 1 year ago

Hi, I have just switched my bundler from webpack to Vite so that I can finally integrate your polyfills. Unfortunately, the new bundler does not like libav.js either. I can probably work around it by copying your libav.in.js code into the main app, but maybe you would like to make your package compatible with bundlers?

I see two ways: either a callback that is supplied to libav.js for loading things, or using something like the following (the path had to be adjusted): var toImport = new URL(`../node_modules/libav.js/libav-${verstring}.${target}.js`, import.meta.url); which may make a lot of bundlers happy.
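To illustrate how the URL in the second approach resolves, here is a minimal sketch. The helper name scriptUrlFor and the example base URL are hypothetical, and the base is passed in as a parameter only so that the snippet runs outside a module; in bundled code it would be import.meta.url. Bundlers generally detect this pattern only when it is written out literally as new URL('...', import.meta.url), so the helper shows the resolution rules rather than something a bundler can analyze.

```javascript
// Hypothetical helper (not part of libav.js): resolve the loader script
// for a given build and target relative to a base URL.
function scriptUrlFor(verstring, target, base) {
  return new URL(`./libav-${verstring}.${target}.js`, base);
}

// Stand-in for import.meta.url of the importing module
const base = 'https://example.com/app/assets/main.js';
const toImport = scriptUrlFor('3.11.5.1.2-opus', 'wasm', base);
console.log(toImport.href);
// prints "https://example.com/app/assets/libav-3.11.5.1.2-opus.wasm.js"
```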

Yahweasel commented 1 year ago

I am not opposed to making libav.js more bundleable, but it has some unique complexities, which is why I haven't done it myself. I'm open to PRs, but I'm not familiar enough with bundlers to be willing to make such changes myself. Particular concerns are:

martenrichter commented 1 year ago

My problem with making a PR is that I failed to set up an environment in a container for compiling libav.js. For now, I am hacking together a copy of libav.in.js in which I change everything by hand to make a bundled version, but this is not a clean way. I would like to keep benefiting from the prebuilt files (since Apple now includes WebCodecs video support, I personally only need Opus).

After the hours I have spent in libav.in.js, I think I understand its logic very well, with some exceptions. The easiest way to make it more bundleable is probably to leave the difficulty to the integrator.

My idea would be that one can pass a callback (one example):

    let callback = ({ target, verConfig, type }) => {
      if (type === 'worker') {
        return new Worker(
          new URL(`../node_modules/libav.js/libav-${verConfig}.${target}.js`, import.meta.url)
        )
      } else if (type === 'importScripts') {
        .....
      }
    }

Someone who wants to use your code can then inject whatever the bundling solution requires. If the callback is not defined, you use your standard import code.

And yes, the global LibAV is very challenging with modules, but setting globalThis.LibAV = {} before the dynamic import() addresses the problem.

martenrichter commented 1 year ago

Also, the callback should be async, i.e. return a promise. I have just discovered that importScripts needs to be replaced with import as well.
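A minimal sketch of such an optional, possibly asynchronous callback with a fallback to the built-in path logic might look as follows. All names (locate, defaultLocate, resolveUrl, the info fields) and the example URLs are hypothetical and not part of libav.js; the point is only that awaiting the callback's result lets it be either synchronous or asynchronous.

```javascript
// All names here are hypothetical; this is not the libav.js API.
function defaultLocate({ verstring, target, base }) {
  // Built-in behavior: compute the path from a base directory
  return `${base}/libav-${verstring}.${target}.js`;
}

async function resolveUrl(info, opts = {}) {
  if (typeof opts.locate === 'function') {
    // await accepts plain values and promises, so the callback may be async
    const custom = await opts.locate(info);
    if (custom != null) return String(custom);
  }
  return defaultLocate(info);
}

const info = { verstring: '3.11.5.1.2-opus', target: 'wasm', base: '/libav', type: 'worker' };

// Integrator-supplied callback; a bundled app would return a URL
// that its bundler can resolve statically
const opts = {
  locate: async ({ target }) =>
    target === 'wasm' ? 'https://cdn.example/libav-3.11.5.1.2-opus.wasm.js' : null,
};

resolveUrl(info, opts).then((url) => console.log('custom:', url));
resolveUrl(info).then((url) => console.log('default:', url));
```

Returning null or undefined from the callback falls through to the default, so an integrator only needs to handle the cases the bundler cares about.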

martenrichter commented 1 year ago

OK, now I was able to load the worker, but loading the wasm fails. That means the whole callback idea probably only gets us halfway.

martenrichter commented 1 year ago

OK, the callback approach should work, since one can tell the bundler to include the wasm as well. That should be enough. No, it is not.....

Yahweasel commented 1 year ago

OK, so let me just attempt to clarify. All that you ought to need is a way of getting my toImport generated by a callback passed to LibAV, instead of or in addition to having it baked in? That seems quite reasonable, and is actually a great solution to the integration problem (insofar as it kicks it down the line to still not being me :rofl: ). I'll implement the callback style.

martenrichter commented 1 year ago

No, wait. Unfortunately, it is not that easy. The problem is that, for example, the wasm.js files also load the wasm.wasm files in the worker, and so far I have not found a way to hook in there. Anyway, I will continue working today with copies of your code to find the places where the behaviour needs to be altered, and then we will have to rethink whether it is possible.

By the way, I have found audio decoding libraries that solve the problem by importing the wasm files into the JS files. But I am afraid this cannot easily be automated in your build process.

martenrichter commented 1 year ago

OK, I think I have identified all the places where it would be necessary to inject something via a callback. The first one is in the libav-3.11.5.1.2-opus.js file, which I presume is generated from a template. Here is an excerpt of working code, where the replacements should be triggered by one or more callbacks:

    opts = opts || {}
    const targ = target(opts)
    let ret

    let url
    // The following logic should be handled by a callback, so that the url = ... assignment is not actually here.
    // The code may look a bit clumsy, but the bundler requires static strings in order to know what to do.
    // This would be done inside a callback anyway, just supplying wasm, asm, simd or thr.
    if (targ === 'wasm') {
      url = new URL('./libav-3.11.5.1.2-opus.wasm.js', import.meta.url)
    } else if (targ === 'asm') {
      url = new URL('./libav-3.11.5.1.2-opus.asm.js', import.meta.url)
    } else if (targ === 'simd') {
      url = new URL('./libav-3.11.5.1.2-opus.simd.js', import.meta.url)
    }

    return Promise.all([])
      .then(function () {
        // Step one: Get LibAV loaded
        if (!libav.LibAVFactory) {
          if (nodejs) {
            // Node.js: Load LibAV now. I did not test this; normally no one uses a bundler here.
            libav.LibAVFactory = require(url)
          } else if (typeof Worker !== 'undefined' && !opts.noworker) {
            // Worker: Nothing to load now
          } else if (typeof importScripts !== 'undefined') {
            // Worker scope. Import it.
            // NOTE: Some bundlers, like Vite, convert your file to a module on the fly; importScripts is then present but throws an exception
            if (this !== undefined) {
              // eslint-disable-next-line no-undef
              importScripts(url)
            } else {
              let imported
              // The next lines should again go into a callback, but I am not completely sure whether that will work.
              // In any case, the import and its argument should go together.
              if (targ === 'wasm') {
                // eslint-disable-next-line no-undef
                imported = import('./libav-3.11.5.1.2-opus.wasm.js')
              } else if (targ === 'asm') {
                imported = import('./libav-3.11.5.1.2-opus.asm.js')
              } else if (targ === 'simd') {
                imported = import('./libav-3.11.5.1.2-opus.simd.js')
              }
              return imported.then((mod) => {
                // The problem is that the wasm.js is also converted to a module, so there are no exports, and I think
                // you cannot simply add an export default either. So we need to insert a line in the loaded file
                // that attaches the factory to the globalThis object.
                libav.LibAVFactory = globalThis.LibAVFactory
              })
            }
          } else {
            // Web: Load the script
            return new Promise(function (resolve, reject) {
              const scr = document.createElement('script')
              scr.src = url
              scr.addEventListener('load', resolve)
              scr.addEventListener('error', reject)
              scr.async = true
              document.body.appendChild(scr)
            }).then(function () {
              libav.LibAVFactory = globalThis.LibAVFactory
            })
          }
        }
      })
      .then(function () {
        // Step two: Create the underlying instance
        if (!nodejs && typeof Worker !== 'undefined' && !opts.noworker) {
          // Worker thread
          ret = {}

          // Load the worker
          // NOTE: Just use the URL
          ret.worker = new Worker(url)

OK, that one is pretty easy. Now we have to go to the wasm.js file:

    var wasmBinaryFile
    // The URL string could be passed to the worker via a message, and could also come from a callback,
    // or there may be another way to import user-supplied code.
    // I am not sure whether this code is actually generated by Emscripten; maybe there is an option to inline the code.
    wasmBinaryFile = new URL('libav-3.11.5.1.2-opus.simd.wasm', import.meta.url).toString()
    // I had to comment out this code, since it was prepending the URL again.
    /* if (!isDataURI(wasmBinaryFile)) {
      wasmBinaryFile = locateFile(wasmBinaryFile)
    } */
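One documented hook that may be relevant here is Emscripten's Module.locateFile option, which the generated locateFile helper consults before falling back to prepending the script directory. The sketch below only shows the shape of such a callback. makeLocateFile and the URL table are hypothetical, and whether libav.js can pass module options to the factory in every loading path (in particular inside its worker) is not established in this thread.

```javascript
// Hypothetical helper: build a locateFile callback from a table of
// bundler-resolved URLs, keyed by the file name Emscripten asks for.
function makeLocateFile(resolved) {
  return function locateFile(path, prefix) {
    // Emscripten calls this with the file name and the script directory
    if (Object.prototype.hasOwnProperty.call(resolved, path)) {
      return resolved[path];
    }
    return prefix + path; // same result as Emscripten's default
  };
}

const locateFile = makeLocateFile({
  'libav-3.11.5.1.2-opus.simd.wasm': 'https://example.com/assets/libav-opus-simd.abc123.wasm',
});

console.log(locateFile('libav-3.11.5.1.2-opus.simd.wasm', '/js/'));
console.log(locateFile('other.wasm', '/js/'));

// It would be handed to the factory as a module option, e.g.
// LibAVFactory({ locateFile })
```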

Furthermore, I had to add

if (typeof exports === 'object' && typeof module === 'object')
  module.exports = LibAVFactory
else if (typeof define === 'function' && define['amd'])
  define([], function () {
    return LibAVFactory
  })
else if (typeof exports === 'object') exports['LibAVFactory'] = LibAVFactory
else globalThis.LibAVFactory = LibAVFactory // in the module case, this attaches the factory to globalThis

With these changes, everything seems to load. I cannot tell yet whether everything is working, because I ran into a problem in the polyfill when loading it inside a worker. I will supply a fix in a PR soon.

Yahweasel commented 1 year ago

The .wasm.js file is generated by Emscripten, and how it imports stuff is largely immutable.

martenrichter commented 1 year ago

This is what I feared. The last change should not be a problem, since I saw your copyright notice right behind it, and it could be added there as well. The wasmBinaryFile thing is the real problem.

Yahweasel commented 1 year ago

I'm closing this until and unless you get emsdk working and compile libav.js yourself. You cannot usefully contribute here by poking around blind.

martenrichter commented 1 year ago

Well, I think I have come far enough that compiling it myself is not necessary. I just have to understand how Emscripten puts the files together. If I understand the docs correctly, extern-post.js is simply appended to the built wasm.js files. I have replaced this part in the build output with modified code that contains some logic to receive the URL either through the worker's message pipe or via the LibAV object:

if (typeof LibAV !== 'undefined' && LibAV.wasmurl) {
  // Fetch the wasm binary ourselves, then expose a promise for a factory
  // that hands the binary to Emscripten via the wasmBinary option
  const initialFactory = LibAVFactory
  globalThis.LibAVFactoryAsync = fetch(LibAV.wasmurl)
    .then((response) => {
      if (!response.ok) {
        throw new Error(
          "failed to load wasm binary file at '" + LibAV.wasmurl + "'"
        )
      }
      return response.arrayBuffer()
    })
    .then((binary) => () => initialFactory({ wasmBinary: binary }))
}

if (
  typeof importScripts !== 'undefined' &&
  (typeof LibAV === 'undefined' || !LibAV.nolibavworker)
) {
    // We're a WebWorker, so arrange messages
    const loadLibAV = async (wasmurl) => {
      const response = await fetch(wasmurl)
      if (!response.ok) {
        throw new Error("failed to load wasm binary file at '" + wasmurl + "'")
      }
      return await LibAVFactory({ wasmBinary: await response.arrayBuffer() })
    }
    let libav
    onmessage = async function (e) {
      if (e?.data?.wasmurl) {
        // First message: the main thread tells us where the wasm binary is
        try {
          libav = await loadLibAV(e.data.wasmurl)
          libav.onwrite = function (name, pos, buf) {
            /* We have to buf.slice(0) so we don't duplicate the entire heap just
             * to get one part of it in postMessage */
            postMessage(['onwrite', 'onwrite', true, [name, pos, buf.slice(0)]])
          }
          postMessage(['onready', 'onready', true, null])
        } catch (ex) {
          console.log('Loading LibAV failed' + '\n' + ex.stack)
        }
        return
      }
      var id = e.data[0]
      var fun = e.data[1]
      var args = e.data.slice(2)
      var ret = void 0
      var succ = true
      try {
        ret = libav[fun].apply(libav, args)
      } catch (ex) {
        succ = false
        ret = ex.toString() + '\n' + ex.stack
      }
      if (succ && typeof ret === 'object' && ret !== null && ret.then) {
        // Let the promise resolve
        ret
          .then(function (res) {
            ret = res
          })
          .catch(function (ex) {
            succ = false
            ret = ex.toString() + '\n' + ex.stack
          })
          .then(function () {
            postMessage([id, fun, succ, ret])
          })
      } else {
        postMessage([id, fun, succ, ret])
      }
    }
}
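The core of the code above, fetching the binary and handing it to the factory through Emscripten's wasmBinary module option, can be isolated into a small function. This is only a sketch under assumptions: createWithBinary, stubFetch and stubFactory are hypothetical names, and the stubs stand in for fetch() and the real LibAVFactory so that the snippet runs without a network or a wasm file.

```javascript
// Hypothetical helper; fetchImpl is injectable so the sketch runs without a network.
async function createWithBinary(factory, wasmUrl, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(wasmUrl);
  if (!response.ok) {
    throw new Error(`failed to load wasm binary file at '${wasmUrl}'`);
  }
  const wasmBinary = await response.arrayBuffer();
  // With wasmBinary set, Emscripten uses these bytes instead of locating the .wasm itself
  return factory({ wasmBinary });
}

// Stubs standing in for fetch() and the Emscripten factory
const stubFetch = async (url) => ({
  ok: url.endsWith('.wasm'),
  arrayBuffer: async () => new ArrayBuffer(8),
});
const stubFactory = async (moduleArg) => ({
  receivedBytes: moduleArg.wasmBinary.byteLength,
});

createWithBinary(stubFactory, '/assets/libav.wasm', stubFetch)
  .then((instance) => console.log(instance.receivedBytes)); // 8
createWithBinary(stubFactory, '/assets/missing', stubFetch)
  .catch((err) => console.log(err.message));
```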

The key point was that Emscripten provides a mechanism to accept an ArrayBuffer of the wasm binary (the wasmBinary option), so only a custom fetching routine had to be added to the post code. Then the second part, in the libav.in.js code:

 // Now start making our instance generating function
  libav.LibAV = function (opts) {
    opts = opts || {}
    const targ = target(opts)
    let ret

    let url
    // THE FOLLOWING CODE SHOULD BE PROVIDED BY A CALLBACK
    if (targ === 'wasm') {
      url = new URL('./libav-3.11.5.1.2-opus.wasm.js', import.meta.url)
      globalThis.LibAV.wasmurl = new URL(
        'libav-3.11.5.1.2-opus.wasm.wasm',
        import.meta.url
      )
    } else if (targ === 'asm') {
      url = new URL('./libav-3.11.5.1.2-opus.asm.js', import.meta.url)
      globalThis.LibAV.wasmurl = new URL(
        'libav-3.11.5.1.2-opus.asm.wasm',
        import.meta.url
      )
    } else if (targ === 'simd') {
      url = new URL('./libav-3.11.5.1.2-opus.simd.js', import.meta.url)
      globalThis.LibAV.wasmurl = new URL(
        'libav-3.11.5.1.2-opus.simd.wasm',
        import.meta.url
      )
    }

    return Promise.all([])
      .then(function () {
        // Step one: Get LibAV loaded
        if (!libav.LibAVFactory) {
          if (nodejs) {
            // Node.js: Load LibAV now
            libav.LibAVFactory = require(url)
          } else if (typeof Worker !== 'undefined' && !opts.noworker) {
            // Worker: Nothing to load now
          } else if (typeof importScripts !== 'undefined') {
            // Worker scope. Import it.
            if (this !== undefined) {
              // eslint-disable-next-line no-undef
              importScripts(url)
            } else {
              let imported
              // THE FOLLOWING STUFF SHOULD BE GENERATED BY A CALLBACK
              if (targ === 'wasm') {
                // eslint-disable-next-line no-undef
                imported = import('./libav-3.11.5.1.2-opus.wasm.js')
              } else if (targ === 'asm') {
                imported = import('./libav-3.11.5.1.2-opus.asm.js')
              } else if (targ === 'simd') {
                imported = import('./libav-3.11.5.1.2-opus.simd.js')
              }
              return imported
                .then((mod) => {
                  console.log(
                    'Imported branch',
                    mod,
                    globalThis,
                    mod.default,
                    libav.LibAVFactory,
                    globalThis.LibAVFactory
                  )
                  if (globalThis.LibAVFactoryAsync)
                    return globalThis.LibAVFactoryAsync
                  else throw new Error('No LibAVFactoryAsync')
                })
                .then((LibAVFactory) => {
                  libav.LibAVFactory = LibAVFactory
                  console.log(
                    'Assign factory',
                    LibAVFactory,
                    libav.LibAVFactory
                  )
                })
                .catch((error) => {
                  console.log('Error loading libAV', error)
                })
            }
          } else {
            // Web: Load the script
            return new Promise(function (resolve, reject) {
              const scr = document.createElement('script')
              scr.src = url
              scr.addEventListener('load', resolve)
              scr.addEventListener('error', reject)
              scr.async = true
              document.body.appendChild(scr)
            }).then(function () {
              libav.LibAVFactory = globalThis.LibAVFactory
            })
          }
        }
      })
      .then(function () {
        // Step two: Create the underlying instance
        // NOTE: The !opts.noworker check is disabled here, because the noworker logic hijacks the handlers when loaded inside a worker
        if (!nodejs && typeof Worker !== 'undefined' /* && !opts.noworker */) {
          // Worker thread
          ret = {}

          // Load the worker
          ret.worker = new Worker(url)

          ret.worker.postMessage({
            wasmurl: globalThis.LibAV.wasmurl.toString()
          })

          // Report our readiness
          return new Promise(function (resolve, reject) {
            // Our handlers
            ret.on = 1
            ret.handlers = {
              onready: [
                function () {
                  resolve()
                },
                null
              ],
              onwrite: [
                function (args) {
                  if (ret.onwrite) ret.onwrite.apply(ret, args)
                },
                null
              ]
            }

            // And passthru functions
            ret.c = function () {
              let msg = Array.prototype.slice.call(arguments)
              return new Promise(function (resolve, reject) {
                const id = ret.on++
                msg = [id].concat(msg)
                ret.handlers[id] = [resolve, reject]
                ret.worker.postMessage(msg)
              })
            }
            function onworkermessage(e) {
              const id = e.data[0]
              const h = ret.handlers[id]
              if (h) {
                if (e.data[2]) h[0](e.data[3])
                else h[1](e.data[3])
                if (typeof id === 'number') delete ret.handlers[id]
              }
            }
            ret.worker.onmessage = onworkermessage

            // And termination
            ret.terminate = function () {
              ret.worker.terminate()
            }
          })
        } else {
          // Not Workers
          // Start with a real instance
          return Promise.all([])
            .then(function () {
              // Annoyingly, Emscripten's "Promise" isn't really a Promise
              return new Promise(function (resolve) {
                console.log('Peak libAV', libav, libav.LibAVFactory)
                libav
                  .LibAVFactory()
                  .then(function (x) {
                    delete x.then
                    resolve(x)
                  })
                  .catch((error) => {
                    console.log('libAV: problem', error)
                  })
              })
            })
            .then(function (x) {
              ret = x
              ret.worker = false

              // Simple wrappers
              ret.c = function (func) {
                const args = Array.prototype.slice.call(arguments, 1)
                return new Promise(function (resolve, reject) {
                  try {
                    resolve(ret[func].apply(ret, args))
                  } catch (ex) {
                    reject(ex)
                  }
                })
              }

              // No termination
              ret.terminate = function () {}
            })
        }
      })

So far, stepping through the code in the browser debugger shows that everything in my test setup is loaded. Only the callback mechanism still has to be added, but that interface should be designed however you like.
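For readers following the worker branch above, the request/response bookkeeping (an increasing id per call, a handler pair stored under that id, and a reply of the form [id, fun, succ, ret]) can be sketched in isolation. makeCaller and fakeWorker are hypothetical; the fake worker replies in the same thread so that the snippet runs without a real Worker, and the special 'onready' and 'onwrite' handlers are left out.

```javascript
// Sketch of the id-based call protocol; makeCaller and fakeWorker are hypothetical.
function makeCaller(worker) {
  let nextId = 1;
  const pending = new Map();

  worker.onmessage = (e) => {
    const [id, , ok, value] = e.data; // [id, fun, succ, ret]
    const handler = pending.get(id);
    if (!handler) return;
    pending.delete(id);
    if (ok) handler.resolve(value);
    else handler.reject(new Error(value));
  };

  return function call(fun, ...args) {
    return new Promise((resolve, reject) => {
      const id = nextId++;
      pending.set(id, { resolve, reject });
      worker.postMessage([id, fun, ...args]);
    });
  };
}

// Stand-in for a real Worker: replies asynchronously in the same thread
const fakeWorker = {
  onmessage: null,
  postMessage([id, fun, ...args]) {
    const impl = { add: (a, b) => a + b };
    let reply;
    try {
      reply = [id, fun, true, impl[fun](...args)];
    } catch (ex) {
      reply = [id, fun, false, String(ex)];
    }
    Promise.resolve().then(() => this.onmessage({ data: reply }));
  },
};

const call = makeCaller(fakeWorker);
call('add', 2, 3).then((sum) => console.log(sum)); // 5
call('missing').catch((err) => console.log('rejected:', err.message));
```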

martenrichter commented 1 year ago

OK, I am trying to set up a build environment.

Any hints as to what is going wrong here?

ERROR: opus not found using pkg-config

If you think configure made a mistake, make sure you are using the latest
version from Git.  If the latest version fails, report the problem to the
ffmpeg-user@ffmpeg.org mailing list or IRC #ffmpeg on irc.libera.chat.
Include the log file "ffbuild/config.log" produced by configure as this will help
solve the problem.
emconfigure: error: 'env PKG_CONFIG_PATH=/workspaces/libav.js/build/inst/base/lib/pkgconfig ../configure --prefix=/opt/ffmpeg --target-os=linux --cc=emcc --ranlib=emranlib --disable-doc --disable-stripping --disable-programs --disable-ffplay --disable-ffprobe --disable-network --disable-iconv --disable-xlib --disable-sdl2 --disable-everything --disable-pthreads --arch=emscripten "--extra-cflags=-I/workspaces/libav.js/build/inst/base/include " "--extra-ldflags=-L/workspaces/libav.js/build/inst/base/lib " --enable-protocol=data --enable-protocol=file --enable-filter=aresample --enable-demuxer=ogg --enable-muxer=ogg --enable-demuxer=matroska --enable-muxer=matroska --enable-muxer=webm --enable-libopus --enable-decoder=libopus --enable-encoder=libopus --enable-demuxer=mov --enable-muxer=ipod --enable-decoder=aac --enable-encoder=aac --enable-demuxer=flac --enable-muxer=flac --enable-decoder=flac --enable-encoder=flac --enable-decoder=pcm_s16le --enable-decoder=pcm_s24le --enable-demuxer=wav --enable-encoder=pcm_s16le --enable-encoder=pcm_s24le --enable-muxer=wav --enable-filter=acompressor --enable-filter=adeclick --enable-filter=adeclip --enable-filter=aecho --enable-filter=afade --enable-filter=aformat --enable-filter=agate --enable-filter=alimiter --enable-filter=amix --enable-filter=apad --enable-filter=atempo --enable-filter=atrim --enable-filter=bandpass --enable-filter=bandreject --enable-filter=dynaudnorm --enable-filter=equalizer --enable-filter=loudnorm --enable-filter=pan --enable-filter=amix --enable-filter=volume --enable-filter=anull' failed (returned 1)
make: *** [mk/ffmpeg.mk:32: build/ffmpeg-5.1.2/build-base-default/ffbuild/config.mak] Error 1

I have just checked out the source and am running the build in the emsdk Docker container. EDIT: pkg-config is probably missing in the container.