Open ds-hwang opened 9 years ago
hi, ds, Every suggestion is welcome! Before talking about which one is faster, could you please help me figure out the following 2 questions?

1. Is `FrameGrabber` based on an assumption that `stream` is a `MediaStream` with only one `MediaStreamTrack` (kind: video)?
2. How would you deal with the case of a `MediaStream` with multiple `MediaStreamTrack`s (kind: video) in `FrameGrabber`? The Media Capture & Streams spec allows adding `MediaStreamTrack`s into an existing `MediaStream`.

Thanks,
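For reference, the mutability in question (a sketch; `secondCameraTrack` is hypothetical):

```js
// Per Media Capture & Streams, a MediaStream is mutable after creation:
mediaStream.addTrack(secondCameraTrack); // now two MediaStreamTracks of kind "video"
```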
@huningxin, @anssiko could you answer @ChiahungTai's question? I haven't fully digested `MediaStreamTrack` and `MediaStream`. Thanks.
To support multiple `MediaStreamTrack`s of the same kind in `FrameGrabber`, we could update the `FrameData` interface as follows:

```
readonly attribute DepthMap[]? depthMap;
readonly attribute ImageData[]? imageData;
```

Let me know if this sounds reasonable.
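For illustration, a consumer of the updated interface might look like this (a sketch; the handlers are hypothetical):

```js
function processFrameData(frameData) {
  // One entry per MediaStreamTrack of the corresponding kind.
  for (var i = 0; i < frameData.imageData.length; i++) {
    handleColorFrame(frameData.imageData[i]); // hypothetical handler
  }
  for (var j = 0; j < frameData.depthMap.length; j++) {
    handleDepthFrame(frameData.depthMap[j]); // hypothetical handler
  }
}
```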
@huningxin Could you provide @ChiahungTai with an example of UC6 with `FrameGrabber`?
`FrameGrabber` alone is not able to support UC6. We need an extension API to pipe the processed images to a new `MediaStreamTrack`. `VideoWorker` provides this capability.
I want to demonstrate pseudo-code for UC6. The main idea is:

1. `FrameGrabber` dispatches video data and depth data as `ImageBitmap`s.
2. `FrameGrabber` dispatches both `ImageBitmap`s to the `Web Worker`.
3. The `Web Worker` does post-processing of the images and creates an `ArrayBuffer` as the output image.
4. The `Web Worker` posts the `ArrayBuffer` to the main thread.
5. The main thread draws the `ArrayBuffer` on Canvas or WebGL.

Let's see the pseudo-code:
```js
var canvasContext = document.createElement("canvas").getContext("2d");
var worker = new Worker("worker.js");

navigator.mediaDevices.getUserMedia({
  depth: true,
  video: true
}).then(function (mediaStream) {
  // mediaStream carries both a video stream track and a depth stream track.
  var frameGrabber = new FrameGrabber(mediaStream);
  if (frameGrabber) {
    frameGrabber.start(processFrameData);
  }
});

function processFrameData(frameData) {
  // Assume frameData.imageBitmap and frameData.depthMap are ImageBitmaps.
  worker.postMessage({'cmd': 'videoprocess',
                      'videoImage': frameData.imageBitmap,
                      'depthImage': frameData.depthMap});
}
```
```js
worker.addEventListener('message', function (e) {
  var data = e.data;
  switch (data.cmd) {
    case 'output':
      drawArrayBufferOnCanvas(data.outputImage);
      break;
    default:
      console.error('Unknown command: ' + data.cmd);
  }
}, false);
```
```js
function drawArrayBufferOnCanvas(arrayBuffer) {
  // Zero-copy: Uint8ClampedArray is a view over the buffer, and the
  // ImageData(data, width, height) constructor adopts that view directly.
  // width and height are the known frame dimensions.
  var imgData = new ImageData(new Uint8ClampedArray(arrayBuffer), width, height);
  canvasContext.putImageData(imgData, 0, 0);
}
```
```js
// worker.js
self.addEventListener('message', function (e) {
  var data = e.data;
  switch (data.cmd) {
    case 'videoprocess':
      // Allocate an RGBA output buffer matching the input frame size.
      var size = data.videoImage.width * data.videoImage.height * 4;
      var output = new ArrayBuffer(size);
      processRemoveBackground(data.videoImage, data.depthImage, output);
      postMessage({'cmd': 'output', 'outputImage': output}, [output]); // transfer, no copy
      break;
    default:
      postMessage('Unknown command: ' + data.cmd);
  }
}, false);
```
If web developers want to make a new stream of the background-removed output, `CanvasCaptureMediaStream` can be a solution: http://www.w3.org/TR/mediacapture-fromelement/#the-canvascapturemediastream
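A minimal sketch, assuming the background-removed frames are drawn to `canvas`:

```js
// Capture the processed canvas as a new MediaStream (CanvasCaptureMediaStream).
var processedStream = canvas.captureStream(30); // 30 fps is an arbitrary choice
// processedStream.getVideoTracks()[0] is a new MediaStreamTrack that can be
// attached to a <video> element or sent over a WebRTC PeerConnection.
```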
In addition, in the above use case, I'd use WebGL directly instead of using a `Web Worker`. If I had to use `OpenCV.js`, I'd use a `Web Worker` (or `VideoWorker`).
Could I get some feedback? @ChiahungTai @anssiko @huningxin
@ds-hwang Thanks for the example. Some feedback and questions:

- s/ImageData imgData;/var imgData = new ImageData();/
- Is `MEMU8` an instance of `Uint8Array`?
- Regarding `drawArrayBufferOnCanvas`: would direct `ArrayBuffer` drawing be a requirement in practice?

@ChiahungTai What are the functional gaps (if any) in @ds-hwang's proposal from your use case point of view? I think the alternative approach to implement UC6 using `VideoWorker` would be very much like the WorkerProcessor example.
The main difference in the models seems to be (and please correct me if I'm wrong!):

- `VideoWorker` hides passing `ImageBitmap` objects between the threads, uses an event model, and allows web developers to work directly with `MediaStreamTrack` objects in the main thread returned by the `MediaStreamTrack.addWorkerProcessor()` method.
- `VideoWorker` relies on a side effect, i.e. the said `MediaStreamTrack` object in the main thread is updated transparently by the implementation to reflect changes done to the `outputImageBitmap` in the video worker, whereas in the `FrameGrabber` proposal the web developer is responsible for `postMessage`'ing the `ImageBitmap` objects to the main thread.

Let's experiment with the proposals and see which would provide the best API ergonomics for web developers while also being implementable and performant.
@huningxin Please review and provide your feedback :-)
@anssiko thx for the typo; fixed. To optimize canvas-drawing, I have two ideas:

1. Extend `context2d.drawImage` or `context2d.putImageData` to handle `ArrayBuffer`.
2. Extend `Window.createImageBitmap` to wrap `ArrayBuffer` and extend `context2d.drawImage` to handle `ImageBitmap`.

The second option is more general.
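For what it's worth, option 2 can be approximated with the `createImageBitmap` overload that takes an `ImageData` (a sketch; `width`/`height` are assumed known):

```js
// Wrap the worker's output buffer and draw it as an ImageBitmap.
var imgData = new ImageData(new Uint8ClampedArray(arrayBuffer), width, height);
createImageBitmap(imgData).then(function (bitmap) {
  canvasContext.drawImage(bitmap, 0, 0);
});
```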
@anssiko I found that zero-copy canvas-drawing is possible already. I applied it to the example. `Uint8ClampedArray` is one of the Typed Array views, and it is what `ImageData` uses.
The `FrameGrabber` + `Web Worker` solution seems to have more dependencies:

- `WebWorker` needs to be extended to support posting `ImageBitmap` without a memory copy (transferable objects).
- `Canvas` needs to be extended to support drawing from `ArrayBuffer`. And please note that the captured color might be YUV rather than RGB.

Reply to @huningxin:
`VideoWorker` also needs a similar implementation internally to transfer `ImageBitmap` to the worker scope without a copy. It might not require `ImageBitmap` to inherit `Transferable`, but each browser still needs a similar implementation for `ImageBitmap`.

Please check the example again. Canvas can draw `ArrayBuffer` already:
```js
function drawArrayBufferOnCanvas(arrayBuffer) {
  // Zero-copy: Uint8ClampedArray is a view over the buffer, and the
  // ImageData(data, width, height) constructor adopts that view directly.
  var imgData = new ImageData(new Uint8ClampedArray(arrayBuffer), width, height);
  canvasContext.putImageData(imgData, 0, 0);
}
```
> please check again the example. Canvas can draw ArrayBuffer already.
Thanks, @ds-hwang. I agree it can be done via the polyfill. Regarding performance, I'm concerned that the two `new` operations in the polyfill (for every frame) would hurt performance. Also, the current `Canvas` only supports RGBA, so if the native video is in a YUV format (quite common), in your proposal the web app needs to do one more color conversion just for drawing on the canvas. That would also hurt performance. However, `ImageBitmap` supports YUV, which can be pipelined to the `outputImageBitmap` directly.
I'm not sure how performance can be affected by two `new` ops for wrapper objects. The `ImageData` can be reused.
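For example, a sketch of the reuse idea (assuming a fixed frame size; reusing the `ImageData` trades the wrapper allocations for one copy into its backing store):

```js
// Allocate the wrapper objects once, then refill them every frame.
var pixels = new Uint8ClampedArray(width * height * 4);
var reusableImgData = new ImageData(pixels, width, height);

function drawFrame(arrayBuffer) {
  pixels.set(new Uint8ClampedArray(arrayBuffer)); // refill, no new wrappers
  canvasContext.putImageData(reusableImgData, 0, 0);
}
```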
The format comment raises a new concern for me, though. While the native video is in a YUV format, the GPU decoder in Chromium decodes it to an RGBA texture. `ImageBitmap` will have a different format for the same video per browser, e.g. Chromium uses RGBA while Firefox uses YUV. Is this implementation difference acceptable? @ChiahungTai
Sorry for the late reply. I was busy with the demo and preparing the patches for the code review process. I will reply to all the questions in the coming days. But before answering them, I have one question I want to confirm with @ds-hwang: except for CanvasRendering2D and WebGL, is there any particular reason why you want this spec to cover the main thread?
@ChiahungTai In my mind, I don't have any use case other than CanvasRendering2D and WebGL. By the way, I want to ask the opposite question: is there any particular reason why you want this spec to cover only the worker thread? AFAIK, until now, we don't have any W3C spec that supports only the Worker exclusively.
@ds-hwang Because most video processing or analysis tasks can be heavy, most of the work should run in a worker thread. If CanvasRendering2D and WebGL are the only reason, what do you think about making CanvasRendering2D and WebGL executable in a Worker? If we can make that happen, do you still think this spec should cover the main thread?
@ds-hwang IMO, if we can push CanvasRendering2D and WebGL to execute in a Worker, then there is no reason to cover the main thread in this spec. We can use the existing postMessage mechanism to communicate and co-work with the main thread. We also keep the design simple with minimal changes. That is my reasoning.
@ChiahungTai Some use cases don't require heavy computation. Let me give a good example: think about a user recording himself in front of a computer, grabbing the depth stream and the RGB stream, and removing the background to show only him. In this case, I'd use WebGL on the main thread. All I have to do is upload the depth and RGB streams to textures and render only the RGB texture pixels whose depth is less than 1 m. If I had to use WebGL on a Worker thread, the code would bloat, and I don't think there would be a performance benefit.
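A rough sketch of the fragment shader that idea boils down to (the uniform names and the depth encoding are illustrative assumptions):

```js
// Fragment shader: keep RGB pixels whose depth sample is closer than 1 m.
var fragmentShaderSource = `
  precision mediump float;
  uniform sampler2D uRgb;    // color frame uploaded as a texture
  uniform sampler2D uDepth;  // depth frame uploaded as a texture
  varying vec2 vTexCoord;
  void main() {
    float meters = texture2D(uDepth, vTexCoord).r * 8.0; // assumed encoding
    if (meters > 1.0) discard; // drop background pixels
    gl_FragColor = texture2D(uRgb, vTexCoord);
  }
`;
```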
On the other hand, from my perspective, `FrameGrabber` is a more minimal change than `VideoWorker`. `FrameGrabber` just provides a mechanism to grab frames, but `VideoWorker` is a huge black box. If I were to implement `VideoWorker`, I'd probably create the `FrameGrabber` C++ implementation and use `postMessage` in C++. I'm sure other implementers would think differently, which means there might be subtle differences between browsers.

@anssiko, @huningxin WDYT?
The HTML "main thread" is not a good place to do real-time media processing. Many application activities must run on the main thread, such as DOM manipulation. Those activities impose unpredictable latency on any media processing callbacks dispatched to run on the main thread. This is true even if those media processing callbacks themselves take very little time to run. For this reason (and others), the Audio WG is deprecating main-thread audio processing and moving audio processing to a dedicated AudioWorker API. Video deserves the same treatment.
Plus, of course, doing processing in a Worker enables more parallelism for applications where the processing actually is expensive --- and there will be plenty of those.
The approach of using FrameGrabber and drawing to a canvas won't work very well when you need transformed video in a VideoStreamTrack. In Gecko you can use captureStream to get a VideoStreamTrack from the canvas, but you'll lose A/V sync information. That's possibly fixable with more API work, but it's not clean. A related issue is handling slow video processing callbacks. VideoWorker automatically backs off and drops frames as necessary. Will FrameGrabber do that?
> Those activities impose unpredictable latency on any media processing callbacks dispatched to run on the main thread. This is true even if those media processing callbacks themselves take very little time to run.
I know. It's why the Web Worker was introduced. However, in some cases the communication overhead is bigger than the actual computation, as in my example above.
> For this reason (and others), the Audio WG is deprecating main-thread audio processing and moving audio processing to a dedicated AudioWorker API. Video deserves the same treatment.
It's interesting. I found the `Audio Worker` API is very similar to the `Video Worker` API: https://webaudio.github.io/web-audio-api/#audio-worker-examples
I'm not sure it's necessary to extend the `Worker` API. I think a well-defined `Transferable` object and `postMessage` combination is enough. `Audio Worker` and `Video Worker` should be either accepted together or rejected together. @padenot
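That is, something like the following, assuming `ImageBitmap` is made transferable:

```js
// Hand the frame to the worker by transferring ownership instead of copying.
worker.postMessage({cmd: 'videoprocess', frame: imageBitmap}, [imageBitmap]);
// After this call the main thread can no longer use imageBitmap.
```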
> VideoWorker automatically backs off and drops frames as necessary. Will FrameGrabber do that?
If `VideoWorker` can do it, I don't see any reason that `FrameGrabber` cannot.
Hi, I'm a fan of this spec, but I have a question, and I hope this spec covers the main thread also. `FrameGrabber` in Media Capture Depth Stream Extensions has the same purpose as `VideoWorker`, except for the web worker part. We can mimic `VideoWorker` using `FrameGrabber` and `Worker` (`video_worker.js`):
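A minimal sketch of the idea, reusing the `FrameGrabber` pseudo-code from this thread (`processFrame` and the message shape are illustrative assumptions):

```js
// main.js: feed frames grabbed by FrameGrabber into an ordinary Worker.
var worker = new Worker("video_worker.js");
var frameGrabber = new FrameGrabber(mediaStream);
frameGrabber.start(function (frameData) {
  worker.postMessage({frame: frameData.imageBitmap});
});

// video_worker.js: process each frame as it arrives.
self.onmessage = function (e) {
  var output = processFrame(e.data.frame); // placeholder processing step
  self.postMessage({output: output});
};
```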
Even if we can do the same thing with a simple `FrameGrabber` extension, is it worth adding huge worker extensions such as `VideoWorker` and `VideoWorkerGlobalScope`? If so, could you explain? @huningxin, @anssiko, @robman please give feedback.