Open ds-hwang opened 9 years ago
hi, ds, Every suggestion is welcome! Before talking about which one is faster, could you please help me figure out the following 2 questions?

1. Is `FrameGrabber` based on an assumption that `stream` is a `MediaStream` with only one `MediaStreamTrack` (kind: video)?
2. How would you deal with the case of a `MediaStream` with multiple `MediaStreamTrack`s (kind: video) in `FrameGrabber`? The Media Capture & Streams spec allows adding `MediaStreamTrack`s into an existing `MediaStream`.

Thanks,
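For reference, the mutability in question (a sketch; `secondCameraTrack` is hypothetical):

```js
// Per Media Capture & Streams, a MediaStream is mutable after creation:
mediaStream.addTrack(secondCameraTrack); // now two MediaStreamTracks of kind "video"
```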
@huningxin, @anssiko could you answer @ChiahungTai's question? I haven't fully digested `MediaStreamTrack` and `MediaStream`. Thanks.
To support multiple `MediaStreamTrack`s of the same kind in `FrameGrabber`, we could update the `FrameData` interface as follows:

```
readonly attribute DepthMap[]? depthMap;
readonly attribute ImageData[]? imageData;
```

Let me know if this sounds reasonable.
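For illustration, a consumer of the updated interface might look like this (a sketch; the handlers are hypothetical):

```js
function processFrameData(frameData) {
  // One entry per MediaStreamTrack of the corresponding kind.
  for (var i = 0; i < frameData.imageData.length; i++) {
    handleColorFrame(frameData.imageData[i]); // hypothetical handler
  }
  for (var j = 0; j < frameData.depthMap.length; j++) {
    handleDepthFrame(frameData.depthMap[j]); // hypothetical handler
  }
}
```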
@huningxin Could you provide @ChiahungTai with an example of UC6 with `FrameGrabber`?
`FrameGrabber` alone is not able to support UC6. We need an extension API to pipe the processed images to a new `MediaStreamTrack`. `VideoWorker` provides this capability.
I want to demonstrate pseudo-code for UC6. The main idea is:

1. `FrameGrabber` dispatches video data and depth data as `ImageBitmap`s.
2. `FrameGrabber` dispatches both `ImageBitmap`s to the `Web Worker`.
3. The `Web Worker` does post-processing of the images and creates an `ArrayBuffer` as the output image.
4. The `Web Worker` posts the `ArrayBuffer` to the main thread.
5. The main thread draws the `ArrayBuffer` on Canvas or WebGL.

Let's see the pseudo-code:
```js
var canvasContext = document.createElement("canvas").getContext("2d");
var worker = new Worker("worker.js");

navigator.mediaDevices.getUserMedia({
  depth: true,
  video: true
}).then(function (mediaStream) {
  // mediaStream carries both a video stream track and a depth stream track.
  var frameGrabber = new FrameGrabber(mediaStream);
  if (frameGrabber) {
    frameGrabber.start(processFrameData);
  }
});

function processFrameData(frameData) {
  // Assume frameData.imageBitmap and frameData.depthMap are ImageBitmaps.
  worker.postMessage({'cmd': 'videoprocess',
                      'videoImage': frameData.imageBitmap,
                      'depthImage': frameData.depthMap});
}
```
```js
worker.addEventListener('message', function (e) {
  var data = e.data;
  switch (data.cmd) {
    case 'output':
      drawArrayBufferOnCanvas(data.outputImage);
      break;
    default:
      console.error('Unknown command: ' + data.cmd);
  }
}, false);
```
```js
function drawArrayBufferOnCanvas(arrayBuffer) {
  // Zero-copy: Uint8ClampedArray is a view over the buffer, and the
  // ImageData(data, width, height) constructor adopts that view directly.
  // width and height are the known frame dimensions.
  var imgData = new ImageData(new Uint8ClampedArray(arrayBuffer), width, height);
  canvasContext.putImageData(imgData, 0, 0);
}
```
```js
// worker.js
self.addEventListener('message', function (e) {
  var data = e.data;
  switch (data.cmd) {
    case 'videoprocess':
      // Allocate an RGBA output buffer matching the input frame size.
      var size = data.videoImage.width * data.videoImage.height * 4;
      var output = new ArrayBuffer(size);
      processRemoveBackground(data.videoImage, data.depthImage, output);
      postMessage({'cmd': 'output', 'outputImage': output}, [output]); // transfer, no copy
      break;
    default:
      postMessage('Unknown command: ' + data.cmd);
  }
}, false);
```
If web developers want to make a new stream of the background-removed output, `CanvasCaptureMediaStream` can be a solution: http://www.w3.org/TR/mediacapture-fromelement/#the-canvascapturemediastream
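A minimal sketch, assuming the background-removed frames are drawn to `canvas`:

```js
// Capture the processed canvas as a new MediaStream (CanvasCaptureMediaStream).
var processedStream = canvas.captureStream(30); // 30 fps is an arbitrary choice
// processedStream.getVideoTracks()[0] is a new MediaStreamTrack that can be
// attached to a <video> element or sent over a WebRTC PeerConnection.
```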
In addition, in the above use case, I'd use WebGL directly instead of using a `Web Worker`. If I had to use `OpenCV.js`, I'd use a `Web Worker` (or `VideoWorker`).
Could I get some feedback? @ChiahungTai @anssiko @huningxin
@ds-hwang Thanks for the example. Some feedback and questions:

- s/ImageData imgData;/var imgData = new ImageData();/
- Is `MEMU8` an instance of `Uint8Array`?
- Regarding `drawArrayBufferOnCanvas`: would direct `ArrayBuffer` drawing be a requirement in practice?

@ChiahungTai What are the functional gaps (if any) in @ds-hwang's proposal from your use case point of view? I think the alternative approach to implement UC6 using `VideoWorker` would be very much like the WorkerProcessor example.
The main difference in the models seems to be (and please correct me if I'm wrong!):

- `VideoWorker` hides passing `ImageBitmap` objects between the threads, uses an event model, and allows web developers to work directly with `MediaStreamTrack` objects in the main thread returned by the `MediaStreamTrack.addWorkerProcessor()` method.
- `VideoWorker` relies on a side effect, i.e. the said `MediaStreamTrack` object in the main thread is updated transparently by the implementation to reflect changes done to the `outputImageBitmap` in the video worker, whereas in the `FrameGrabber` proposal the web developer is responsible for `postMessage`'ing the `ImageBitmap` objects to the main thread.

Let's experiment with the proposals and see which would provide the best API ergonomics for web developers while also being implementable and performant.
@huningxin Please review and provide your feedback :-)
@anssiko thx for the typo; fixed. To optimize canvas-drawing, I have two ideas:

1. Extend `context2d.drawImage` or `context2d.putImageData` to handle `ArrayBuffer`.
2. Extend `Window.createImageBitmap` to wrap `ArrayBuffer` and extend `context2d.drawImage` to handle `ImageBitmap`.

The second option is more general.
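For what it's worth, option 2 can be approximated with the `createImageBitmap` overload that takes an `ImageData` (a sketch; `width`/`height` are assumed known):

```js
// Wrap the worker's output buffer and draw it as an ImageBitmap.
var imgData = new ImageData(new Uint8ClampedArray(arrayBuffer), width, height);
createImageBitmap(imgData).then(function (bitmap) {
  canvasContext.drawImage(bitmap, 0, 0);
});
```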
@anssiko I found that zero-copy canvas-drawing is possible already. I applied it to the example. `Uint8ClampedArray` is one of the Typed Array views, and it is what `ImageData` uses.
The `FrameGrabber` + `Web Worker` solution seems to have more dependencies:

- `WebWorker` needs to be extended to support posting `ImageBitmap` without a memory copy (transferable objects).
- `Canvas` needs to be extended to support drawing from `ArrayBuffer`. And please note that the captured color might be YUV rather than RGB.

Reply to @huningxin:
`VideoWorker` also needs a similar implementation internally to transfer `ImageBitmap` to the worker scope without a copy. It might not require `ImageBitmap` to inherit `Transferable`, but each browser still needs a similar implementation for `ImageBitmap`.

Please check the example again. Canvas can draw `ArrayBuffer` already:
```js
function drawArrayBufferOnCanvas(arrayBuffer) {
  // Zero-copy: Uint8ClampedArray is a view over the buffer, and the
  // ImageData(data, width, height) constructor adopts that view directly.
  var imgData = new ImageData(new Uint8ClampedArray(arrayBuffer), width, height);
  canvasContext.putImageData(imgData, 0, 0);
}
```
> please check again the example. Canvas can draw ArrayBuffer already.
Thanks, @ds-hwang. I agree it can be done via the polyfill. Regarding performance, I'm concerned that the two `new` operations in the polyfill (for every frame) would hurt performance. Also, the current `Canvas` only supports RGBA, so if the native video is in a YUV format (quite common), in your proposal the web app needs to do one more color conversion just for drawing on the canvas. That would also hurt performance. However, `ImageBitmap` supports YUV, which can be pipelined to the `outputImageBitmap` directly.
I'm not sure how performance can be affected by two `new` ops for wrapper objects. The `ImageData` can be reused.
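For example, a sketch of the reuse idea (assuming a fixed frame size; reusing the `ImageData` trades the wrapper allocations for one copy into its backing store):

```js
// Allocate the wrapper objects once, then refill them every frame.
var pixels = new Uint8ClampedArray(width * height * 4);
var reusableImgData = new ImageData(pixels, width, height);

function drawFrame(arrayBuffer) {
  pixels.set(new Uint8ClampedArray(arrayBuffer)); // refill, no new wrappers
  canvasContext.putImageData(reusableImgData, 0, 0);
}
```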
The format comment raises a new concern for me, though. While the native video is in a YUV format, the GPU decoder in Chromium decodes it to an RGBA texture. `ImageBitmap` will have a different format for the same video per browser, e.g. Chromium uses RGBA while Firefox uses YUV. Is this implementation difference acceptable? @ChiahungTai
Sorry for the late reply. I was busy with the demo and preparing the patches for the code review process. I will reply to all the questions in the coming days. But before answering them, I have one question I want to confirm with @ds-hwang: except for CanvasRendering2D and WebGL, is there any particular reason why you want this spec to cover the main thread?
@ChiahungTai In my mind, I don't have any use case other than CanvasRendering2D and WebGL. By the way, I want to ask the opposite question: is there any particular reason why you want this spec to cover only the worker thread? AFAIK, until now, we don't have any W3C spec that supports only the Worker exclusively.
@ds-hwang Because most video processing or analysis tasks can be heavy, most of the work should run in a worker thread. If CanvasRendering2D and WebGL are the only reason, what do you think about making CanvasRendering2D and WebGL executable in a Worker? If we can make that happen, do you still think this spec should cover the main thread?
@ds-hwang IMO, if we can push CanvasRendering2D and WebGL to execute in a Worker, then there is no reason to cover the main thread in this spec. We can use the existing postMessage mechanism to communicate and co-work with the main thread. We also keep the design simple with minimal changes. That is my reasoning.
@ChiahungTai Some use cases don't require heavy computation. Let me give a good example: think about a user recording himself in front of a computer, grabbing the depth stream and the RGB stream, and removing the background to show only him. In this case, I'd use WebGL on the main thread. All I have to do is upload the depth and RGB streams to textures and render only the RGB texture pixels whose depth is less than 1 m. If I had to use WebGL on a Worker thread, the code would bloat, and I don't think there would be a performance benefit.
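A rough sketch of the fragment shader that idea boils down to (the uniform names and the depth encoding are illustrative assumptions):

```js
// Fragment shader: keep RGB pixels whose depth sample is closer than 1 m.
var fragmentShaderSource = `
  precision mediump float;
  uniform sampler2D uRgb;    // color frame uploaded as a texture
  uniform sampler2D uDepth;  // depth frame uploaded as a texture
  varying vec2 vTexCoord;
  void main() {
    float meters = texture2D(uDepth, vTexCoord).r * 8.0; // assumed encoding
    if (meters > 1.0) discard; // drop background pixels
    gl_FragColor = texture2D(uRgb, vTexCoord);
  }
`;
```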
On the other hand, from my perspective, `FrameGrabber` is a more minimal change than `VideoWorker`. `FrameGrabber` just provides a mechanism to grab frames, but `VideoWorker` is a huge black box. If I were to implement `VideoWorker`, I'd probably create the `FrameGrabber` C++ implementation and use `postMessage` in C++. I'm sure other implementers would think differently, which means there might be subtle differences between browsers.

@anssiko, @huningxin WDYT?
The HTML "main thread" is not a good place to do real-time media processing. Many application activities must run on the main thread, such as DOM manipulation. Those activities impose unpredictable latency on any media processing callbacks dispatched to run on the main thread. This is true even if those media processing callbacks themselves take very little time to run. For this reason (and others), the Audio WG is deprecating main-thread audio processing and moving audio processing to a dedicated AudioWorker API. Video deserves the same treatment.
Plus, of course, doing processing in a Worker enables more parallelism for applications where the processing actually is expensive --- and there will be plenty of those.
The approach of using FrameGrabber and drawing to a canvas won't work very well when you need transformed video in a VideoStreamTrack. In Gecko you can use captureStream to get a VideoStreamTrack from the canvas, but you'll lose A/V sync information. That's possibly fixable with more API work, but it's not clean. A related issue is handling slow video processing callbacks. VideoWorker automatically backs off and drops frames as necessary. Will FrameGrabber do that?
> Those activities impose unpredictable latency on any media processing callbacks dispatched to run on the main thread. This is true even if those media processing callbacks themselves take very little time to run.
I know. It's why the Web Worker was introduced. However, in some cases the communication overhead is bigger than the actual computation, as in my example above.
> For this reason (and others), the Audio WG is deprecating main-thread audio processing and moving audio processing to a dedicated AudioWorker API. Video deserves the same treatment.
It's interesting. I found the `Audio Worker` API is very similar to the `Video Worker` API: https://webaudio.github.io/web-audio-api/#audio-worker-examples
I'm not sure it's necessary to extend the `Worker` API. I think a well-defined `Transferable` object and `postMessage` combination is enough. `Audio Worker` and `Video Worker` should be either accepted together or rejected together. @padenot
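That is, something like the following, assuming `ImageBitmap` is made transferable:

```js
// Hand the frame to the worker by transferring ownership instead of copying.
worker.postMessage({cmd: 'videoprocess', frame: imageBitmap}, [imageBitmap]);
// After this call the main thread can no longer use imageBitmap.
```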
> VideoWorker automatically backs off and drops frames as necessary. Will FrameGrabber do that?
If `VideoWorker` can do it, I don't see any reason that `FrameGrabber` cannot.
Hi, I'm a fan of this spec, but I have a question, and I hope this spec covers the main thread also. `FrameGrabber` in Media Capture Depth Stream Extensions has the same purpose as `VideoWorker`, except for the web worker part. We can mimic `VideoWorker` using `FrameGrabber` and `Worker` (`video_worker.js`):
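A minimal sketch of the idea, reusing the `FrameGrabber` pseudo-code from this thread (`processFrame` and the message shape are illustrative assumptions):

```js
// main.js: feed frames grabbed by FrameGrabber into an ordinary Worker.
var worker = new Worker("video_worker.js");
var frameGrabber = new FrameGrabber(mediaStream);
frameGrabber.start(function (frameData) {
  worker.postMessage({frame: frameData.imageBitmap});
});

// video_worker.js: process each frame as it arrives.
self.onmessage = function (e) {
  var output = processFrame(e.data.frame); // placeholder processing step
  self.postMessage({output: output});
};
```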
Even if we can do the same thing with a simple `FrameGrabber` extension, is it worth adding huge worker extensions such as `VideoWorker` and `VideoWorkerGlobalScope`? If so, could you explain? @huningxin, @anssiko, @robman please give feedback.