Closed vladmandic closed 4 years ago
What performance do you see without web workers? For how many milliseconds does a single detection block the UI thread, and what FPS can you achieve?
depends on enabled modules - simple face detection, 50+ FPS, it can drop to 5-10FPS if everything is enabled. and that's on a low-end GPU.
but main thread blocking is minimal regardless of performance as everything is written in an asynchronous way and you can use a promise to wait for the result, so the main thread is free to do whatever during detection. even weights loading is asynchronous.
Will be happy to try the library but we must have smile detection, do you have plans for face expressions?
not immediately, but good idea.
just a thought: given this library has much more detailed face geometry (468 points for a base face alone, plus extras), it may be ok to extrapolate simple expressions using points math instead of using ml model.
i already do some simple stuff like:
gestures.push(`facing ${((face.annotations['rightCheek'][0][2] > 0) || (face.annotations['leftCheek'][0][2] < 0)) ? 'right' : 'left'}`);
or
const leftShoulder = pose.keypoints.find((a) => (a.score > params.minThreshold) && (a.part === 'leftShoulder'));
const rightShoulder = pose.keypoints.find((a) => (a.score > params.minThreshold) && (a.part === 'rightShoulder'));
// find() returns undefined when a keypoint scores below the threshold, so guard before reading positions
if (leftShoulder && rightShoulder) gestures.push(`leaning ${(leftShoulder.position.y > rightShoulder.position.y) ? 'left' : 'right'}`);
(plenty of room for optimization)
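the same points-math idea could be stretched to a very rough smile check. this is only a sketch under assumptions: the function names, the choice of landmarks (two mouth corners, two points on the face outline) and the threshold value are all hypothetical and are not part of the library, and the threshold would need tuning on real data.

```javascript
// euclidean distance between two [x, y, z] landmarks (z is optional)
function distance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1], (a[2] || 0) - (b[2] || 0));
}

// mouth width relative to face width; normalizing by face width keeps the
// value independent of how far the face is from the camera
function smileRatio(mouthLeft, mouthRight, faceLeft, faceRight) {
  const faceWidth = distance(faceLeft, faceRight);
  if (faceWidth === 0) return 0; // degenerate input, avoid division by zero
  return distance(mouthLeft, mouthRight) / faceWidth;
}

// hypothetical threshold, not measured on any dataset
const SMILE_THRESHOLD = 0.45;

function isSmiling(mouthLeft, mouthRight, faceLeft, faceRight) {
  return smileRatio(mouthLeft, mouthRight, faceLeft, faceRight) > SMILE_THRESHOLD;
}
```

like the cheek-based check above, this probably only holds for front-facing poses, since a turned head shortens both distances unevenly.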
in either case, web workers are supported and there is a demo using them in demo/demo-webworker, i'm just not sure i'd use them much personally.
thanks, we will take a look, maybe we can even contribute
i'm totally open for that!
@ost12666 regarding emotion detection
i've just added it and updated docs. really like the idea. it's based on a really small (200kb) tfjs model and seems to work ok for when face is front-facing the camera and not so much for side poses.
i'm closing this issue as web worker support exists and i'll maintain it moving forward.
revisiting web workers - they are actually VERY useful. when using the webgl backend with an idle main thread, there is no point since every atomic function is so fast that the main thread is never blocked for more than a few ms.
but...when using cpu or wasm backends, it kills responsiveness of the main thread and the overall performance drop is huge. however, running the same in the worker thread works like a charm - up to a point that if you have a good cpu, it works almost as fast as with webgl and the main thread never suffers!
even a few ms is too much, so maybe it will also help with webgl?
it makes it slower by 2-3 fps (not more), but UI is 100% responsive - so it's a tradeoff.
How do you transfer data to the web worker?
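// offscreen canvas sized to match the input frame
const offscreen = new OffscreenCanvas(input.width, input.height);
const ctx = offscreen.getContext('2d');
// draw the current frame 1:1, then read back its pixels
ctx.drawImage(input, 0, 0, input.width, input.height, 0, 0, input.width, input.height);
const data = ctx.getImageData(0, 0, input.width, input.height);
// ImageData posted this way is copied (structured clone), not transferred
worker.postMessage({ data });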
where input can be anything, but typically is HTMLVideoElement.
without web worker, there is no need for this intermediary offscreen canvas at all since human.detect(input) accepts DOM element directly.
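for completeness, the receiving end inside the worker might look roughly like the sketch below. it assumes the message shape from the snippet above ({ data } holding an ImageData) and that human.detect() accepts ImageData as input. makeWorkerHandler, detect and reply are hypothetical names introduced only so the handler can be exercised outside of a worker.

```javascript
// builds a message handler for the worker script
// `detect` and `reply` are injected (hypothetical names, not library API)
function makeWorkerHandler(detect, reply) {
  return async function onMessage(event) {
    const { data } = event.data; // ImageData posted by the main thread
    try {
      const result = await detect(data);
      reply({ result }); // result must be structured-cloneable to cross threads
    } catch (err) {
      reply({ error: String(err) });
    }
  };
}

// inside an actual worker script this could be wired up as (assumption):
// self.onmessage = makeWorkerHandler(
//   (image) => human.detect(image),
//   (msg) => self.postMessage(msg),
// );
```

the main thread would then listen with worker.onmessage (or addEventListener('message', ...)) and draw the returned results.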
On some browsers you can use transferables for zero copy cost https://developer.mozilla.org/en-US/docs/Web/API/Transferable
yeah, but it's a catch-22 - you can only do transferable on a canvas without a context - which means i can't paint it before i transfer it and thus there is nothing to detect inside the worker :(
I mean transfer the image data not the canvas: https://benjaminbenben.com/2013/04/14/webworker-qr/
Transferable exists for ArrayBuffer, MessagePort, ImageBitmap and OffscreenCanvas.
So how do I get data from HTMLVideoElement frame other than placing it on an OffscreenCanvas?
Unfortunately, HTMLVideoElement does not have imageData property.
Without workers, I can read directly from HTMLVideoElement using tf.browser.fromPixels() and avoid canvas for read operations completely - thus the performance difference.
From the link above: worker.postMessage(imagedata, [imagedata.data.buffer]);
I am not sure what happens to the actual imagedata
https://www.kevinhoyt.com/2018/10/31/transferable-imagedata/
That might cut down some processing as it's passing data by reference instead of by value.
Still need intermediary OffscreenCanvas to draw on and get data from it using getImageData()
Should be an improvement, although still some overhead remains - I'll update with results soon.
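A minimal sketch of passing the pixel buffer by reference, assuming the frame has already been read with getImageData(). packFrame and unpackFrame are hypothetical helper names, not part of any library. As for what happens to the original imagedata: once its buffer is listed as a transferable, the buffer is detached on the sending side and imagedata.data reports a length of 0 there.

```javascript
// main-thread side: build the message and its transfer list from an ImageData-like object
function packFrame(imageData) {
  const buffer = imageData.data.buffer; // ArrayBuffer backing the Uint8ClampedArray
  return {
    message: { buffer, width: imageData.width, height: imageData.height },
    transfer: [buffer], // moved to the receiver instead of being copied
  };
}

// worker side: rebuild a pixel view over the received buffer
function unpackFrame(message) {
  const pixels = new Uint8ClampedArray(message.buffer);
  // in a browser worker this could become: new ImageData(pixels, message.width, message.height)
  return { data: pixels, width: message.width, height: message.height };
}

// usage on the main thread (assumes `worker`, `ctx`, `w` and `h` already exist):
// const { message, transfer } = packFrame(ctx.getImageData(0, 0, w, h));
// worker.postMessage(message, transfer);
```

Width and height travel as plain numbers because only the ArrayBuffer itself is transferable, so the receiver needs them to reconstruct the frame.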
It's better, definitely decreases latency from ~22ms to ~15ms. But it's still slower than executing in the main thread without the need for an intermediary canvas or passing messages.
I also did a simple test just drawing an image on a canvas and passing its image buffer by reference to the worker and measuring the round trip (worker does 0 actual work, just to measure the round trip) and it's ~15ms and that's a constant, so:
like i said earlier - it's a tradeoff - slightly lower FPS vs. responsive UI - every user can make their own choice.
thanks, you are awesome, I hope you get some sleep from time to time :)
naah, sleeping is overrated :)
did you notice recently added emotion detection?
not at my age!
I noticed, thanks a lot! we are going to integrate it soon instead of face-api and provide you with feedback
Human is compatible with new web workers, but...web workers are finicky:
- cannot pass HTMLImage or HTMLVideo to web worker, so need to pass canvas instead
- canvases can execute transferControlToOffscreen() and then become OffscreenCanvas which can be passed to worker, but...
  cannot transfer canvas that has a rendering context (basically, first time getContext() is executed on it)
which means that if we pass main canvas that will be used to render results on, then all operations on it must be within webworker and we cannot touch it in the main thread at all.
doable, but...how to paint a video frame on it before we pass it?
so we create new OffscreenCanvas that we drew video frame on and pass its imageData and return results from worker, but then there is an overhead of creating it and passing large messages between main thread and worker - it ends up being slower than executing in the main thread.
Human already executes everything in async/await manner and avoids synchronous operations as much as possible so it doesn't block the main thread, so not sure what is the benefit of web workers (unless main thread is generally a very busy one)?