justadudewhohacks / face-api.js

JavaScript API for face detection and face recognition in the browser and nodejs with tensorflow.js
MIT License

[Intel GPU] - run the example but the result isn't right #43

Open marine008 opened 6 years ago

marine008 commented 6 years ago

I downloaded the source code and ran the example with npm as the readme.md says, but the result isn't right. The detections drawn on the canvas look like the picture below. Why does this happen?

[screenshot: qq 20180708182022]

justadudewhohacks commented 6 years ago

What kind of GPU are you using? I tried to run this on an Intel GPU once and got the exact same picture. Maybe tfjs doesn't support Intel GPUs.

marine008 commented 6 years ago

Thank you. Yes, I use the Intel HD GPU. Maybe I should switch computers.

justadudewhohacks commented 6 years ago

I see, maybe we should also report this to the tfjs team at some point. Unfortunately I only have access to an Intel GPU sporadically, otherwise I could debug step by step which operations are returning inconsistent results.

marine008 commented 6 years ago

I will try to do it, but first I need to know what the consistent results should look like.

thexiroy commented 6 years ago

I had the exact same issue and fixed it by changing the browser's default GPU to my Nvidia GPU.

Nvidia disables the discrete GPU by default for Chrome and Firefox and uses the Intel HD GPU instead. Source: https://superuser.com/questions/645918/how-to-run-google-chrome-with-nvidia-card-optimus
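To confirm which GPU the browser actually picked, you can query the WebGL renderer string via the `WEBGL_debug_renderer_info` extension. A minimal sketch (the `gpuVendorFromRenderer` helper is my own, not part of face-api.js or tfjs):

```javascript
// Classify a WebGL renderer string by vendor (illustrative helper).
function gpuVendorFromRenderer(renderer) {
  const r = renderer.toLowerCase();
  if (r.includes('intel')) return 'intel';
  if (r.includes('nvidia') || r.includes('geforce')) return 'nvidia';
  if (r.includes('amd') || r.includes('radeon')) return 'amd';
  return 'unknown';
}

// Browser-only probe; guarded so the sketch also loads under Node.
if (typeof document !== 'undefined') {
  const gl = document.createElement('canvas').getContext('webgl');
  const ext = gl && gl.getExtension('WEBGL_debug_renderer_info');
  if (gl && ext) {
    const renderer = gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
    console.log(renderer, '->', gpuVendorFromRenderer(renderer));
  }
}
```

If this prints an Intel renderer string on an Optimus laptop, the browser is still running on the integrated GPU.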

xhcao commented 5 years ago

@marine008 @justadudewhohacks @thexiroy Recently I investigated some precision issues on Intel GPUs. I tested this issue on different Intel GPUs, but could not reproduce it. The test platforms were:

- Intel(R) UHD Graphics 630 + Windows 10
- Intel(R) HD Graphics 530 + Windows 10
- Intel(R) HD Graphics 630 + Windows 10
- Intel(R) UHD Graphics 630 + Ubuntu 18.10
- Intel(R) HD Graphics 530 + Ubuntu 17.10
- Intel(R) HD Graphics 630 + Ubuntu 18.04

I want to know which platforms can reproduce this issue. Was it an old GPU on which you saw it? Thank you.

thexiroy commented 5 years ago

@xhcao Intel(R) HD Graphics 4600 + win10

dsmilkov commented 5 years ago

Hi, I'm from the TensorFlow.js team (which face-api.js uses). Can someone let me know if you can reproduce these precision problems using the latest tf@1.0.2? Thank you!

justadudewhohacks commented 5 years ago

The latest version of face-api.js now runs on tfjs-core 1.0.3. The initial issue occurred in the SSD Mobilenet v1 face detector. One can simply verify whether it works now by running the face detection example; the SSD face detector is selected by default. If any Intel GPU user would give that a try, that would be great.
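For anyone trying this outside the bundled example, the verification step boils down to loading the SSD Mobilenet v1 weights and running detection, per the face-api.js README. A hedged sketch ('/models' is an assumed weights path, `inputImg` an assumed element id, and `filterByConfidence` is my own illustrative helper mirroring what the `minConfidence` option does internally):

```javascript
// Pure helper (illustrative): keep only detections above a score threshold.
function filterByConfidence(detections, minConfidence) {
  return detections.filter(d => d.score >= minConfidence);
}

// Browser-only; guarded so the sketch also loads under Node without faceapi.
if (typeof faceapi !== 'undefined' && typeof document !== 'undefined') {
  (async () => {
    // Load the SSD Mobilenet v1 face detector weights.
    await faceapi.nets.ssdMobilenetv1.loadFromUri('/models');
    const input = document.getElementById('inputImg');
    // 0.5 is just an illustrative confidence threshold.
    const detections = await faceapi.detectAllFaces(
      input, new faceapi.SsdMobilenetv1Options({ minConfidence: 0.5 }));
    // On a working GPU the boxes should line up with the faces; the
    // garbled output from this issue would show up here instead.
    console.log(detections.map(d => d.box));
  })();
}
```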

dsmilkov commented 5 years ago

Thanks! We recently got a Lenovo Yoga X1 Windows laptop with integrated Intel HD 520 GPU.

I just cloned the face-api.js repo, ran the examples and couldn't reproduce the problem.

xhcao commented 5 years ago

Thanks. I could not reproduce this issue on the Intel(R) HD Graphics 4600 + Windows 10 platform either. Do you know the root cause?

dsmilkov commented 5 years ago

TF.js went through a lot of changes (packing, better memory layout/indexing) so it's hard to tell what exactly helped with numerical stability, but we do know that the Haswell chipset (Intel Graphics 4600) was the one that had numerical issues. It seems to me that we can close this issue.

justadudewhohacks commented 5 years ago

@dsmilkov I get access to an Intel GPU next week. I will double-check that everything works as expected and, if so, will close this issue.

justadudewhohacks commented 5 years ago

@dsmilkov I could verify that the SSD Mobilenet model, which was the subject of this issue, now works as expected on an Intel GPU.

I noticed some precision differences between AMD and Intel GPUs though, which can be seen in the output of the landmark detection model:

AMD:

[screenshot: landmarks-amd-gpu]

Intel:

[screenshot: landmarks-intel-gpu]

Not sure if this info helps, but apart from an initial regular tf.conv2d, the model is composed of depthwise separable convolutions, followed by a fully connected layer (tf.matMul) at the end.
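As a rough orientation for anyone localizing the drift, the described stack downsamples spatially at each strided conv. A minimal sketch of the shape arithmetic, using tfjs's 'same' / 'valid' padding rules (the input size, kernel sizes, and strides below are made-up illustrations, not the actual landmark-net hyperparameters):

```javascript
// Output spatial size of a convolution under tfjs padding semantics:
// 'same'  -> ceil(in / stride), 'valid' -> floor((in - kernel) / stride) + 1.
function convOutSize(inSize, kernel, stride, pad) {
  if (pad === 'same') return Math.ceil(inSize / stride);
  return Math.floor((inSize - kernel) / stride) + 1; // 'valid'
}

// A hypothetical 112x112 input through an initial strided conv and two
// depthwise separable blocks (each with a stride-2 depthwise conv):
let size = 112;
size = convOutSize(size, 3, 2, 'same'); // initial tf.conv2d -> 56
size = convOutSize(size, 3, 2, 'same'); // separable block 1 -> 28
size = convOutSize(size, 3, 2, 'same'); // separable block 2 -> 14
console.log(size); // 14
```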

dsmilkov commented 5 years ago

Thanks. That's good feedback. If you find some extra time, I would love it if you could diff the outputs after each internal operation (the activations) between the AMD and the Intel GPU. I'm curious to see whether the differences start to occur after a specific op, or slowly drift across multiple ops.
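The per-op diff suggested here could be sketched as follows. This is a minimal sketch: it assumes the activations for each op have already been captured on both GPUs (e.g. by reading each intermediate tensor back with `tensor.data()`) into `{ name, amd, intel }` records, and both helper names are my own:

```javascript
// Largest element-wise absolute difference between two equal-length
// activation arrays (plain arrays or Float32Arrays).
function maxAbsDiff(a, b) {
  let max = 0;
  for (let i = 0; i < a.length; i++) {
    const d = Math.abs(a[i] - b[i]);
    if (d > max) max = d;
  }
  return max;
}

// Walk the ops in execution order and return the name of the first op
// whose outputs diverge beyond a tolerance, or null if none do. This
// distinguishes "diverges at one specific op" from "drifts gradually".
function firstDivergingOp(ops, tol) {
  for (const op of ops) {
    if (maxAbsDiff(op.amd, op.intel) > tol) return op.name;
  }
  return null;
}
```

Logging `maxAbsDiff` for every op (rather than stopping at the first) would also show whether the error grows slowly across many ops.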