marine008 opened this issue 6 years ago
What kind of GPU are you using? I tried to run this on an Intel GPU once and got the exact same picture. Maybe tfjs doesn't support Intel GPUs.
Thank you. Yes, I use the Intel HD GPU. Maybe I should change the computer.
I see, maybe we should also report this to the tfjs team at some point. Unfortunately I only have access to an Intel GPU sporadically, otherwise I could try to debug, step by step, which operations are returning inconsistent results.
I will try to do it, but I would first need to know what the consistent results should look like.
I had the exact same issue and fixed it by changing the default GPU for the browser to my Nvidia GPU.
On Optimus laptops, Nvidia has decided to disable the dedicated GPU by default for Chrome and Firefox and to use the Intel HD GPU instead. Source: https://superuser.com/questions/645918/how-to-run-google-chrome-with-nvidia-card-optimus
@marine008 @justadudewhohacks @thexiroy Recently I investigated some precision issues on Intel GPUs. I tested this issue on different Intel GPUs, but could not reproduce it. The testing platforms were:
- Intel(R) UHD Graphics 630 + Windows 10
- Intel(R) HD Graphics 530 + Windows 10
- Intel(R) HD Graphics 630 + Windows 10
- Intel(R) UHD Graphics 630 + Ubuntu 18.10
- Intel(R) HD Graphics 530 + Ubuntu 17.10
- Intel(R) HD Graphics 630 + Ubuntu 18.04

I would like to know which platform can reproduce this issue. Was it an old GPU on which you reproduced it? Thank you.
@xhcao Intel(R) HD Graphics 4600 + Windows 10
Hi, I'm from the TensorFlow.js team (which face-api.js uses). Can someone let me know if you can reproduce these precision problems using the latest tf@1.0.2? Thank you!
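For anyone reporting back, here is a quick sketch of how one could confirm which tfjs-core version and backend is actually in use (assuming the app imports @tensorflow/tfjs-core directly):

```ts
import * as tf from '@tensorflow/tfjs-core';

// Log the tfjs-core version and the active backend ('webgl' vs 'cpu'),
// so a report of the precision issue can be tied to a concrete setup.
console.log('tfjs-core version:', tf.version_core);
console.log('active backend:', tf.getBackend());
```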
The latest version of face-api.js now runs on tfjs-core 1.0.3. The initial issue occurred in the SSD MobileNet v1 face detector. One can simply verify whether it's working now by running the face detection example; the SSD face detector is selected by default. So if any Intel GPU user would give that a try, that would be nice.
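For anyone who wants to try it outside the bundled example page, a minimal sketch of such a check could look like this (the '/weights' path and the 'inputImg' element are placeholders for whatever your page uses):

```ts
import * as faceapi from 'face-api.js';

async function checkSsdDetector() {
  // Load the SSD MobileNet v1 weights (the detector this issue is about).
  await faceapi.nets.ssdMobilenetv1.loadFromUri('/weights');

  const input = document.getElementById('inputImg') as HTMLImageElement;

  // With no options passed, detectAllFaces defaults to the SSD MobileNet v1 detector.
  const detections = await faceapi.detectAllFaces(input);

  // On an affected GPU the scores/boxes come out garbled; on a healthy setup
  // you should see plausible boxes and scores close to 1 for clear faces.
  detections.forEach(d => console.log(d.score, d.box));
}

checkSsdDetector();
```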
Thanks! We recently got a Lenovo Yoga X1 Windows laptop with an integrated Intel HD 520 GPU.
I just cloned the face-api.js repo, ran the examples and couldn't reproduce the problem.
Thanks. I could not reproduce this issue on the Intel(R) HD Graphics 4600 + Windows 10 platform. Do you know the root cause?
TF.js went through a lot of changes (packing, better memory layout/indexing), so it's hard to tell what exactly helped with numerical stability, but we do know that the Haswell chipset (Intel HD Graphics 4600) was the one that had numerical issues. It seems to me that we can close this issue.
@dsmilkov I will get access to an Intel GPU next week. I will double check whether everything works as expected and, if so, will close this issue.
@dsmilkov I could verify that the SSD MobileNet model, which was the subject of this issue, now works as expected on an Intel GPU.
I noticed some precision differences between AMD and Intel GPUs though, which can be seen in the output of the landmark detection model:
Not sure if this info helps, but apart from an initial regular tf.conv2d, the model is composed of depthwise separable convolutions, followed by a fully connected layer (tf.matMul) at the end.
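For reference, a very rough sketch of that structure in tfjs ops is below; the filter shapes and layer count here are made-up placeholders, not the actual landmark model:

```ts
import * as tf from '@tensorflow/tfjs-core';

// Rough shape of the net as described above: one regular conv, a stack of
// depthwise separable convs, then a fully connected layer via matMul.
// All filters/weights are assumed to be loaded elsewhere; shapes are illustrative.
function landmarkNetSketch(
  input: tf.Tensor4D,
  params: {
    conv0Filter: tf.Tensor4D;
    depthwiseFilters: tf.Tensor4D[];
    pointwiseFilters: tf.Tensor4D[];
    fcWeights: tf.Tensor2D;
  }
): tf.Tensor2D {
  // initial regular convolution
  let out = tf.relu(tf.conv2d(input, params.conv0Filter, [2, 2], 'same'));

  // depthwise separable convolutions (depthwise + pointwise in a single op)
  params.depthwiseFilters.forEach((dw, i) => {
    out = tf.relu(
      tf.separableConv2d(out, dw, params.pointwiseFilters[i], [1, 1], 'same')
    );
  });

  // flatten and apply the fully connected layer
  const flat = out.reshape<tf.Tensor2D>([out.shape[0], -1]);
  return tf.matMul(flat, params.fcWeights);
}
```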
Thanks, that's good feedback. If you find some extra time, I would love it if you could diff the outputs after each internal operation (activations) between the AMD and the Intel GPU. I'm curious to see whether the differences start to occur after a specific op, or whether they slowly drift over multiple different ops.
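If it helps, here is a rough sketch of one way to do that comparison, assuming the per-op activations have been dumped to flat arrays on each machine (the dump format and names are hypothetical):

```ts
import * as tf from '@tensorflow/tfjs-core';

// Hypothetical dump format: for each op, the flattened activation values
// captured on one machine (e.g. via `await tensor.data()` after each op).
interface ActivationDump {
  opName: string;
  values: Float32Array;
}

// Print, per op, the maximum absolute difference between the two runs, to see
// whether the divergence starts at a specific op or drifts slowly across ops.
function diffActivations(amdRun: ActivationDump[], intelRun: ActivationDump[]): void {
  amdRun.forEach((amd, i) => {
    const intel = intelRun[i];
    const maxAbsDiff = tf.tidy(() =>
      tf.max(tf.abs(tf.sub(tf.tensor1d(amd.values), tf.tensor1d(intel.values)))).dataSync()[0]
    );
    console.log(`${amd.opName}: max abs diff = ${maxAbsDiff}`);
  });
}
```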
I downloaded the source code and set up the example with npm as the readme.md said, but the result isn't right. The detections drawn on the canvas look like the picture. Why does this happen?