chihfanhsu / gaze_correction

Correcting gaze by warping-based convolutional neural network in live video communication

Lag/program doesn't work properly #7

Closed johnnin closed 4 years ago

johnnin commented 4 years ago

Very cool project! I changed the config file to match my screen and webcam position, and I managed to run the focal length calibration and the program itself (regz_socket_MP_FD). But when I press r (while looking at the remote window) to calibrate my gaze, the remote window starts to lag. You say this is normal for a few seconds, but for me it stays like this forever until I press q to terminate. I've installed every required package (all the same versions as yours), and the same versions of Python, TensorFlow and CUDA (with cuDNN). Do you have any clue where the problem is? Thanks

johnnin commented 4 years ago

Also: does TensorFlow have to be the CPU or the GPU version? I've installed both of them (1.8.0). Are CUDA and cuDNN required for it to work? The first time I started the program I didn't have them and it worked (still with lag), and even after I installed them it still lags. I couldn't find numpy 1.15.4 + mkl for Python 3.5.3, just numpy 1.15.4 without mkl; maybe that's the cause? If so, do you know where I could find the full package?

chihfanhsu commented 4 years ago

I use the GPU version of TensorFlow for synthesizing the eye images, so CUDA and cuDNN are required. As for the lag, in my experience it is caused by the slow first start of TensorFlow (GPU version). Then the lag propagates to the transmission of the UDP packets. A simple workaround is to press r again to stop synthesizing. Once the lag disappears (this sometimes takes a few seconds), press r again and it should be fine.

You can download numpy+mkl at this URL. I am not sure whether this package is one reason for the lag, because it is a requirement of my TensorFlow installation: without it, I get an error message when I run the GPU version of TensorFlow.

johnnin commented 4 years ago

Thanks for your quick response. I've downloaded and installed numpy 1.16.6 + mkl, since I didn't find the 1.15.4 version at the link you kindly sent me. Then I uninstalled TensorFlow (CPU), keeping only the GPU version, but when I tried to launch the program it gave me an error saying that tensorflow was missing, so I reinstalled it. I tried pressing r again and waiting as you suggested, but after the "lag" stops and I press r again, I still get the same result. Let me be more specific, because "lag" is very generic: when I press the key, (I think) the program takes an initial photo of me, and when I start moving it tries to match it to my current movement. Failing that somehow, it shows the initial snapshot and alternates it with the current webcam video input. So the order is: initial photo -> current video -> initial photo -> current video.

chihfanhsu commented 4 years ago

I think the version of numpy+mkl should be fine. If you really care about the version, you can download the version I used at this URL. Have you checked that the GPU works properly? You may try this method: URL. I think the lag is caused by your system still using the CPU to synthesize the eye images.
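One way to run the GPU check suggested above is sketched below. This is not necessarily the method behind the link; it assumes only that TensorFlow is importable in the same environment that runs the program. gpu_device_names is a hypothetical helper name, while device_lib.list_local_devices is TensorFlow's own device listing call.

```python
def gpu_device_names():
    """Return the names of the GPUs TensorFlow can see.

    Returns None when TensorFlow is not installed in this environment.
    gpu_device_names is a hypothetical helper, not part of the repository.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    # Each entry describes one device; keep only those of type "GPU".
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


names = gpu_device_names()
if names is None:
    print("TensorFlow is not installed in this environment")
elif names:
    print("TensorFlow sees these GPUs:", names)
else:
    print("TensorFlow is installed but sees no GPU, so it will run on the CPU")
```

An empty list would be consistent with the diagnosis above, that the eye images are still being synthesized on the CPU.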

johnnin commented 4 years ago

I think you're right, I may have to switch to the GPU. I'll try it out tomorrow and hope it works. Thanks a lot for now.

johnnin commented 4 years ago

I ended up installing Anaconda with Python 3.5, and I installed TensorFlow (only the GPU version this time) along with all the other packages. Now when I start the program there is the initial lag, and when I press r again the remote window finally runs smoothly without lag. But the program doesn't seem to be working: I fix my gaze on the remote window and press r, and I get the eye, alpha and r_w info in the upper right of the screen, but then nothing happens; the local and remote windows look the same. Here's what I get:

Namespace(P_IDP=6.3, P_c_x=0, P_c_y=-15, P_c_z=-7, S_H=45, S_W=80, agl_dim=2, channel=3, ef_dim=12, encoded_agl_dim=16, f=560, height=48, mod='flx', record_time=False, recver_port=5005, sender_port=5005, tar_ip='localhost', uid='local', weight_set='weights', width=64)
(1366, 768)
Namespace(P_IDP=6.3, P_c_x=0, P_c_y=-15, P_c_z=-7, S_H=45, S_W=80, agl_dim=2, channel=3, ef_dim=12, encoded_agl_dim=16, f=560, height=48, mod='flx', record_time=False, recver_port=5005, sender_port=5005, tar_ip='localhost', uid='local', weight_set='weights', width=64)
(1366, 768)
Socket created
Socket now listening
Namespace(P_IDP=6.3, P_c_x=0, P_c_y=-15, P_c_z=-7, S_H=45, S_W=80, agl_dim=2, channel=3, ef_dim=12, encoded_agl_dim=16, f=560, height=48, mod='flx', record_time=False, recver_port=5005, sender_port=5005, tar_ip='localhost', uid='local', weight_set='weights', width=64)
(1366, 768)
Loading model of [L] eye to GPU
2020-06-24 14:07:55.521436: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-06-24 14:07:55.972708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce GTX 750 Ti major: 5 minor: 0 memoryClockRate(GHz): 1.137
pciBusID: 0000:23:00.0
totalMemory: 2.00GiB freeMemory: 1.67GiB
2020-06-24 14:07:55.979239: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
payload_size: 4
2020-06-24 14:07:56.967862: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-24 14:07:56.973606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2020-06-24 14:07:56.977344: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2020-06-24 14:07:56.981471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1424 MB memory) -> physical GPU (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:23:00.0, compute capability: 5.0)
Loading model of [R] eye to GPU
2020-06-24 14:07:58.644839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2020-06-24 14:07:58.649484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-24 14:07:58.655176: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2020-06-24 14:07:58.658858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2020-06-24 14:07:58.662978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1424 MB memory) -> physical GPU (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:23:00.0, compute capability: 5.0)
A:\Users\John\Downloads\gaze_correction-master\gaze_correction_system\regz_socket_MP_FD.py:109: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
if frame == 'stop':
stop
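A side note on the FutureWarning at the end of the log: NumPy emits this message when an array is compared with a string using ==, which suggests that frame usually holds an image array and only occasionally the string 'stop'. It is most likely harmless here. A minimal sketch of a check that avoids the warning, assuming the variable can hold either a frame or the sentinel string; is_stop_signal is a hypothetical helper, not part of the repository:

```python
def is_stop_signal(item):
    """Return True only when the item is the literal string 'stop'.

    Checking the type first means an image array is never compared
    elementwise against a string.
    """
    return isinstance(item, str) and item == "stop"


# Stand-in for a video frame; in the real program this would be a NumPy array.
fake_frame = [[0, 0, 0], [255, 255, 255]]

print(is_stop_signal("stop"))      # True
print(is_stop_signal(fake_frame))  # False
```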

chihfanhsu commented 4 years ago

The GPU works properly, and the weights of the left eye and right eye are loaded successfully. If the parameters alpha and r_w are shown on screen, the system works. If these parameters are all zeros, it means the eyes were not detected by Dlib. If the camera is placed very close to the screen center, sometimes the eye images won't be changed.

Your camera position is set to P_c_x=0 cm (horizontal), P_c_y=-15 cm (vertical), P_c_z=-7 cm (perpendicular to the screen surface), which places it in front of a screen 80 cm wide and 45 cm high (you have a very large screen). I think this is the reason. Even so, some differences between the synthesized and original images should still be observable. I suggest pressing r rapidly to compare the synthesized and original images. To see a larger difference, you can change P_c_y from -15 cm to -30 or -25 and P_c_z from -7 cm to 0. However, the adjustment angle (alpha) will then be incorrect.

johnnin commented 4 years ago

Okay, I tried moving the webcam only horizontally and it finally seems to work now, even though I don't know why the right eye gaze goes a little too far when I press r (the left eye gaze redirection is good). I tried changing the P_IDP value a little (6, 6.3, 6.5, 6.7) but it doesn't seem to make a difference. Maybe I'll try again with a smaller screen xD. Still a great program btw

chihfanhsu commented 4 years ago

You can try different system parameters to slightly fix the right eye. Besides, the redirection model still has room for improvement in several places. For now, the left eye and right eye are synthesized with separate models that were trained independently. Ideally, I think joint training would perform better.

johnnin commented 4 years ago

I tried different parameters as you suggested and now it works very well! I was wondering if there is a way to keep the gaze centered on the webcam no matter which direction I look (example: I look to the left -> gaze redirected to the center, I look to the right -> gaze still redirected to the center). I don't know if that's a hard request or if it's just a matter of adjusting some parameters in the py file, but if you have any ideas I would appreciate it. For now, thank you so much for the help

chihfanhsu commented 4 years ago

Yes, the gaze could be fixed so that it always looks at the camera. In this architecture, that needs an additional model for detecting the current gaze direction. I think adding an eye tracker would be better in practice. However, if the original gaze is directed at a position far from the camera, distortion becomes visible in the synthesized images because of the large adjustment angle. Using a GAN to synthesize the eye images might be another solution.

johnnin commented 4 years ago

By an additional model, do you mean another class (like an eye tracker) in the regz_socket_MP_FD.py file that detects the current gaze direction? And then, to keep the gaze always centered, should I modify something in your class gaze_redirection_system?

chihfanhsu commented 4 years ago

Yes, the modification is needed in the estimation of alpha, namely in the function shifting_angles_estimator.

johnnin commented 4 years ago

Thanks. Then, after making the eye detector class and getting the coordinates of the left and right eye, what kind of operation do you think I have to do in the "calculate alpha" def in order to keep the gaze centered? Sorry for asking so many questions, but I'm just beginning to learn Python

chihfanhsu commented 4 years ago

Your problem is not related to Python but to the algorithm. I suggest you read the paper first :P

To briefly summarize the paper: the current system assumes that people only want to make eye contact when they gaze at the face shown on the screen. Therefore, the gaze is redirected from the face shown on the screen to the camera.

Your question is more like a different conversation scenario: you want to redirect the gaze to the camera from anywhere. To do so, you need to detect the current gaze direction in real time. Once the direction is detected, the intersection of the gaze direction and the screen surface can be calculated. Then the correction angle, alpha, can be estimated with trigonometric functions to redirect the gaze from anywhere to the camera.
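The geometry described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's shifting_angles_estimator: it assumes the screen is the plane z = 0, the viewer sits at z > 0, all positions are in cm in one shared coordinate system, and the eye position and gaze direction come from some external eye tracker. All function names are hypothetical.

```python
import math


def gaze_screen_intersection(eye, direction):
    """Intersect the gaze ray with the screen plane z = 0 (assumed convention)."""
    ex, ey, ez = eye
    dx, dy, dz = direction
    if dz >= 0:
        return None  # the gaze points away from, or parallel to, the screen
    t = -ez / dz  # ray parameter at which z reaches 0
    return (ex + t * dx, ey + t * dy, 0.0)


def direction_angles(eye, point):
    """Return (pitch, yaw) in degrees of the line from the eye to a point."""
    vx, vy, vz = (p - e for p, e in zip(point, eye))
    yaw = math.degrees(math.atan2(vx, -vz))                   # left/right
    pitch = math.degrees(math.atan2(vy, math.hypot(vx, vz)))  # up/down
    return pitch, yaw


def correction_angles(eye, direction, camera):
    """Angles that rotate the gaze from its on-screen target to the camera."""
    target = gaze_screen_intersection(eye, direction)
    if target is None:
        return None
    t_pitch, t_yaw = direction_angles(eye, target)
    c_pitch, c_yaw = direction_angles(eye, camera)
    return c_pitch - t_pitch, c_yaw - t_yaw


# Eye 60 cm in front of the screen center, looking straight at the screen,
# with a (hypothetical) camera 60 cm above the screen center.
print(correction_angles(eye=(0, 0, 60), direction=(0, 0, -1), camera=(0, 60, 0)))
```

As noted earlier in the thread, large correction angles like the one in this toy example are where distortion in the synthesized eye images would be expected to become visible.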

johnnin commented 4 years ago

Yes, you're right, I'll read the paper. Btw, mine was just curiosity. Sorry for all the trouble and thank you so much for the help

chihfanhsu commented 4 years ago

Great discussion! I am enjoying it:)