khan9048 / Facial_depth_estimation

A large-scale, high-quality synthetic facial depth dataset and detailed deep-learning-based monocular depth estimation from a single input image.
64 stars · 8 forks

Depth estimation results vary too much w.r.t the image resolution #2

Open QingXIA233 opened 2 years ago

QingXIA233 commented 2 years ago

Hello, impressive work indeed! 👏 I used live_demo.py to estimate the face depth of some images. With a high-resolution image, I got a really great result: [16] [16_depth]

However, when I tried an image with 256x256 resolution, the result was not very precise: [testimg_21] [img_21_depth]

I tried other images at 256x256 resolution, and all of them produced pretty poor results. I wonder whether this model only works on high-resolution images, or whether there is a solution to the problem I encountered. I'd be very grateful if anyone could give me a hint on this.

khan9048 commented 2 years ago

Hi There,

Thanks for using our work. I am not sure what the exact problem is, but since the models are trained entirely on synthetic data, generalization to real images is always an issue for such models, especially for depth estimation. I saw the same problem when testing on low-resolution images, although the results were comparatively better than those of the other methods we used in our work.

You may use higher-resolution images if that works better in your case.
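If it helps, a quick way to test the resolution hypothesis is to upsample the low-resolution crop before running inference. A minimal sketch (the helper name is mine, not from the repo; nearest-neighbour via NumPy is used here to stay dependency-free, though cv2.resize with INTER_CUBIC would give smoother results):

```python
import numpy as np

def upscale_nearest(img, factor=2):
    # Nearest-neighbour upsampling: repeat each pixel `factor` times
    # along both spatial axes. A quick check of whether the model
    # behaves better on a larger input; for real use, prefer an
    # interpolating resizer (e.g. cv2.resize with INTER_CUBIC).
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

# Example: a 256x256 RGB image becomes 512x512 before being fed
# to the depth model.
small = np.zeros((256, 256, 3), dtype=np.uint8)
large = upscale_nearest(small)
print(large.shape)  # (512, 512, 3)
```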

Regards, Faisal



QingXIA233 commented 2 years ago

Hi, thank you very much for replying. Yes, I also suspected there may be a gap between synthetic data and real-world images. I will apply face enhancement to increase the resolution of my images and then try the model again. However, I still have one more question about using the depth prediction (the numpy array `prediction` produced by running live_demo.py):

Saving with cv2.imwrite and plt.imsave gives different results (left: cv2.imwrite, right: plt.imsave): [cv_img_27_depth] [img_27_depth]

The depth values differ, too. Printing `prediction` gives an upper-left value of 72 and a center value of 127, but loading the saved depth image with plt.imread gives an upper-left value of 208 and a center value of 26. If I want to use the depth information together with pixel coordinates and the camera intrinsics to compute coordinates in the camera frame, which of these depth values should I use? The result saved by plt looks more reasonable to me, but both images come from the same numpy array `prediction`, so I am a little confused here. Could you please give me a clue? Thank you.
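The discrepancy described above is consistent with how the two savers treat a float array: plt.imsave rescales the array over its own min/max and applies a colormap before writing, while cv2.imwrite casts the raw values to uint8 as-is. A minimal sketch of the normalization step (the helper name is hypothetical):

```python
import numpy as np

def normalize_like_plt(depth):
    # plt.imsave maps the array onto [0, 1] using its own min/max
    # (then applies a colormap); cv2.imwrite writes the raw values.
    # Pixel values read back from either file are display values,
    # not the model's depths.
    d = depth.astype(np.float64)
    return (d - d.min()) / (d.max() - d.min())

depth = np.array([[72.0, 127.0],
                  [10.0, 200.0]])
print(normalize_like_plt(depth))
```

For geometry with camera intrinsics, the safest choice is to keep the raw `prediction` array in memory (or save it losslessly, e.g. with np.save) rather than reloading values from a rendered image.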

khan9048 commented 2 years ago

I think the plt results would be fine for that...
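For the back-projection step itself, the standard pinhole model applies whichever depth source is chosen; a minimal sketch (fx, fy, cx, cy are assumed intrinsics, and z should be a metric depth value rather than a colormapped pixel value):

```python
import numpy as np

def backproject(u, v, z, fx, fy, cx, cy):
    # Pinhole camera model: pixel (u, v) at depth z maps to the
    # camera-frame point (x, y, z).
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# The principal point back-projects onto the optical axis:
print(backproject(320, 240, 2.0, 500.0, 500.0, 320.0, 240.0))  # [0. 0. 2.]
```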
