andrewhou1 / GeomConsistentFR

Official Code for Face Relighting with Geometrically Consistent Shadows (CVPR 2022)
https://openaccess.thecvf.com/content/CVPR2022/html/Hou_Face_Relighting_With_Geometrically_Consistent_Shadows_CVPR_2022_paper.html
MIT License

About the baseline in the paper and the qualitative results. #9

Closed XezXey closed 1 year ago

XezXey commented 1 year ago

I would like to ask about the baselines in the paper and the qualitative results.

  1. Nestmeyer et al. (Learning Physics-guided Face Relighting under Directional Light) and SIPR

    • How did you obtain the scores on the Multi-PIE dataset for these baselines? Did you reimplement them yourselves or ask the authors for their code? I couldn't find their GitHub pages.
  2. SfsNet and DPR

    • Did you use the pre-trained models from their GitHub pages for evaluation?
  3. Is the pre-trained model in this GitHub repo the same model used for the results reported in the paper? And is the same true for your CVPR 2021 work (Towards High Fidelity Face Relighting with Realistic Shadows)?

Thank you very much

andrewhou1 commented 1 year ago

Sure, these are indeed very relevant questions.

  1. In both cases, we emailed the first author. For Nestmeyer et al., the author provided us with their code and I ran it myself. For SIPR, the first author was willing to run our images through their model, although they were not able to share the model directly.
  2. For DPR, yes. For SfSNet, you can probably also use the pre-trained model, although I personally emailed the first author and they handed me a PyTorch implementation of the same model, which is much easier to use.
  3. Yes for both.
XezXey commented 1 year ago

Thank you for the very quick answer. This is very useful. I will email them for more information.

XezXey commented 1 year ago

Hello, I would like to ask for some more information about the testing procedure and the test set.

  1. In the paper, the evaluation is done using M_depth to mask out the non-face region. Does this correspond to the variable depth in this line of code? https://github.com/andrewhou1/GeomConsistentFR/blob/5448302eab8d3ad01ea49897f734c65744c64e4a/test_relight_single_image_lighting_transfer.py#L543
  2. Is this the same as using face parsing?
  3. How do you convert the depth map into a binary mask for masking out the face? (Is it done by thresholding where depth > 0? A small sketch of what I mean is below.)
  4. For the test set, is it possible to share the test set you used in the paper, or could you release the indices or references to it?
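
For question 3, this is roughly what I mean by thresholding (just a rough sketch; the array shapes and dtypes are my assumption, not taken from your code):

```python
import numpy as np

# Rough sketch of "thresholding": mark every pixel with positive predicted
# depth as face. (Shapes/dtypes here are my assumption.)
def depth_to_mask(depth: np.ndarray) -> np.ndarray:
    """depth: (H, W) depth map; returns a {0, 1} float mask of the same shape."""
    return (depth > 0).astype(np.float32)
```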

Thank you in advance

andrewhou1 commented 1 year ago

Hello again! So for the mask, using face parsing is fine. Just be sure to include only the skin (we do not model hair or the background). For our results, we ended up using face masks derived from face parsing instead of depth masks, since the depth masks leave out part of the forehead. You'll notice this as well if you follow our code all the way through.

For the test set, unfortunately we cannot share Multi-PIE with you due to its license. You will need to request it on your own. However, if you're able to acquire it, I'd be happy to share the image IDs that I used for testing.
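
For concreteness, a minimal sketch of building a skin-only mask from a face parsing map and computing a masked error over it (the skin label index below is just an example and depends on which face parsing model you use):

```python
import numpy as np

SKIN_LABEL = 1  # example only: the skin label index depends on your face parsing model

def skin_mask(parsing: np.ndarray) -> np.ndarray:
    """parsing: (H, W) integer label map; returns a {0, 1} float skin mask."""
    return (parsing == SKIN_LABEL).astype(np.float32)

def masked_mse(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> float:
    """pred, gt: (H, W, C) float images; mask: (H, W) binary mask."""
    sq_err = (pred - gt) ** 2 * mask[..., None]
    return float(sq_err.sum() / (mask.sum() * pred.shape[-1] + 1e-8))
```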

XezXey commented 1 year ago

Hi! Thank you for your answer; it's now clear to me that face parsing was used for evaluation. I'd like to ask another question: how did you request the Multi-PIE dataset? I've tried reaching their site and email, but I got no response. Could you point me in the right direction? Thank you in advance!

andrewhou1 commented 1 year ago

Hi, you can try using this link to order the dataset: https://cmu.flintbox.com/technologies/67027840-27d5-4570-86dd-ad4715ef3c09

Hope this works for you!

XezXey commented 1 year ago

Thank you very much. I've contacted them via your link and am waiting for their response. Could you share the image IDs for the validation and test sets you used in the evaluation? I also have a question about the testing resolution: since SfSNet outputs at 128x128, did you downsample your method's 256x256 output for evaluation, and did you do the same for the other baselines?

andrewhou1 commented 1 year ago

For any methods like SfSNet that output a different resolution, I resized their predictions to 256x256 for evaluation. I've attached the image IDs for both the input images and the groundtruth relit images that I used for my target lighting experiment (Table 2 in the paper): target_light_groundtruth.txt target_light_input.txt
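
For reference, the resizing step is roughly the following (a sketch assuming OpenCV and bilinear interpolation; the interpolation choice here is just an illustration, not necessarily what produced the paper numbers):

```python
import cv2

def resize_prediction(pred_128, size=(256, 256)):
    """Upsample a lower-resolution prediction (e.g. 128x128) before computing metrics."""
    return cv2.resize(pred_128, size, interpolation=cv2.INTER_LINEAR)
```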

XezXey commented 1 year ago

Thank you very much! I would also like to ask: does the lighting transfer experiment (Table 3 in the paper) use the same subjects, or are the subjects different? Could you also release those subject IDs, in case each experiment used different subject IDs?

andrewhou1 commented 1 year ago

Hi there, so for the lighting transfer images there are actually 3 sets of images: input images, reference images, and groundtruth images. The lighting needs to be estimated from the reference image first and then applied to the input image to try to produce the groundtruth image. All 3 image sets were randomly sampled, and the only constraint is that the input image and groundtruth image should be the same subject under different lightings. The reference image can be any subject that is different from the input/groundtruth subject and must have the same lighting as the groundtruth image.

It seems, though, that when randomly sampling the reference images I named them the same as the input images for sorting convenience, so it might be easier if you simply resample a new lighting transfer dataset instead. I'm currently quite occupied with an upcoming deadline and don't really have time to figure out the reference image IDs. However, the lighting transfer dataset I generated was the same size as the target lighting experiment in terms of the number of image triplets, with the same subject distribution as well.
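
If you do resample, a rough sketch of triplet sampling under those constraints could look like this (the metadata layout below is an assumption; adapt it to however you index Multi-PIE):

```python
import random

def sample_triplet(images, rng=random):
    """images: list of dicts like {'path': ..., 'subject': ..., 'light': ...} (assumed layout).

    Returns (input, reference, groundtruth) where input and groundtruth share a
    subject but differ in lighting, and the reference is a different subject
    with the same lighting as the groundtruth.
    """
    while True:
        inp = rng.choice(images)
        # groundtruth: same subject as the input, different lighting
        gt_pool = [im for im in images
                   if im['subject'] == inp['subject'] and im['light'] != inp['light']]
        if not gt_pool:
            continue
        gt = rng.choice(gt_pool)
        # reference: different subject, same lighting as the groundtruth
        ref_pool = [im for im in images
                    if im['subject'] != inp['subject'] and im['light'] == gt['light']]
        if not ref_pool:
            continue
        ref = rng.choice(ref_pool)
        return inp, ref, gt
```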

XezXey commented 1 year ago

Thank you for the information!