yfeng95 / DECA

DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)

Problem with texture alignment #29

Open fefogarcia opened 3 years ago

fefogarcia commented 3 years ago

Hi folks, thank you for this mind-blowing model!!

I've been trying to use the reconstruction demo and while the geometry reconstruction works like magic, the texture is always misaligned, especially around the eyes. At first I thought it was an issue with the custom image I was using but then when I attempted one of the sample images provided I had the same result.

I then thought it might have been related to the low-poly, pre-displacement OBJ I was using, but after fumbling around a bit with the normal map with no luck, I opened the high-detail OBJ to check the vertex colors, and it turns out they were misaligned in the same way.

Have you guys seen this problem before? Maybe there's something in the MAT file I should be applying? Thank you in advance for any light you can shine on the issue!

[image]

vladyslavmos commented 3 years ago

Same issue. Does anyone know how to fix it?

Dian-Yi commented 3 years ago

Same issue. Does anyone know how to fix it?

mosvlad commented 3 years ago

As I understand it, the problem is as follows.

For face alignment this project uses FAN, and FAN is not very good at detecting narrowed eyes - issue

[image]

For example:

[image]

[image]

The module for detecting facial landmarks is here.

Does anyone know a more robust project for detecting facial landmarks? Please share it, and we can implement it in this project.
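To reproduce the failure outside DECA, here is a minimal sketch (an assumption on my part, not code from this repo) that runs FAN through the pip face-alignment package and draws its 68 landmarks:

import cv2
import face_alignment

# 2D FAN model; newer versions of the package spell the enum
# LandmarksType.TWO_D instead of LandmarksType._2D
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False)

image = cv2.imread("test.jpg")
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # FAN expects RGB input

preds = fa.get_landmarks(rgb)  # list of (68, 2) arrays, one per detected face
if preds is not None:
    for (x, y) in preds[0]:
        cv2.circle(image, (int(x), int(y)), 1, (0, 0, 255), -1)
cv2.imwrite("fan_landmarks.jpg", image)

On narrowed eyes you can watch the eye points (indices 36-47) drift off the eyelids.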

fefogarcia commented 3 years ago

Hi @mosvlad, that is great insight! Now that you've said that, I took a closer look at the README example that uses the same photo as mine, and the eyes are indeed misaligned – this whole time I had assumed they were correctly placed.

Example from README

Re: landmark detection, I have used MTCNN with good results when running Microsoft's Deep 3D Facial Reconstruction model. Maybe it could become an option here to run MTCNN as an intermediate step to generate the landmarks before running DECA? A sketch of that detection step follows.
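For reference, a hedged sketch of that intermediate step, assuming the facenet-pytorch implementation of MTCNN (the MS repo bundles its own TensorFlow MTCNN). Note that MTCNN returns only 5 landmarks, which Deep3DFaceReconstruction uses to align the crop before its 68-point detector runs:

from facenet_pytorch import MTCNN
from PIL import Image

# MTCNN face detector; keep_all=False returns only the most confident face
mtcnn = MTCNN(keep_all=False)

img = Image.open("test.jpg")
boxes, probs, landmarks = mtcnn.detect(img, landmarks=True)
# landmarks has shape (n_faces, 5, 2): left eye, right eye, nose,
# left mouth corner, right mouth corner
print(boxes[0], landmarks[0])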

mosvlad commented 3 years ago

I think I can implement the model from Deep3DFaceReconstruction for detecting facial landmarks. The code looks simple to use.

Code for facial landmark detection from microsoft/Deep3DFaceReconstruction:

Using the model:

import numpy as np

# detect 68 face landmarks for aligned images
def get_68landmark(img, detector, sess):
    # fetch the input/output tensors of the frozen landmark graph
    input_img = detector.get_tensor_by_name('input_imgs:0')
    lm = detector.get_tensor_by_name('landmark:0')

    landmark = sess.run(lm, feed_dict={input_img: img})
    landmark = np.reshape(landmark, [68, 2])
    # swap (row, col) to (x, y), with y measured from the bottom
    # of the 224x224 aligned image
    landmark = np.stack([landmark[:, 1], 223 - landmark[:, 0]], axis=1)

    return landmark

Loading the model:

import os
import tensorflow as tf

# load_graph, load_img, align_img, get_skinmask and lm3D all come from
# the Deep3DFaceReconstruction repo
with tf.Graph().as_default() as graph, tf.device('/gpu:0'):
    lm_detector = load_graph(os.path.join('network', 'landmark68_detector.pb'))
    tf.import_graph_def(lm_detector, name='')
    sess = tf.InteractiveSession()

    for file in img_list:
        print(file)
        name = file.split('/')[-1].replace('.png', '').replace('.jpg', '')
        img, lm5p = load_img(file, file.replace('png', 'txt').replace('jpg', 'txt'))
        img_align, _, _ = align_img(img, lm5p, lm3D)  # [1,224,224,3] BGR image

        lm68p = get_68landmark(img_align, graph, sess)
        lm68p = lm68p.astype(np.float64)
        skin_mask = get_skinmask(img_align)
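For context, load_graph in that repo is essentially the standard TF1 frozen-graph loader; this is a sketch from memory, so check the repo for the exact version:

import tensorflow as tf

# Standard TF1 pattern for reading a frozen .pb file into a GraphDef
def load_graph(graph_filename):
    with tf.gfile.GFile(graph_filename, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def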

I will announce the results shortly. Thanks for the info on this project.

Zju-George commented 3 years ago

@mosvlad Is the MS repo better for 68-landmark detection? It would be wonderful if you could share some of its results.

mosvlad commented 3 years ago

@Zju-George I think the MS project may give more accurate results on narrowed eyes. I will try to share a comparison between MS and FAN soon.

mosvlad commented 3 years ago

@Zju-George @fefogarcia Some results.

[image]

The 68-facial-landmark model in the MS repo is 2D-and-3D-face-alignment, which is the Lua implementation of FAN :)

But I'm still trying to find a robust and accurate model for 68 facial landmarks. Next, I'll try to compare FAN and dlib.

It looks pretty accurate:

[image]

mosvlad commented 3 years ago

The DLIB facial landmark detector is much better than FAN for frontal images. Here is sample code for detecting landmarks with DLIB. I use Google Colab for testing.

To use it, download shape_predictor_68_face_landmarks.dat

from imutils import face_utils
import numpy as np
import imutils
import dlib
import cv2

from google.colab.patches import cv2_imshow

# HOG-based face detector plus the 68-point shape predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

image = cv2.imread("test.jpg")
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# detect faces (second argument = number of upsampling passes)
rects = detector(gray, 1)

for (i, rect) in enumerate(rects):
    # predict the 68 landmarks and convert them to a (68, 2) NumPy array
    shape = predictor(gray, rect)
    shape = face_utils.shape_to_np(shape)

    # draw the face bounding box and label
    (x, y, w, h) = face_utils.rect_to_bb(rect)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, "Face #{}".format(i + 1), (x - 10, y - 10),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # draw each landmark
    for (x, y) in shape:
        cv2.circle(image, (x, y), 1, (0, 0, 255), -1)

cv2_imshow(image)
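Continuing from the snippet above: the 68-point scheme this predictor uses keeps the eyes at fixed indices, which imutils exposes by name. That is handy for the eye-region fixes discussed later in this thread:

# imutils maps region names to index ranges in the 68-point scheme;
# "right_eye" is the subject's right eye (left side of the image)
(rs, re) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]  # (36, 42)
(ls, le) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]   # (42, 48)
right_eye = shape[rs:re]
left_eye = shape[ls:le]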

[image]

Next step: implement the DLIB landmarks in this project :)

mosvlad commented 3 years ago

Some interesting results from my research on aligning the eye texture.

[image] [image] [image] [image]

I think I am close to a solution. I will give more information about this work soon.

timlod commented 3 years ago

This project doesn't use landmark detection for anything other than getting a bounding box for the face - thus I think getting better landmark detection is not as important as you may think. You can investigate this in datasets.py - see how the box is used to compute a center and size in bbox2point, and then used to estimate and compute a transform on the image.

Internally, DECA computes landmarks itself, which is what is shown in the output. For training, the paper states that they used FAN to create ground-truth landmarks for all training data, so bad results in the trained model probably arise from bad landmarks used as ground truth in the training set.
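For anyone digging in, here is a rough paraphrase of that bbox2point logic (reconstructed from reading the source, so the exact scale constants may differ):

import numpy as np

# The detector's box is reduced to a center and size; DECA then builds a
# similarity transform from these to crop the face for the encoder.
def bbox2point(left, right, top, bottom, kind='bbox'):
    if kind == 'kpt68':  # box derived from 68 landmarks
        old_size = (right - left + bottom - top) / 2 * 1.1
        center = np.array([right - (right - left) / 2.0,
                           bottom - (bottom - top) / 2.0])
    else:  # raw detector bbox: shift the center down slightly
        old_size = (right - left + bottom - top) / 2
        center = np.array([right - (right - left) / 2.0,
                           bottom - (bottom - top) / 2.0 + old_size * 0.12])
    return old_size, center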

mosvlad commented 3 years ago

@timlod Yes, everything is exactly as you wrote. At first I thought the face landmarks were used directly to generate the model, but after more detailed analysis I realized they are not. However, 2D points are used when projecting the texture onto the finished model; at that stage you can use a facial landmark detector to correct the input image so that the eyes are projected to the right place.

Here is the result for a sample image:

[image] [image]
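One hedged way to implement the correction described above (a sketch, not @mosvlad's actual code): fit a similarity transform from externally detected eye landmarks to DECA's predicted 2D landmarks, and warp the input image before the texture is unwrapped. Both landmark arrays are assumed inputs here.

import cv2
import numpy as np

# Hypothetical correction step: warp the input image so that externally
# detected eye landmarks (ext_lmk68) land where DECA's predicted 2D
# landmarks (deca_lmk68) expect them. Both are (68, 2) arrays in pixels.
def align_eyes(image, ext_lmk68, deca_lmk68):
    eye_idx = np.arange(36, 48)  # the twelve eye points
    src = ext_lmk68[eye_idx].astype(np.float32)
    dst = deca_lmk68[eye_idx].astype(np.float32)

    # similarity transform: rotation + uniform scale + translation
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    h, w = image.shape[:2]
    return cv2.warpAffine(image, M, (w, h))

Note that this warps the whole image; the patch-copy approach further down the thread is more localized.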

mjoach commented 3 years ago

@mosvlad, I just ran into the same issue and confirmed your findings, thanks for sharing your experience. But how did you correct the UV texture with the externally provided landmarks above? Based on my understanding of the source, DECA itself doesn't use landmarks when unwrapping the input to the UV texture, so there is no single point where it could be fed proper landmarks. Am I wrong on this one? Or are you applying some corrective stretching to the input image/unwrapped texture?

timlod commented 3 years ago

@mosvlad I'd also be interested in how you used custom landmarks for the texture projection!

MokzZ commented 2 years ago

@mosvlad Is there a chance that you could share the code/modifications you used to get the final result? I came across the same issue of misaligned eyes and your solution could really help me with a research project I’m working on.

birdflies commented 2 years ago

> @mosvlad Is there a chance that you could share the code/modifications you used to get the final result? I came across the same issue of misaligned eyes and your solution could really help me with a research project I'm working on.

Hi, did you solve this problem?

MokzZ commented 2 years ago

@birdflies I think I figured out how @mosvlad solved it, or at least I found a similar solution. In deca.py, in the decode() function, landmarks2d (line 174) is the Torch tensor containing the landmarks used to draw the visualization landmarks in the generated ..._vis.jpg files. These landmarks can be used to create bounding boxes around the eyes. Then, by running the DLIB landmark detector and using its landmarks to create a second set of bounding boxes, you can copy the texture of the eyes found by DLIB to the location where DECA thinks the eyes are. I use OpenCV to do the copying. You can then overwrite the image loaded into the visdict (line 245), which is the image used to create the model texture. It is not the cleanest solution, but it does the job, for me at least.
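A minimal sketch of that patch copy (hypothetical, not the actual modification): the two boxes are assumed to come from the DLIB landmarks and from DECA's landmarks2d tensor, respectively.

import cv2

# Hypothetical sketch of the copy step described above. dlib_box is the
# eye bounding box from the DLIB landmarks; deca_box is where DECA's
# landmarks2d puts the eye. Both are (x, y, w, h) in image pixels.
def copy_eye_patch(image, dlib_box, deca_box):
    x1, y1, w1, h1 = dlib_box
    x2, y2, w2, h2 = deca_box

    patch = image[y1:y1 + h1, x1:x1 + w1]  # eye texture DLIB found
    patch = cv2.resize(patch, (w2, h2))    # match DECA's box size
    image[y2:y2 + h2, x2:x2 + w2] = patch  # paste where DECA expects it
    return image

cv2.seamlessClone could blend the patch edges more cleanly than this hard paste.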

birdflies commented 2 years ago

> @birdflies I think I figured out how @mosvlad solved it, or at least I found a similar solution. […]

Thanks. Did you solve the problem where the results are all black except for the face?

MokzZ commented 2 years ago

@birdflies I haven't come across that problem myself. I use the --useTex argument while running and it successfully generates a full UV texture map.

vedantdere commented 2 years ago

> @birdflies I haven't come across that problem myself. I use the --useTex argument while running and it successfully generates a full UV texture map.

Can you please share the model you used to generate the full texture map?

VictoryLoveJessica commented 2 years ago

I think it can be solved by the new face detection in DECA_new

emlcpfx commented 1 year ago

> I think it can be solved by the new face detection in DECA_new

Can you please elaborate? I'd love to implement this with EMOCA.

emlcpfx commented 1 year ago

> @timlod Yes, everything is exactly as you wrote. At first I thought the face landmarks were used directly to generate the model, but after more detailed analysis I realized they are not. […]

Hi Vlad, can you please elaborate on your findings? How did you get this to work?

emlcpfx commented 1 year ago

@yfeng95 Can you please weigh in on the technique that @mosvlad is proposing?

emlcpfx commented 1 year ago

I found a hacky fix for how this eye alignment was affecting our work. Check it out here: https://github.com/radekd91/emoca/issues/61#issuecomment-1683007725

There are a lot of very smart people on this board; maybe you can weigh in on how I could do this better.