ZhenglinZhou / STAR

[CVPR 2023] STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

Performance speed up #21

Closed HeChengHui closed 10 months ago

HeChengHui commented 10 months ago

Thank you for your work. The results are very good, but I realised that inference can be quite slow.

~0.07s per frame

RTX 2070 Super (mobile), 640x480 image from webcam

Is there a way to speed up the inference process?

ZhenglinZhou commented 10 months ago

Hi, @HeChengHui. Thanks for your interest!

The bottleneck of the inference time lies in dlib, which is used to preprocess the image. Using dlib with CUDA support could help improve this.
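For reference, a minimal sketch (not the exact code in demo.py) for checking whether the installed dlib build actually has CUDA enabled and for running dlib's CNN (MMOD) face detector, which is the part that benefits from a CUDA build; the model and image paths are illustrative:

import dlib

# True only if this dlib build was compiled against CUDA/cuDNN.
print("dlib built with CUDA:", dlib.DLIB_USE_CUDA)

# The CNN (MMOD) detector benefits from a CUDA build; the HOG detector from
# dlib.get_frontal_face_detector() always runs on the CPU.
# "mmod_human_face_detector.dat" is an illustrative path to dlib's pretrained model.
cnn_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")

img = dlib.load_rgb_image("face.jpg")  # illustrative input image
for det in cnn_detector(img, 1):       # 1 = upsample the image once before detecting
    rect = det.rect                    # dlib.rectangle with left/top/right/bottom
    print(rect.left(), rect.top(), rect.right(), rect.bottom(), det.confidence)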

HeChengHui commented 10 months ago

@ZhenglinZhou

I used a different face detection library and moved the face detection out of the process function, so it now looks something like this:

import copy

def process(input_image, faces):
    # `faces` comes from the external face detector; each entry must expose a
    # .bbox attribute with (x1, y1, x2, y2) pixel coordinates.
    image_draw = copy.deepcopy(input_image)

    results = []
    for face in faces:
        bbx = face.bbox.astype(int)

        x1, y1, x2, y2 = bbx
        # Convert the bounding box into the (scale, center) crop parameters
        # expected by the alignment model.
        scale = min(x2 - x1, y2 - y1) / 200 * 1.05
        center_w = (x2 + x1) / 2
        center_h = (y2 + y1) / 2

        scale, center_w, center_h = float(scale), float(center_w), float(center_h)
        landmarks_pv = alignment.analyze(input_image, scale, center_w, center_h)
        results.append(landmarks_pv)
        image_draw = draw_pts(image_draw, landmarks_pv)

    return image_draw, results
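For context, a sketch of the per-frame loop around process(); the detector used here (insightface's FaceAnalysis) is just an example of one that returns objects with a .bbox attribute, any detector with that interface would do:

import cv2
from insightface.app import FaceAnalysis  # example detector; returns faces with .bbox

app = FaceAnalysis()
app.prepare(ctx_id=0)        # ctx_id=0 -> run the detector on GPU 0

cap = cv2.VideoCapture(0)    # 640x480 webcam stream
while True:
    ok, frame = cap.read()
    if not ok:
        break
    faces = app.get(frame)                      # each face has a .bbox array
    image_draw, results = process(frame, faces)
    cv2.imshow("landmarks", image_draw)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()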

and I timed the call to process() using:

import time
from icecream import ic  # ic() for quick debug printing

start = time.time()
image_draw, results = process(frame, faces)
ic('time taken', time.time() - start)

Is this the speed you would expect? I am using the WFLW model.
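One caveat with this timing (assuming the alignment model runs on the GPU through PyTorch): CUDA kernels launch asynchronously, so the wall-clock measurement can be misleading unless the GPU is synchronized around the call. A minimal sketch:

import time
import torch

torch.cuda.synchronize()   # finish any pending GPU work before starting the clock
start = time.time()
image_draw, results = process(frame, faces)
torch.cuda.synchronize()   # wait for the GPU work launched inside process()
print("time taken:", time.time() - start)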

ZhenglinZhou commented 10 months ago

Hi @HeChengHui,

As discussed in Issue 16, we found that STAR is sensitive to the preprocessing alignment. I worry that using detectors other than dlib could negatively impact its performance. If you want to use another face detector, it is recommended to retrain STAR.

Therefore, I suggest using the CUDA version of dlib instead of the CPU version in demo.py.
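For reference, a minimal sketch of feeding a dlib detection through the same bbox-to-(scale, center) conversion used above, so the crop passed to the alignment model stays consistent; the constants (200, 1.05) are taken from the snippet earlier in this thread, `alignment` is the same Alignment object as in demo.py, and the image path is illustrative:

import dlib

detector = dlib.get_frontal_face_detector()  # or the CNN detector from a CUDA build
img = dlib.load_rgb_image("face.jpg")        # illustrative input path

for rect in detector(img, 1):
    x1, y1, x2, y2 = rect.left(), rect.top(), rect.right(), rect.bottom()
    # Same crop parameters as in the process() snippet above.
    scale = min(x2 - x1, y2 - y1) / 200 * 1.05
    center_w = (x2 + x1) / 2
    center_h = (y2 + y1) / 2
    landmarks = alignment.analyze(img, float(scale), float(center_w), float(center_h))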