1adrianb / face-alignment

:fire: 2D and 3D Face alignment library build using pytorch
https://www.adrianbulat.com
BSD 3-Clause "New" or "Revised" License

Speed up face detection module by parallelizing `get_predictions()` #347

Closed SCZwangxiao closed 10 months ago

SCZwangxiao commented 11 months ago

Background

While processing a large-scale talking-human dataset (~10M images), I found that GPU utilization was very low (below 10%), even when using the batch API.

As discussed in #343, profiling the code showed that the bottleneck is the unparallelized get_predictions().
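For readers who want to repeat this kind of measurement, here is a minimal sketch using the standard-library cProfile. `slow_postprocess` and `profile_call` are hypothetical stand-ins, not functions from face-alignment; in practice the detector's batch call would be profiled instead.

```python
import cProfile
import io
import pstats


def slow_postprocess(n=20000):
    """Hypothetical stand-in for a CPU-bound post-processing step."""
    total = 0.0
    for k in range(n):
        total += (k % 7) * 0.5
    return total


def profile_call(func, *args, top=5, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()
    buf = io.StringIO()
    # Sort by cumulative time so the most expensive call chains come first.
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


result, report = profile_call(slow_postprocess)
print(report)
```

A function that dominates the cumulative-time column while the GPU sits idle is a likely CPU-side bottleneck.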

This PR addresses the issue with a parallelized implementation. A detailed explanation is below.

Explanation

from typing import List

import numpy as np

# decode() is the box-decoding helper used by the SFD detector; it is assumed to be in scope.


def get_predictions_original(olist: List[np.ndarray], batch_size: int) -> np.ndarray:
    bboxlists = []
    variances = [0.1, 0.2]
    for j in range(batch_size):
        bboxlist = []
        # olist alternates class scores and box regressions, one pair per feature level
        for i in range(len(olist) // 2):
            ocls, oreg = olist[i * 2], olist[i * 2 + 1]
            stride = 2**(i + 2)    # 4,8,16,32,64,128
            # Positions whose face score exceeds 0.05, collected over the whole batch
            poss = zip(*np.where(ocls[:, 1, :, :] > 0.05))
            # Python-level loop: one decode() call per candidate position
            for Iindex, hindex, windex in poss:
                axc, ayc = stride / 2 + windex * stride, stride / 2 + hindex * stride
                score = ocls[j, 1, hindex, windex]
                loc = oreg[j, :, hindex, windex].copy().reshape(1, 4)
                priors = np.array([[axc / 1.0, ayc / 1.0, stride * 4 / 1.0, stride * 4 / 1.0]])
                box = decode(loc, priors, variances)
                x1, y1, x2, y2 = box[0]
                bboxlist.append([x1, y1, x2, y2, score])

        bboxlists.append(bboxlist)

    bboxlists = np.array(bboxlists)
    return bboxlists
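
To illustrate the general idea of replacing the per-candidate loop with array operations, here is a minimal NumPy sketch. It is not the V1 or V2 code from this PR. `decode_boxes` and `get_predictions_vectorized` are hypothetical names, and `decode_boxes` assumes the usual SSD-style center/size decoding rather than reproducing the library's decode(). Unlike the loop above, the sketch keeps a candidate only for the image in which it crossed the threshold, and it returns one array per image; both are assumptions about the intended behaviour.

```python
from typing import List

import numpy as np


def decode_boxes(loc, priors, variances):
    """SSD-style decoding of (cx, cy, w, h) offsets into (x1, y1, x2, y2). Assumed form."""
    centers = priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:]
    sizes = priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])
    top_left = centers - sizes / 2
    return np.concatenate([top_left, top_left + sizes], axis=1)


def get_predictions_vectorized(olist: List[np.ndarray], batch_size: int,
                               thr: float = 0.05) -> List[np.ndarray]:
    variances = [0.1, 0.2]
    per_image = [[] for _ in range(batch_size)]
    for i in range(len(olist) // 2):
        ocls, oreg = olist[i * 2], olist[i * 2 + 1]
        stride = 2 ** (i + 2)
        # All candidates of this feature level, for the whole batch at once
        bidx, hidx, widx = np.where(ocls[:, 1, :, :] > thr)
        if bidx.size == 0:
            continue
        axc = stride / 2 + widx * stride
        ayc = stride / 2 + hidx * stride
        size = np.full(axc.shape, stride * 4.0)
        priors = np.stack([axc, ayc, size, size], axis=1)   # (N, 4)
        loc = oreg[bidx, :, hidx, widx]                      # (N, 4)
        scores = ocls[bidx, 1, hidx, widx]                   # (N,)
        boxes = decode_boxes(loc, priors, variances)         # one call per level
        dets = np.concatenate([boxes, scores[:, None]], axis=1)
        # Split the level's detections back into their source images
        for j in range(batch_size):
            per_image[j].append(dets[bidx == j])
    return [np.concatenate(d, axis=0) if d else np.zeros((0, 5)) for d in per_image]


# Tiny synthetic input: batch of 2, one feature level, a single confident cell
ocls = np.zeros((2, 2, 3, 3))
oreg = np.zeros((2, 4, 3, 3))
ocls[0, 1, 1, 2] = 0.9
out = get_predictions_vectorized([ocls, oreg], batch_size=2)
```

With this layout, decoding runs once per feature level instead of once per candidate, so the Python-level loop no longer grows with the number of candidates.
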
SCZwangxiao commented 11 months ago

The unit test test/facealignment_test.py failed, but it passes in my environment. That's strange.

emlcpfx commented 10 months ago

Hi @SCZwangxiao, I got a 10% boost in performance using V1. V2 throws an error for me about thr:

TypeError: get_predictions() missing 1 required positional argument: 'thr'

Any thoughts on how I can get that working?

SCZwangxiao commented 10 months ago

Sorry for the typo. I've updated the V2 code with the correct version.

thr refers to the 0.05 in poss = zip(*np.where(ocls[:, 1, :, :] > 0.05)). We use thr to filter low-confidence candidates in our private project.
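
As a small illustration of what such a threshold does, and of how a default value lets two-argument callers keep working, consider the sketch below. `candidate_positions` is a hypothetical helper, not the V2 signature from this PR.

```python
import numpy as np


def candidate_positions(ocls: np.ndarray, thr: float = 0.05):
    """Return (image, row, col) index arrays where the face-class score exceeds thr.

    The default of 0.05 mirrors the hard-coded value in the original loop, so
    callers that do not pass thr get the same filtering as before.
    """
    return np.where(ocls[:, 1, :, :] > thr)


# One image, a 2x2 score map with one weak and one strong response
ocls = np.zeros((1, 2, 2, 2))
ocls[0, 1, 0, 1] = 0.04
ocls[0, 1, 1, 0] = 0.60

default_hits = candidate_positions(ocls)           # thr defaults to 0.05
loose_hits = candidate_positions(ocls, thr=0.01)   # a lower thr keeps the weak response too
```

Raising thr discards more low-confidence candidates before decoding, which reduces the post-processing work.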

emlcpfx commented 10 months ago

Thanks. It works now! I'm getting slightly faster results with v1 than v2. They're both faster than the original.

1adrianb commented 10 months ago

Thanks for your contribution, @SCZwangxiao, looks good! I will check what is going on with the test; it does indeed seem to be fine locally.