Change the implementation for NMS

woctezuma commented 2 years ago

Fix #9 by switching to torchvision's implementation of NMS.
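For readers unfamiliar with NMS (non-maximum suppression), here is a minimal pure-Python sketch of the greedy algorithm. It is an illustration only: it is neither the code removed by this PR nor torchvision's implementation, and the names `iou` and `greedy_nms` are hypothetical. torchvision exposes the same operation as `torchvision.ops.nms(boxes, scores, iou_threshold)`, which returns the indices of the kept boxes sorted by decreasing score.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return the indices of the kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept (higher-scoring)
    box does not exceed iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Toy data (made up for illustration): boxes 0 and 1 overlap heavily.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.4)
print(kept)  # [0, 2]
```

Box 1 is suppressed because its IoU with the higher-scoring box 0 is 81/119, about 0.68, which exceeds the threshold of 0.4.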

I have tested the code on Google Colab with a GPU, using:

branch = "nms"

%cd /content
!git clone https://github.com/woctezuma/facexlib.git

%cd /content/facexlib
!git checkout {branch}

%pip install --quiet -e .

The code works fine with this example with 3 faces:

!wget -O /content/input.jpg https://raw.githubusercontent.com/ternaus/retinaface/master/tests/data/13.jpg

!python inference/inference_detection.py --img_path /content/input.jpg

The code also works fine with the default example with 1 face:

!python inference/inference_detection.py --img_path /content/facexlib/assets/test.jpg

Ditto with a more extreme case with many faces:

!wget -O /content/crowd.jpg https://i.imgur.com/j4KK3i4.jpeg

!python inference/inference_detection.py --img_path /content/crowd.jpg

I have timed both implementations:

- after downloading the model checkpoint,
- by repeating the runs 10 times each,
- using the following kind of code:
```
import time

start_time = time.time()
for i in range(10):
    !python inference/inference_detection.py --img_path /content/input.jpg > /dev/null
print("--- %s seconds ---" % (time.time() - start_time))
```
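Outside a notebook, the same kind of measurement could be sketched in plain Python as follows. This is an assumption-laden variant, not the code that produced the numbers in this thread: the helper name `time_command` is hypothetical, and it uses `subprocess.run` with `time.perf_counter` instead of the `!python` notebook syntax.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=10):
    """Run cmd `repeats` times and return the total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        # Discard stdout, mirroring the `> /dev/null` redirection.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Trivial stand-in command so that the sketch runs anywhere.
    elapsed = time_command([sys.executable, "-c", "pass"], repeats=3)
    print(f"--- {elapsed:.3f} seconds ---")
```

To time the detection script, the stand-in command would be replaced with `["python", "inference/inference_detection.py", "--img_path", "/content/input.jpg"]`, run from the `facexlib` directory.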



Results:
- using the `master` branch:
   - using `input.jpg`: 41.409 seconds
   - using `test.jpg`: 40.496 seconds
   - using `crowd.jpg`: 62.800 seconds
- using the `nms` branch:
   - using `input.jpg`: 41.702 seconds (a bit worse)
   - using `test.jpg`: 40.405 seconds (a bit better)
   - using `crowd.jpg`: 63.107 seconds (a bit worse)

So the run time is extremely similar, at least for cases with a few faces.

**However**, I would expect to see some benefits in some scenarios, right? :)
And there is no drawback, since the package requirement was already there.
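To put the timings reported above into perspective, the relative difference between the two branches can be computed directly from those numbers. The helper name `relative_change` is hypothetical; the data are the totals listed in the results.

```python
# Total seconds for 10 runs, as reported above: (master, nms).
timings = {
    "input.jpg": (41.409, 41.702),
    "test.jpg": (40.496, 40.405),
    "crowd.jpg": (62.800, 63.107),
}


def relative_change(master_s, nms_s):
    """Percentage change of the nms branch relative to master."""
    return 100.0 * (nms_s - master_s) / master_s


changes = {name: relative_change(m, n) for name, (m, n) in timings.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

All three differences are below 1% in absolute value (about +0.71%, -0.22% and +0.49%). Since each run launches a fresh Python process, the totals presumably also include interpreter startup and model loading, which may mask any difference in the NMS step itself.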

xinntao commented 2 years ago

Thanks @woctezuma

Do those two ways generate the same results?

woctezuma commented 2 years ago

Are there some test images which I could use for tests?

If not, I will check later with the 3 images above (chosen arbitrarily).

xinntao commented 2 years ago

You can use the above three images :smile:

You may check whether the modified code generates the same results as the original.

woctezuma commented 2 years ago

If I run:

branch = "master"
# branch = "nms"

%mkdir -p /content/output

%cd /content/facexlib/
!python inference/inference_detection.py --img_path /content/input.jpg --save_path /content/output/0.jpg
!python inference/inference_detection.py --img_path /content/facexlib/assets/test.jpg --save_path /content/output/1.jpg
!python inference/inference_detection.py --img_path /content/crowd.jpg --save_path /content/output/2.jpg

%cd /content
!tar -czvf {branch}.tar.gz output

Then I get the same log in the terminal for both branches:

/content/facexlib

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
[[439.15402    121.86074    509.0824     210.8193       0.998623
  460.9657     156.37442    493.47168    155.48595    479.6446
  176.61067    464.07825    188.66666    491.31152    188.06332   ]
 [656.3072     155.60214    730.38525    246.72495      0.99776065
  665.4625     186.43346    696.70667    197.98117    669.53876
  214.66125    665.87854    220.91127    689.0557     229.59206   ]
 [258.05493     92.80016    336.99622    203.68082      0.99744344
  285.99197    135.71637    321.85596    136.39897    307.07513
  162.35631    281.62823    168.6027     318.6401     169.43341   ]]

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
[[117.37251    39.324093  250.56491   224.72816     0.9998591 145.25105
  122.97755   201.31013   111.52025   173.76721   155.68156   161.52567
  186.27234   211.38899   177.11263  ]]

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
[[9.1340167e+02 1.3909541e+03 1.0317024e+03 ... 1.5051721e+03
  1.0065583e+03 1.4891031e+03]
 [2.1674930e+02 1.5291836e+03 3.2775998e+02 ... 1.6428296e+03
  2.9815350e+02 1.6407316e+03]
 [2.0928545e+03 8.9630255e+02 2.2034666e+03 ... 9.8878503e+02
  2.1741155e+03 9.7969580e+02]
 ...
 [2.7693622e+02 8.7479419e-01 3.5184836e+02 ... 2.8165215e+01
  3.2834222e+02 2.7398485e+01]
 [4.4819373e+02 7.4466667e+01 5.0636578e+02 ... 1.4170441e+02
  4.7893454e+02 1.4418686e+02]
 [1.8024073e+03 1.8553757e+03 1.8900052e+03 ... 1.9470372e+03
  1.8326847e+03 1.9524645e+03]]

/content
output/
output/1.jpg
output/2.jpg
output/0.jpg
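Rather than comparing the printed logs by eye, the detections from both branches could be compared programmatically. The sketch below assumes the detections have been collected as lists of rows of floats; the helper `detections_match` and the tolerance are hypothetical, and the example rows reuse only the first five values of two rows from the `input.jpg` log above.

```python
import math


def detections_match(a, b, abs_tol=1e-4):
    """Return True if both detection lists hold the same rows within abs_tol.

    Rows are sorted first so that the comparison ignores row order; this
    assumes rows are well separated, so sorting pairs them up consistently.
    """
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(sorted(a), sorted(b)):
        if len(row_a) != len(row_b):
            return False
        if not all(math.isclose(x, y, abs_tol=abs_tol) for x, y in zip(row_a, row_b)):
            return False
    return True


# First five values of two rows from the log (the full rows have 15 values).
master_rows = [
    [439.15402, 121.86074, 509.0824, 210.8193, 0.998623],
    [656.3072, 155.60214, 730.38525, 246.72495, 0.99776065],
]
# Same rows in a different order, standing in for the other branch's output.
nms_rows = list(reversed(master_rows))

print(detections_match(master_rows, nms_rows))  # True
```

A tolerance-based check is used instead of exact equality because two NMS implementations may keep the same boxes while floating-point values differ slightly; in the logs above, the printed values are identical.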

xinntao commented 2 years ago

@woctezuma Thank you very much 👍

I will merge it into the master branch 😄

woctezuma commented 2 years ago

I also get the same output images.

[Output images: Input, Test, Crowd]
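The output images could also be compared automatically, for example by hashing the files extracted from the two archives. This is a sketch under assumptions: the helper names and the idea of extracting each archive into its own directory are hypothetical, and byte-for-byte equality is a stricter criterion than visual equality.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_dirs(dir_a, dir_b):
    """Map each file name to True if both directories hold identical copies.

    A file present in only one of the two directories maps to False.
    """
    names_a = {p.name for p in Path(dir_a).iterdir() if p.is_file()}
    names_b = {p.name for p in Path(dir_b).iterdir() if p.is_file()}
    result = {}
    for name in sorted(names_a | names_b):
        if name in names_a and name in names_b:
            result[name] = file_digest(Path(dir_a) / name) == file_digest(Path(dir_b) / name)
        else:
            result[name] = False
    return result
```

For instance, after extracting `master.tar.gz` and `nms.tar.gz` into two separate directories, calling `compare_dirs` on their `output` folders would report, per image, whether the two branches wrote identical files.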

xinntao / facexlib

Change the implementation for NMS #10