keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Weighted Boxes Fusion (WBF) layer #1724

Closed innat closed 11 months ago

innat commented 1 year ago

Short Description

Weighted Boxes Fusion (WBF): an algorithm that uses the confidence scores of all proposed bounding boxes to construct averaged boxes.
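To make the description concrete, here is a minimal single-class sketch of the fusion step in plain Python, following the procedure outlined in the paper linked below. It is only an illustration, not the keras-cv API or the reference implementation: the function names (`iou`, `fuse`, `weighted_boxes_fusion`), the `(x1, y1, x2, y2)` box format, and the default `iou_thr=0.55` are assumptions made for this example, and scores are assumed to be strictly positive.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster):
    """Return (confidence-weighted mean box, mean confidence) for a cluster.

    Assumes every score in the cluster is > 0.
    """
    total = sum(score for _, score in cluster)
    box = tuple(sum(b[i] * s for b, s in cluster) / total for i in range(4))
    return box, total / len(cluster)


def weighted_boxes_fusion(boxes_per_model, scores_per_model, iou_thr=0.55):
    """Fuse single-class detections from several models (hypothetical helper)."""
    num_models = len(boxes_per_model)

    # Pool every detection and visit them from most to least confident.
    candidates = [
        (box, score)
        for boxes, scores in zip(boxes_per_model, scores_per_model)
        for box, score in zip(boxes, scores)
    ]
    candidates.sort(key=lambda c: c[1], reverse=True)

    clusters = []  # clusters[i] holds the raw boxes merged into fused[i]
    fused = []     # fused[i] is the current (box, confidence) of cluster i
    for box, score in candidates:
        # Find the fused box that overlaps this candidate the most.
        best_idx, best_iou = -1, iou_thr
        for idx, (fused_box, _) in enumerate(fused):
            overlap = iou(box, fused_box)
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx < 0:
            # No match above the threshold: start a new cluster.
            clusters.append([(box, score)])
            fused.append((box, score))
        else:
            # Match found: add to the cluster and recompute its fused box.
            clusters[best_idx].append((box, score))
            fused[best_idx] = fuse(clusters[best_idx])

    # Down-weight boxes that only a few of the models agreed on.
    return [
        (box, conf * min(len(cluster), num_models) / num_models)
        for (box, conf), cluster in zip(fused, clusters)
    ]
```

For example, fusing `[(0, 0, 10, 10)]` with score `0.9` from one model and `[(1, 1, 11, 11), (50, 50, 60, 60)]` with scores `0.6` and `0.4` from another merges the two overlapping boxes into one box pulled toward the more confident prediction, and keeps the isolated box with its confidence halved because only one of the two models proposed it.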

Papers

https://arxiv.org/abs/1910.13302 Cited by 217

Existing Implementations

Other Information

Reported to perform better than Non-Maximum Suppression (NMS), Soft-NMS, and Non-Maximum Weighted (NMW).

cc. @ZFTurbo

srikesh-07 commented 1 year ago

Are you working on this issue, @innat?

If not, can I make a PR for this issue?

innat-asj commented 1 year ago

Can I make a PR for this issue?

You may start but wait for the green signal from the Keras team.

innat-asj commented 1 year ago

@mingxingtan Could you please confirm whether the efficientdet/tf2/wbf.py code here is inspired by https://github.com/ZFTurbo/Weighted-Boxes-Fusion? I couldn't find any mention of it in the EfficientDet paper.

mingxingtan commented 1 year ago

The paper didn't use this. I think @LucasSloan added this for an ensemble project, but it showed limited gains and was not used anywhere in the end. I don't think you need to implement this for keras-cv unless you see benefits for certain cases.

innat-asj commented 1 year ago

@mingxingtan Thanks for the info. As for the use case, WBF is extremely popular on Kaggle in object detection competitions.

mingxingtan commented 1 year ago

Good to know! Lucas' implementation was not well tested or maintained. I would suggest implementing your WBF from scratch (possibly borrowing ideas/code from successful Kaggle projects or other well-established projects) instead of relying on efficientdet/tf2/wbf.py.

Could you please keep me posted if you implement a good version of WBF with reasonable gains?

innat-asj commented 1 year ago

Could you please keep me posted if you implement a good version of WBF with reasonable gains?

Sure. :)


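(We can discuss this later; just mentioning it.) I've just noticed another cool feature in the EfficientDet code: gradient checkpointing (https://github.com/google/automl/blob/master/efficientdet/tf2/efficientdet_keras.py#L427). As mentioned here (https://github.com/google/automl/tree/master/efficientdet#11-reducing-memory-usage-when-training-efficientdets-on-gpu), this allows the d6 network to run with a batch size of 2 on an 11 GB (1080 Ti) GPU. Cool stuff.

I've posted a discussion on tf-forum (https://discuss.tensorflow.org/t/support-gradient-checkpointing-in-tensorflow-2/15405). Gradient checkpointing is a very attractive feature. It would be great to have official support for this. cc. @LukeWood @ianstenbit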

LucasSloan commented 1 year ago

WBF works best for uncorrelated inputs - averaging together detections from different models. If you’re trying to get more out of a single model by leveraging WBF over augmented views of the same image with the same model, there’s much less it can do, because all the errors are correlated.

WBF gave a decent improvement for the smallest model, but ~no improvement for the larger ones. Even then, I think the issue was that the smallest model natively uses too small an image size, and it did better when a larger view of the image was used as one of the augmentations.
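One way to see why correlated errors limit the gain is the textbook variance of an average. This is an idealized sketch, not something measured in this thread: it assumes $N$ fused predictions whose errors $e_i$ share the same variance $\sigma^2$ and the same pairwise correlation $\rho$ (symbols introduced here for illustration), and it uses a plain average where WBF uses a confidence-weighted one.

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} e_i\right)
  = \frac{\sigma^2}{N} + \frac{N-1}{N}\,\rho\,\sigma^2
```

With uncorrelated errors ($\rho = 0$) the variance shrinks to $\sigma^2 / N$, while with fully correlated errors ($\rho = 1$) it stays at $\sigma^2$ no matter how many views are averaged.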

ianstenbit commented 1 year ago

I'll leave this to @LukeWood to decide if we're open to a contribution of this component.

From @LucasSloan's comment, it seems like it may not be particularly relevant for our use cases? But feel free to correct me!

innat-asj commented 1 year ago

Adding to Lucas's point: WBF is primarily an ensemble algorithm, and it is most effective for object detection when combining predictions from different models. That may explain its popularity on Kaggle. Some examples from Kaggle: WBF was used in most of the top solutions for global-wheat-detection, TF - Great Barrier Reef, COVID-19 Detection, and VinBigData Chest X-ray.