huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Leverage Accelerate for object detection/segmentation models #28309

Closed NielsRogge closed 6 months ago

NielsRogge commented 8 months ago

Feature request

Currently there are 6 object detection models which don't support multi-GPU training out of the box. The distributed code was explicitly left out of the modeling code because it wouldn't be compatible with the Trainer API. Refer to these lines of code as an example.
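
For context, here is a rough sketch of the kind of `torch.distributed` logic the original DETR repository uses and that was omitted from the ported modeling code (helper names follow the original repo's `util/misc.py`; `num_boxes` and `device` come from the surrounding loss computation):

```python
import torch
import torch.distributed as dist

def is_dist_avail_and_initialized():
    # True only when torch.distributed has actually been set up (e.g. via torchrun)
    return dist.is_available() and dist.is_initialized()

# ... inside the loss computation, after counting the target boxes in the local batch ...
num_boxes = torch.as_tensor([num_boxes], dtype=torch.float, device=device)
if is_dist_avail_and_initialized():
    # Sum the box count across all processes so the loss is normalized consistently
    dist.all_reduce(num_boxes)
world_size = dist.get_world_size() if is_dist_avail_and_initialized() else 1
num_boxes = torch.clamp(num_boxes / world_size, min=1).item()
```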

However, now that the Trainer class uses 🤗 Accelerate behind the scenes, we can add it back by leveraging the following code:

from accelerate import PartialState
from accelerate.utils import reduce

# `num_boxes` is computed earlier in the loss function (total number of target
# boxes in the local batch); only run the reduction when the distributed state
# has actually been initialized.
world_size = 1
if PartialState._shared_state != {}:
    # Aggregate `num_boxes` across processes and record the number of processes
    # so the loss can be normalized consistently.
    num_boxes = reduce(num_boxes)
    world_size = PartialState().num_processes

See this commit as an example: https://github.com/huggingface/transformers/pull/27990/commits/526a8b0801d075ad5f99e87fbfc5de49ea347a9a.
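
A minimal sketch of how that snippet could slot into a DETR-style loss `forward` (method and attribute names like `self.losses` and `self.get_loss` are illustrative; the exact integration is in the linked commit):

```python
import torch
from accelerate import PartialState
from accelerate.utils import reduce

def forward(self, outputs, targets):
    # Total number of target boxes in the local batch, used to normalize the loss
    num_boxes = sum(len(t["class_labels"]) for t in targets)
    num_boxes = torch.as_tensor(
        [num_boxes], dtype=torch.float, device=outputs["logits"].device
    )

    world_size = 1
    if PartialState._shared_state != {}:
        # Aggregate the box count across processes, then divide by the world size,
        # mirroring the torch.distributed logic of the original repository
        num_boxes = reduce(num_boxes)
        world_size = PartialState().num_processes
    num_boxes = torch.clamp(num_boxes / world_size, min=1).item()

    # Compute the individual losses, each normalized by `num_boxes`
    losses = {}
    for loss in self.losses:
        losses.update(self.get_loss(loss, outputs, targets, num_boxes))
    return losses
```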

I'll add a list here with models to be fixed:

Additionally, there are 3 segmentation models which require a similar update:

For these, the get_num_masks function requires a similar update to what is done in the original repository, but using Accelerate; a sketch follows below.
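
A minimal sketch of what the updated `get_num_masks` could look like, following the same pattern as the detection fix above (the exact signature in the segmentation loss classes may differ):

```python
import torch
from accelerate import PartialState
from accelerate.utils import reduce

def get_num_masks(self, class_labels, device):
    """Average number of target masks across the batch, used to normalize the loss."""
    num_masks = sum(len(classes) for classes in class_labels)
    num_masks = torch.as_tensor(num_masks, dtype=torch.float, device=device)

    world_size = 1
    if PartialState._shared_state != {}:
        # Aggregate the mask count across processes, then divide by the number of processes
        num_masks = reduce(num_masks)
        world_size = PartialState().num_processes

    num_masks = torch.clamp(num_masks / world_size, min=1)
    return num_masks
```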

Motivation

It would be great to support multi-GPU training of these models by leveraging Accelerate.

Your contribution

I could do this myself, but it's also a perfect opportunity for a first open-source contribution.

sam99dave commented 8 months ago

Hi @NielsRogge, I would like to take this up.

ENate commented 8 months ago

Hi @NielsRogge, is this open to just one person taking it up (as @sam99dave already indicated), or to multiple contributors? If it's the latter, I can take one of these up.

NielsRogge commented 7 months ago

Hi, it looks like the PRs above addressed all models at once, which is fine by me.

NielsRogge commented 6 months ago

Closing as the PR above was merged.