IDEA-Research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
https://detrex.readthedocs.io/en/latest/
Apache License 2.0

Question: How come the MaskDINO Encoder can't use half-precision (AMP) training? #268

Open numpee opened 1 year ago

numpee commented 1 year ago

There's a decorator on the MaskDINOEncoder that disables FP16 casting. A few lines below it, a comment (line 357) states that Deformable-DETR does not support half precision. Do you know why, and is there a possible workaround? If the problem can be isolated to specific operations, all the other operations inside the encoder could still run in half precision.

rentainhe commented 1 year ago

Sorry for the late reply. Custom operators like DeformableAttention don't support fp16 training right now. For fp16 training you should convert the operator's inputs to fp32, run the forward pass of Deform.Attn, and then convert the output back to fp16, so that this operator is manually skipped by the half-precision path. Here is an example:

https://github.com/IDEA-Research/detrex/blob/f34380d3652c0dbc8f9de911626117943a8208c4/detrex/layers/multi_scale_deform_attn.py#L343
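For readers who want the pattern without opening the link, here is a minimal sketch of the cast-in, cast-out idea described above. It is not the detrex code: `RunInFP32` and `Fp32OnlyOp` are hypothetical names made up for illustration, and `Fp32OnlyOp` is only a stand-in for a custom operator such as deformable attention. It assumes PyTorch 1.10 or newer (for `torch.autocast`) and a single tensor input.

```python
import torch
from torch import nn


class Fp32OnlyOp(nn.Module):
    """Hypothetical stand-in for a custom operator that only supports fp32."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 3)
        self.seen_dtype = None  # records the dtype the operator actually received

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.seen_dtype = x.dtype
        return self.proj(x)


class RunInFP32(nn.Module):
    """Hypothetical wrapper: cast input to fp32, run the op, cast the output back."""

    def __init__(self, op: nn.Module):
        super().__init__()
        self.op = op

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        orig_dtype = x.dtype
        # Turn autocast off locally so nothing inside the op is cast to half.
        with torch.autocast(device_type=x.device.type, enabled=False):
            out = self.op(x.float())
        # Hand the result back in the caller's dtype (e.g. fp16).
        return out.to(orig_dtype)


torch.manual_seed(0)
op = Fp32OnlyOp()
wrapped = RunInFP32(op)

x = torch.randn(2, 4).half()  # pretend this came from an fp16 part of the model
y = wrapped(x)
```

After this runs, `op.seen_dtype` is `torch.float32` while `y` is back in `torch.float16`, so only the wrapped operator pays the fp32 cost and the surrounding layers can stay in half precision.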

@numpee