Open zhaowenZhou opened 1 year ago
Has anyone tried torch.cuda.amp? It seems the ms_attention op doesn't support fp16, even after I modified ms_deform_attn_forward_cuda. Is there another way to get amp working, or any other way to reduce GPU memory? I hit a CUDA OOM every time with bs=4.
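For context, here is roughly the workaround I've been attempting: a minimal sketch, assuming the Deformable-DETR-style `MSDeformAttnFunction` interface (the import path, and the `model`/`criterion`/`optimizer`/`dataloader` names in the loop, are placeholders, not this repo's actual API). The idea is to run the rest of the network under autocast but drop back to fp32 just for the custom op, since the CUDA kernel only implements float32:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

# Assumption: MSDeformAttnFunction is the custom autograd Function from the
# Deformable-DETR ops package; the import path may differ in this repo.
from models.ops.functions import MSDeformAttnFunction


def ms_deform_attn_fp32(value, spatial_shapes, level_start_index,
                        sampling_locations, attention_weights, im2col_step):
    # Exit the autocast region and cast the floating-point inputs back to
    # fp32 before calling the fp32-only CUDA kernel.
    with autocast(enabled=False):
        return MSDeformAttnFunction.apply(
            value.float(), spatial_shapes, level_start_index,
            sampling_locations.float(), attention_weights.float(),
            im2col_step)


# The rest of the model runs under autocast as usual; names below are
# placeholders for illustration.
scaler = GradScaler()
for samples, targets in dataloader:
    optimizer.zero_grad()
    with autocast():
        outputs = model(samples)
        loss = criterion(outputs, targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

Even with this, the attention op itself still runs in fp32, so the memory savings are smaller than full fp16, and I'm still OOMing at bs=4.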