ChenRocks / UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
https://arxiv.org/abs/1909.11740

About fp16 and location features. #93

Closed EstherBear closed 2 years ago

EstherBear commented 2 years ago

Hi, I have two questions about training.

  1. The last element of the location feature (w × h in [x1, y1, x2, y2, w, h, w × h]) overflows in fp16 training mode: fp16 tops out at 65504, and a box area computed in pixels easily exceeds that. So how do you train in fp16?
  2. What does "normalized" mean for the location feature? Could you provide an example?
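
To make both questions concrete, here is a minimal sketch of the 7-dimensional feature and a range check against fp16. It assumes that "normalized" means dividing the x coordinates by the image width and the y coordinates by the image height, which is an interpretation and is not confirmed by the repository. The names `location_feature` and `fits_in_fp16`, the box, and the image size are all hypothetical.

```python
import struct


def fits_in_fp16(value):
    """Return True if value is representable as a finite IEEE 754 half float.

    struct's "e" format raises OverflowError for out-of-range values;
    a NumPy or PyTorch cast would instead produce inf.
    """
    try:
        struct.pack("<e", value)
        return True
    except OverflowError:
        return False


def location_feature(box, img_w, img_h, normalize):
    """Build [x1, y1, x2, y2, w, h, w * h] for one box.

    ASSUMPTION: normalization divides x by the image width and y by the
    image height, so every element lands in [0, 1].
    """
    x1, y1, x2, y2 = box
    if normalize:
        x1, x2 = x1 / img_w, x2 / img_w
        y1, y2 = y1 / img_h, y2 / img_h
    w, h = x2 - x1, y2 - y1
    return [x1, y1, x2, y2, w, h, w * h]


# Hypothetical 400 x 300 pixel box inside a 640 x 480 image.
box = (100.0, 50.0, 500.0, 350.0)
raw = location_feature(box, 640, 480, normalize=False)
norm = location_feature(box, 640, 480, normalize=True)

print("raw area:", raw[-1], "fits in fp16:", fits_in_fp16(raw[-1]))
print("normalized area:", norm[-1], "fits in fp16:", fits_in_fp16(norm[-1]))
```

With pixel coordinates the area of this box is 120000, which is above the fp16 limit. Under the assumed normalization every element stays in [0, 1], so the overflow would not occur.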

Thanks!