google-research / pix2seq

Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
Apache License 2.0

RIN training with float16/bfloat16 #47

Closed: nicolas-dufour closed this issue 1 year ago

nicolas-dufour commented 1 year ago

I've been trying to train RIN with mixed precision, but it fails for some reason. Have you tried to train with mixed precision? If so, would you happen to have some recommendations to stabilize training?

Thanks!

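[image: image] https://user-images.githubusercontent.com/33259879/265192491-40ca1667-5c2d-4d89-9df8-18dc72bb5378.png Left: bfloat16; right: float32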

chentingpc commented 1 year ago

Our model was not trained in float16.
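
For context on why a configuration tuned in float32 may not carry over directly to 16-bit formats, here is a minimal sketch of one well-known difference: bfloat16 keeps only 8 bits of significand precision, so a small update added to a much larger value can round away entirely. This is not taken from the pix2seq codebase and is not a diagnosis of the run above. The helper names (`to_bfloat16`, `to_float32`, `accumulate`) and the values 1.0 and 1e-3 are hypothetical, chosen only for illustration, and bfloat16 is emulated with the standard library rather than a real accelerator dtype.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 rounding for finite inputs (illustrative helper).

    bfloat16 is the top 16 bits of the float32 encoding, so we round the
    float32 bit pattern to those 16 bits using round-to-nearest-even.
    NaN and infinity are not handled here.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def accumulate(weight: float, update: float, steps: int, cast) -> float:
    """Repeatedly add `update` to `weight`, storing the result via `cast`."""
    for _ in range(steps):
        weight = cast(weight + update)
    return weight


# Hypothetical values: a weight of 1.0 receiving 1000 updates of 1e-3.
bf16_result = accumulate(1.0, 1e-3, 1000, to_bfloat16)
fp32_result = accumulate(1.0, 1e-3, 1000, to_float32)

print("bfloat16 accumulator:", bf16_result)
print("float32 accumulator: ", fp32_result)
```

In this emulation the bfloat16 accumulator stays at exactly 1.0, because the spacing between bfloat16 values just above 1.0 is 2^-7 (about 0.0078) and each 1e-3 step rounds back down, while the float32 accumulator ends up close to 2.0. This is why mixed-precision setups commonly keep parameters and optimizer state in float32 and cast only the compute to 16 bits; whether that is enough to stabilize RIN training is not something this thread establishes.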

nicolas-dufour commented 1 year ago

Ok thanks