jingyuanli001 / RFR-Inpainting

The source code for CVPR 2020 accepted paper "Recurrent Feature Reasoning for Image Inpainting"
MIT License
358 stars 76 forks

FP16 training. #8

Closed (ternaus closed 3 years ago)

ternaus commented 4 years ago

I'm trying to train the model with the fp16 setting, but the model outputs -inf values during the forward pass.

alexwitt23 commented 4 years ago

I recommend changing this and this line to use numbers that are more easily representable in fp16. That fixed the issue for me (using APEX). You'll probably need to apply similar tricks in the style_loss function.
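For readers who can't follow the links, here is a minimal sketch of the kind of problem being described. The constants (`-1e9` as a masking value, `1e-8` as an epsilon) and the helper `to_fp16` are illustrative assumptions, not values or functions from the repository. The sketch uses only the standard library's half-precision `struct` format to show what happens when such numbers are rounded to fp16:

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest fp16 value (hypothetical helper)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack out-of-range values; a hardware cast
        # would saturate to +/-inf instead, so mimic that here.
        return math.copysign(math.inf, x)


# A large "masking" constant (assumed value) overflows to -inf in fp16.
print(to_fp16(-1e9))   # -inf
# A constant that stays inside the fp16 range survives the cast.
print(to_fp16(-1e4))   # -10000.0
# A tiny epsilon (assumed value) underflows to zero, so it no longer
# protects a division or a log.
print(to_fp16(1e-8))   # 0.0
# A larger epsilon stays nonzero after rounding.
print(to_fp16(1e-4) > 0.0)
```

Under these assumptions, replacing out-of-range constants with values inside roughly ±65504, and epsilons with values fp16 can still represent, is the sort of change meant by "more easily representable" numbers.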

xiankgx commented 3 years ago

@alexwitt23,

Hi, how has your experience been training with fp16? I made some changes to the code following your suggestions and was able to train for quite a long time without NaNs. However, NaNs still eventually appear because of the multiple recurrences and the partial convolutions. Setting epsilon to a higher value such as 1e-4 didn't help either. Also, the output images look very "fish scale-ly" or have "studded patterns" when training with fp16. Switching to fp32 eventually solved most of those problems.