Hi there, thanks for your work, it's really inspiring!
I noticed that the synthesis network uses multiple MFBs to mix feature maps, and that each MFB contains a multiplication.
With so many feature maps going through this multiplication, how do you prevent NaN problems during training?
I ask because multiplication, especially on 3D volumes, might cause overflow, and I am using MFB in mixed precision mode.
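To make the concern concrete, here is a minimal sketch of the kind of overflow I mean. It is not taken from the MFB code: it only emulates half-precision storage with Python's `struct` module, and the helper names `to_fp16` and `mul_fp16` are made up for illustration.

```python
import math
import struct

def to_fp16(x: float) -> float:
    """Round x to IEEE half precision; values beyond the fp16 range become +/-inf."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        # struct refuses to pack values that do not fit; emulate saturation to inf
        return math.copysign(math.inf, x)

def mul_fp16(a: float, b: float) -> float:
    """Multiply two fp16 values and store the result back in fp16 (hypothetical helper)."""
    return to_fp16(to_fp16(a) * to_fp16(b))

a, b = 300.0, 300.0                 # both inputs fit comfortably in fp16
prod_fp16 = mul_fp16(a, b)          # 90000 exceeds the fp16 maximum of 65504
prod_wide = a * b                   # the same product in wider precision stays finite
nan_later = prod_fp16 - prod_fp16   # inf - inf turns into NaN in a later operation

print(prod_fp16, prod_wide, nan_later)
```

So two moderately large activations are enough to produce `inf` once the product is stored in half precision, and a later subtraction or normalization can turn that into NaN.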
Thanks a lot!