Caffe 0.17: L2 norm grows to inf

Ubuntu 16.04.4, CUDA v8.0.16, GTX 1080 Ti

prototxt:
default_forward_type: FLOAT16
default_backward_type: FLOAT16
default_forward_math: FLOAT16
default_backward_math: FLOAT16
global_grad_scale: 0.09
global_grad_scale_adaptive: true

solver.prototxt:
clip_gradients: 150

The network is an autoencoder. In BVLC Caffe the L2 norm of the gradients stays below 500, but in NVCaffe 0.17 the L2 norm grows slowly from 150 to inf, and then I get a NaN loss.

FLOAT32 behaves the same as FLOAT16. Without global_grad_scale_adaptive: true and global_grad_scale: 0.09, the L2 norm grows to inf even more quickly.
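For reference, here is a minimal sketch of what clipping by the global L2 norm is generally understood to do. It is written in plain Python rather than taken from the NVCaffe source. The helper names `global_l2_norm` and `clip_gradients` and the nested-list gradient layout are assumptions made for illustration only. The point it shows is that once the norm itself is inf or NaN, rescaling can no longer bring the gradients back under the threshold.

```python
import math


def global_l2_norm(grads):
    """L2 norm taken over every gradient value of every layer."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def clip_gradients(grads, threshold):
    """Rescale all gradients so their global L2 norm is at most `threshold`.

    Hypothetical helper for illustration; returns (gradients, norm before clipping).
    """
    norm = global_l2_norm(grads)
    if not math.isfinite(norm):
        # An inf/NaN norm cannot be repaired by scaling; hand it back unchanged
        # so the caller can skip the update or lower the gradient scale.
        return grads, norm
    if norm > threshold:
        scale = threshold / norm
        grads = [[g * scale for g in layer] for layer in grads]
    return grads, norm


if __name__ == "__main__":
    clipped, norm_before = clip_gradients([[300.0, 400.0]], threshold=150.0)
    print(norm_before, global_l2_norm(clipped))
```

In this sketch a gradient set with norm 500 and a threshold of 150 is scaled by 150 / 500 = 0.3, so a finite norm always ends up at or below the threshold. A reported norm that keeps growing past 150 would therefore be the value measured before clipping, or a sign that the values overflow before clipping is applied.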

NVIDIA/caffe, issue #570