Pixie8888 opened 4 months ago
Hi,
Although I set clip_grad as below, the grad_norm still explodes... Can anyone help me?
```python
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        _delete_=True,
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0001),
    paramwise_cfg=dict(
        custom_keys={'backbone': dict(lr_mult=0.1, decay_mult=1.0)}),
    clip_grad=dict(max_norm=35., norm_type=2))
```
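One thing worth checking: `clip_grad` in this config is passed through to PyTorch's `torch.nn.utils.clip_grad_norm_`, and that function returns the total gradient norm computed *before* clipping. If the `grad_norm` you see in the logs is that returned value (which I believe is the case in MMEngine, though you may want to verify), a logged value above `max_norm=35` does not by itself mean clipping failed. A minimal sketch illustrating this:

```python
import torch
from torch import nn

# Sketch, not your model: a tiny layer with deliberately huge gradients.
model = nn.Linear(4, 4)
loss = model(torch.ones(1, 4)).sum() * 1000  # inflate gradients
loss.backward()

# clip_grad_norm_ returns the PRE-clip total norm (here far above 35),
# but it clips the gradients in place afterwards.
pre_clip = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=35.0)

# Recompute the norm from the (now clipped) gradients: capped at max_norm.
post_clip = torch.norm(
    torch.stack([p.grad.norm(2) for p in model.parameters()]), 2)
print(float(pre_clip), float(post_clip))
```

So a large logged `grad_norm` may just be the pre-clip measurement; the gradients actually applied by the optimizer are still bounded. If training itself diverges (NaN loss), the cause is likely elsewhere (learning rate, data, mixed precision) rather than the clipping config.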