rocmarchive / realcaffe2

The repo is obsolete. Use at your own risk.
https://github.com/pytorch/pytorch
Apache License 2.0

Running MIOpen BN throws warning message #89

Closed ashishfarmer closed 6 years ago

ashishfarmer commented 6 years ago

Running the SpatialBatchNorm op in the MIOpen path generates a warning on the command line:

E0328 18:07:28.225716 127492 spatial_batch_norm_op_miopen.cc:38] Provided epsilon is smaller than MIOPEN_BN_MIN_EPSILON. Setting it to MIOPEN_BN_MIN_EPSILON instead.

This happens because Caffe2 sets epsilon_ to 1e-5, which is the same as MIOPEN_BN_MIN_EPSILON, and that triggers the condition on line 38 of spatial_batch_norm_op_miopen.cc. CUDNN uses the condition `if (epsilon <= CUDNN_BN_MIN_EPSILON - FLT_EPSILON)` instead of `if (epsilon <= MIOPEN_BN_MIN_EPSILON)`.

We need to investigate what the proper value of MIOPEN_BN_MIN_EPSILON is and which condition to use.
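
The difference between the two conditions can be sketched as follows. This is a minimal illustration, not the Caffe2 source: the function and constant names are hypothetical, and both the minimum and the default epsilon are assumed to be 1e-5, as described above.

```cpp
#include <cfloat>  // FLT_EPSILON

// Hypothetical constants; the values follow the description in this issue.
constexpr double kMinEpsilon = 1e-5;      // assumed minimum allowed epsilon
constexpr double kDefaultEpsilon = 1e-5;  // assumed default epsilon of the op

// Plain comparison: also fires when epsilon is exactly equal to the minimum.
constexpr bool WarnsPlain(double epsilon, double min_epsilon) {
  return epsilon <= min_epsilon;
}

// Comparison with slack: fires only when epsilon is below the minimum
// by at least FLT_EPSILON, so an epsilon equal to the minimum passes.
constexpr bool WarnsWithSlack(double epsilon, double min_epsilon) {
  return epsilon <= min_epsilon - FLT_EPSILON;
}
```

Under these assumptions, `WarnsPlain(kDefaultEpsilon, kMinEpsilon)` is true, which matches the warning seen here, while `WarnsWithSlack(kDefaultEpsilon, kMinEpsilon)` is false.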

dagamayank commented 6 years ago

/cc @daniellowell

petrex commented 6 years ago

This epsilon is used to avoid division by zero, which means any small enough value will do.
cuDNN introduces the float epsilon here simply for the floating-point comparison. No functional difference, IMHO.
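
To illustrate why a small change in epsilon should not matter, here is a rough sketch of the per-element normalization. It assumes the usual batch norm form (x - mean) / sqrt(variance + epsilon). The helper name is hypothetical and this is not the MIOpen or Caffe2 implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: normalizes a single value the way batch norm
// is usually written, (x - mean) / sqrt(variance + epsilon).
double NormalizeValue(double x, double mean, double variance, double epsilon) {
  assert(epsilon > 0.0);  // a positive epsilon keeps the denominator nonzero
  return (x - mean) / std::sqrt(variance + epsilon);
}
```

With zero variance the result stays finite for any positive epsilon, and for a variance around 1 the outputs for epsilon 1e-5 and 1e-6 differ only by a few millionths.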

daniellowell commented 6 years ago

@petrex, I agree, no functional difference. You can copy what they do if you like. So long as the epsilon does its job.

petrex commented 6 years ago

I guess @ashishfarmer's concern is the warning message. I am going to set `const double MIOPEN_BN_MIN_EPSILON = 1e-6;` so we will not hit the condition when comparing this value to the default epsilon (1e-5) from SpatialBNOp.

I am not expecting a behavior change, but if we missed anything, let us know. Thanks.
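
A minimal sketch of the warn-and-clamp logic with the lowered minimum, assuming the op clamps epsilon to the minimum whenever the check fires (as the log message suggests). The struct and function names are hypothetical, and the real code in spatial_batch_norm_op_miopen.cc may differ.

```cpp
// Hypothetical result type: the epsilon to use and whether a warning fired.
struct ClampResult {
  double epsilon;
  bool warned;
};

// Warns and clamps when epsilon is at or below the minimum;
// otherwise returns epsilon unchanged.
constexpr ClampResult ClampEpsilon(double epsilon, double min_epsilon) {
  return epsilon <= min_epsilon ? ClampResult{min_epsilon, true}
                                : ClampResult{epsilon, false};
}
```

Under these assumptions, `ClampEpsilon(1e-5, 1e-6)` leaves the default epsilon untouched and does not warn, whereas `ClampEpsilon(1e-5, 1e-5)` reproduces the warning from this issue.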

petrex commented 6 years ago

#91

daniellowell commented 6 years ago

1e-6 should be fine.

ashishfarmer commented 6 years ago

The change in PR 91 fixes the issue. Thank you!