That could maybe explain why I'm not able to reproduce the same behavior with my Python network as with my BrainScript network... at least if Python and BrainScript do not use the same implementation of the batch normalization layer.
Edit: I solved my problem... it has nothing to do with this issue.
Thanks for reporting this. The implementation of the ONNX BatchNormalization op in CNTK was updated recently to match the latest opset 6 spec as part of this commit (https://github.com/Microsoft/CNTK/commit/cf839dcdb5f821b854353b1246ef27b1003ea944).
The numbers are now consistent between CNTK and Pytorch.
Pytorch output:
```
('Some input: ', [[[[-0.9206454753875732, -2.3230578899383545]]]])
Batch norm parameters: running_mean [0.] | running_var [1.] | weight [0.31872553] | bias [0.]
('Pytorch output: ', [[[[-0.29343175888061523, -0.7404141426086426]]]])
```
CNTK output:
```
Cntk output: [[[[-0.29343172907829285, -0.7404140830039978]]]]
```
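For reference, these values are what the standard inference-mode batch norm formula gives with the parameters printed above; a quick NumPy check (assuming the default epsilon of 1e-5) reproduces them:
```python
import numpy as np

# Parameters reported in the output above
x = np.array([-0.9206454753875732, -2.3230578899383545])
running_mean, running_var = 0.0, 1.0
weight, bias = 0.31872553, 0.0
eps = 1e-5  # assumed: the default epsilon in both frameworks

# Inference-mode batch normalization
y = weight * (x - running_mean) / np.sqrt(running_var + eps) + bias
print(y)  # -> approximately [-0.29343176, -0.74041411], matching both frameworks
```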
This change should be in the next release (CNTK 2.6). You can try it from the latest master today.
If you are still seeing some discrepancy, please reopen this issue.
There is some strange behavior of batch normalization when importing a model from Pytorch (0.3.1) to CNTK (2.4) using ONNX: most of the outputs are 0.
Very simple code to reproduce the problem:
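The original snippet is not reproduced here; the following is a minimal sketch of the kind of repro described, assuming PyTorch 0.3.1 and CNTK 2.4 (the layer, input shape, and file name are illustrative, inferred from the outputs in this thread):
```python
import torch
import torch.nn as nn
from torch.autograd import Variable  # Variable API used by PyTorch 0.3.x
import cntk as C

model = nn.BatchNorm2d(1)  # single-channel batch norm layer
model.eval()               # inference mode: use running mean/var

x = Variable(torch.randn(1, 1, 1, 2))
print("Pytorch output:", model(x).data.numpy())

# Export to ONNX and load the same model back with CNTK
torch.onnx.export(model, x, "batchnorm.onnx")
z = C.Function.load("batchnorm.onnx", format=C.ModelFormat.ONNX)
print("Cntk output:", z.eval({z.arguments[0]: x.data.numpy()}))
```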
Output:
The Pytorch output is correct, but for CNTK one element is 0 (the same happens with larger inputs: only the first elements of the tensors are correct).