aosadchyy opened 2 years ago
It seems the problem still exists: I hit it when trying to convert a TensorFlow ResNet152
model to ONNX on s390x (aka LinuxONE). Are there any possible solutions or workarounds for this problem?
I have debugged this issue (I think). Invocations of `Node.get_tensor_value()`
in tf2onnx result in calls to onnx's `numpy_helper.to_array()`,
which performs a byte swap on the data. If there were a way to tell ONNX not to perform the byte swap, the fix would be as simple as updating all of these call sites accordingly.
This should now be fixed by this ONNX PR: onnx/onnx#5904.
Debugging advice: Converting a TF model to ONNX on s390x succeeds, but the resulting onnx file contains a large number, 268632064, in the Reshape operator.

```
python3 -m tf2onnx.convert --opset 15 --fold_const --saved-model mnist_seqdnn --output mnist_seqdnn_s390x.onnx
```

The workaround is to run `tf2onnx.convert` on x86. The resulting onnx file then has the correct number, 784, in the Reshape operator.

```
python3 -m tf2onnx.convert --opset 15 --fold_const --saved-model mnist_seqdnn --output mnist_seqdnn_x86.onnx
```
Describe the bug With the same library versions and the same source TF model, converting to ONNX on an s390x system produces the large constant 268632064 in the Reshape operator. On an x86 system the constant is 784, as expected. s390x is a big-endian system, unlike x86. Typically endianness is handled by the primitive operations and operators of Python or C++, unless the code interprets words/dwords directly; the large number could be an indication of that.
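For what it's worth, the large number is consistent with a byte-swap bug: 784 is 0x310, and 268632064 is 0x10030000, which is exactly what comes out if the 8-byte int64 value is byte-swapped in 4-byte lanes (for example through a 32-bit/64-bit element-size mix-up; that mechanism is my guess, not confirmed). A stdlib-only illustration:

```python
import struct

# 784 stored as a little-endian int64, as in ONNX raw_data.
raw = struct.pack("<q", 784)  # b'\x10\x03\x00\x00\x00\x00\x00\x00'

# Swap each 4-byte lane, as a 32-bit/64-bit element-size mix-up might do,
# then reinterpret the result as a little-endian int64.
swapped = b"".join(raw[i:i + 4][::-1] for i in range(0, 8, 4))
value = struct.unpack("<q", swapped)[0]
print(value)  # 268632064
```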
Urgency This blocks the use of tf2onnx on the s390x family of systems, and perhaps also powerpc64le, etc.
System information
To Reproduce Use the MNIST dataset to train and save a sequential DNN model:

```python
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
...  # compile and fit
model.save("mnist_seqdnn", overwrite=True, include_optimizer=False, save_format='tf')
```
On an s390x architecture machine:

```
python3 -m tf2onnx.convert --opset 15 --fold_const --saved-model mnist_seqdnn --output mnist_seqdnn_s390x.onnx
```
On an x86 architecture machine:

```
python3 -m tf2onnx.convert --opset 15 --fold_const --saved-model mnist_seqdnn --output mnist_seqdnn_x86.onnx
```
Compare mnist_seqdnn_s390x.onnx and mnist_seqdnn_x86.onnx:

- Reshape in x86: `const_fold_opt7`, kind: Initializer, type: `int64[2]`, `[-1, 784]` - OK
- Reshape in s390x: `const_fold_opt7`, kind: Initializer, type: `int64[2]`, `[-1, 268632064]` - NG
Screenshots
![image](https://user-images.githubusercontent.com/21959540/161119457-f3cff441-c948-46b3-981e-a817fbed612e.png)
Additional context Console output on s390x system: