onnx / models

A collection of pre-trained, state-of-the-art models in the ONNX format
http://onnx.ai/models/
Apache License 2.0

BidAF weird shapes around Compress operator #170

Open kali opened 5 years ago

kali commented 5 years ago

First of all, thanks @KeDengMS for the work on the BidAF model, it is a great addition to the zoo-as-a-test-suite.

I'm struggling a bit trying to get it through tract. I think there may be an encoding error that some backend implementation chose to ignore silently at the very end. Unless I'm missing something, of course.

248 PermuteAxes Transpose_22
  * input fact  #0: 245/0> 87x1xF32
  * output fact #0: 1x87xF32
249 LayerHardmax Hardmax_23
  * input fact  #0: 248/0> 1x87xF32
  * output fact #0: 1x87xF32
250 Cast Cast_24
  * input fact  #0: 249/0> 1x87xF32
  * output fact #0: 1x87xBool
  * Attr to: name: "to" type: INT i: 9
253 onnx.Compress Compress_27
  * input fact  #0: 252/0> 1x87xI32
    input fact  #1: 250/0> 1x87xBool
  * output fact #0: 1xI32 MODEL OUTPUT

According to its specification, the second input of onnx Compress is supposed to be of rank 1, but I think we get a 2-D input here. It comes from a Transpose -> Hardmax -> Cast sequence.

As far as I can tell Hardmax is not supposed to change the shape either, and Transpose definitely hints at a 2-D output, so...

Any help appreciated.

ke1337 commented 5 years ago

Glad the model would help you. Hardmax creates a one-hot tensor: over the length of 87, all elements are zero except for a 1 at the max-value position. Compress then uses the Hardmax output as its condition and extracts the value from the input at the max-value position. Because of the combination of Hardmax and Compress, the output dimension of Compress is 1.
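For intuition, here is a small numpy sketch of that Hardmax -> Cast -> Compress combination (numpy's `np.compress` behaves like ONNX Compress with a 1-D condition; a length-3 vector stands in for the length-87 one):

```python
import numpy as np

# Toy stand-in for the 87-element tensor from the model.
x = np.array([0.1, 0.9, 0.3], dtype=np.float32)

# Hardmax: one-hot vector with a 1 at the max-value position.
hardmax = (x == x.max()).astype(np.float32)

# Cast to bool, then use it as the Compress condition.
condition = hardmax.astype(bool)
values = np.array([10, 20, 30], dtype=np.int32)

# Compress extracts the single value at the max position.
picked = np.compress(condition, values)   # -> array([20], dtype=int32)
```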

kali commented 5 years ago

My problem here is with the "condition" input of Compress. I would be expecting a rank-1 boolean input, per spec here: https://github.com/onnx/onnx/blob/master/docs/Operators.md#inputs-17

ke1337 commented 5 years ago

You are right, condition of Compress needs to be 1D. There's a missing reshape for the condition. I've updated the model.
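A minimal numpy sketch of that fix, using the shapes from the dump above (the Cast output is (1, 87), while Compress needs a rank-1 condition):

```python
import numpy as np

# Cast output as dumped above: rank-2 boolean of shape (1, 87).
condition_2d = np.zeros((1, 87), dtype=bool)
condition_2d[0, 12] = True                 # pretend the max sits at position 12

# The missing Reshape: flatten the condition to rank 1, shape (87,).
condition_1d = condition_2d.reshape(-1)

values = np.arange(87, dtype=np.int32)
out = np.compress(condition_1d, values)    # -> array([12], dtype=int32)
```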

kali commented 5 years ago

Thanks for the quick fix !

kali commented 5 years ago

Sorry to re-open this, but I think the updated model may have another issue:

(ort) TSAR 12/06 13:20 ~/dev/snips/tract/harness/onnx-test-suite/debug-utils% python
Python 3.7.3rc1 (default, Mar 13 2019, 11:01:15)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import onnxruntime as rt;
>>> rt.InferenceSession("bidaf/bidaf.onnx")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kali/dev/snips/tract/harness/onnx-test-suite/debug-utils/ort/lib/python3.7/site-packages/onnxruntime/capi/session.py", line 29, in __init__
    self._sess.load_model(path_or_bytes)
RuntimeError: [ONNXRuntimeError] : 1 : GENERAL ERROR : The node is not placed on any Execution Provider

On the other hand the Compress condition shape issue is fixed.

ke1337 commented 5 years ago

Thanks for testing it. I used a new converter script that generates the Where op with int32 inputs, which is supported in onnxruntime 0.4.0. The model has been updated to use the Where op with float instead, and verified with onnxruntime 0.4.0.

benschreiber commented 5 years ago

I am having issues with Bidaf now. I have tried running the provided test data against onnxruntime 0.4.0, but it fails with

RuntimeError: Method run failed due to: [ONNXRuntimeError] : 1 : GENERAL ERROR : Non-zero status code returned while running Node: Convolution10253 Status Message: X num_dims does not match W num_dims. X: {10,8} W: {100,1,5,8}

ke1337 commented 5 years ago

@benschreiber please check if your data feed is in the right shape:

context_word: [seq, 1] of string
context_char: [seq, 1, 1, 16] of string
query_word: [seq, 1] of string
query_char: [seq, 1, 1, 16] of string
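A quick sanity check of those feed shapes can be sketched with dummy numpy arrays (seq=46 for context and seq=10 for query are example values; `dtype=object` stands in for string tensors):

```python
import numpy as np

# Dummy feeds matching the shape spec above; the seq lengths are examples.
context_word = np.empty((46, 1), dtype=object)
context_char = np.empty((46, 1, 1, 16), dtype=object)
query_word = np.empty((10, 1), dtype=object)
query_char = np.empty((10, 1, 1, 16), dtype=object)

# The word and char feeds must agree on their sequence length.
assert context_word.shape[0] == context_char.shape[0]
assert query_word.shape[0] == query_char.shape[0]
```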

benschreiber commented 5 years ago

That is not what I see in the provided test data. All are of the shape:

context_word (46, 1)
context_char (10, 1)
query_word (46, 1, 1, 16)
query_char (10, 1, 1, 16)

with different values replacing 46 and 10. Is my script wrong, or is the provided data wrong?

ke1337 commented 5 years ago

Please check README.md. context_word and context_char should have the same sequence length, which in your data is supposed to be 46. query_word and query_char should have length of 10.

Here's an example of using this model in onnxruntime:

import sys
import numpy as np
from nltk import word_tokenize
import onnxruntime

def preprocess(text):
    tokens = word_tokenize(text)
    # split into lower-case word tokens, in numpy array with shape of (seq, 1)
    words = np.asarray([w.lower() for w in tokens]).reshape(-1, 1)
    # split words into chars, in numpy array with shape of (seq, 1, 1, 16)
    chars = [[c for c in t][:16] for t in tokens]
    chars = [cs + [''] * (16 - len(cs)) for cs in chars]
    chars = np.asarray(chars).reshape(-1, 1, 1, 16)
    return words, chars

# input
context = 'A quick brown fox jumps over the lazy dog.'
query = 'What color is the fox?'
cw, cc = preprocess(context)
qw, qc = preprocess(query)

sess = onnxruntime.InferenceSession('bidaf.onnx')
answer = sess.run([], {'context_word':cw, 'context_char':cc, 'query_word':qw, 'query_char':qc})

# assuming answer contains the np arrays for start_pos/end_pos
start = np.asscalar(answer[0])
end = np.asscalar(answer[1])
print([w.encode() for w in cw[start:end+1].reshape(-1)])

benschreiber commented 5 years ago

I know what the data should look like. But the provided test data on that same page (under the "Download (with sample test data)" link) does not match. Please update the provided test data.

ke1337 commented 5 years ago

I see, you mean that input_0/1/2/3 do not correspond to the input order in the model. The test data is stored as onnx.TensorProto, so it matches the model inputs by name instead of by ordinal. The code below should work:

import onnxruntime
import onnx
from onnx import numpy_helper
sess = onnxruntime.InferenceSession('bidaf.onnx')
test_data_dir = 'test_data_set_1'
inputs = [onnx.load_tensor(test_data_dir + '/input_' + str(n) + '.pb') for n in range(4)]
feed = dict([(i.name, numpy_helper.to_array(i)) for i in inputs])
sess.run([], feed)

benschreiber commented 5 years ago

Yes, that appears to work. Thank you. Please consider updating the instructions here.