sonos / tract

Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference

Unable to run MobileBert QA #1288

Closed rockwotj closed 8 months ago

rockwotj commented 8 months ago

Hello! I'm attempting to run the tflite MobileBert model for QA in tract (and ultimately in WebAssembly). However, I've hit a snag: I can't get the model to run and produce the same outputs as tflite itself.

Here is the code: https://github.com/rockwotj/tract-tests. I've ported the tflite BERT QA example from Swift to Rust and used tract to run the models.

The tflite model doesn't work; I get the following error:

Error: Translating proto model to model

Caused by:
    0: Parsing op Operator {
           opcode_index: 11,
           inputs: Some(
               [
                   5423,
               ],
           ),
           outputs: Some(
               [
                   1,
               ],
           ),
           builtin_options_type: CastOptions,
           builtin_options: CastOptions {
               in_data_type: INT32,
               out_data_type: FLOAT32,
           },
           custom_options: None,
           custom_options_format: FLEXBUFFERS,
           mutating_variable_inputs: Some(
               [],
           ),
           intermediates: None,
       }
    1: Unsupported: OperatorCode {
           deprecated_builtin_code: 53,
           custom_code: None,
           version: 1,
           builtin_code: ADD,
       }, inputs: [
           1,384,I32,
       ]

Okay, so I converted it to ONNX, and that loads! The inputs to the model are the same (I debugged the tflite code and confirmed that the input tensors given to tflite and to the converted ONNX model run by tract are identical), but the outputs differ, and the answers the tract-run model gives back are nonsense. I tried a different underlying engine (ort) and got the same results: tract and ort agree with each other, while tflite gives something else. So I assume the tflite->onnx conversion is doing something wrong...
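The comparison described above (identical inputs, outputs taken from two runtimes) can be sketched in plain Rust as follows. This is only an illustration: `max_abs_diff` and `argmax` are hypothetical helper names, not part of tract or ort, and it assumes each runtime's output (for example the start or end logits) has already been copied into a flat `f32` slice.

```rust
/// Largest absolute element-wise difference between two output tensors.
/// Returns None when the lengths differ, which already signals a mismatch.
fn max_abs_diff(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    Some(
        a.iter()
            .zip(b)
            .map(|(x, y)| (x - y).abs())
            .fold(0.0_f32, f32::max),
    )
}

/// Index of the largest value, e.g. the predicted start or end token
/// of the answer span. Returns None for an empty slice.
fn argmax(v: &[f32]) -> Option<usize> {
    v.iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

A small `max_abs_diff` with matching `argmax` positions would point to ordinary numerical noise, whereas a large difference or different `argmax` positions would be consistent with the model files themselves not being equivalent.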

Is there a recommended way to achieve what I'm doing? Or a different tool to convert tflite into another format? Thanks in advance for any help!

rockwotj commented 8 months ago

I was able to convert mobile bert to nnef using https://github.com/KhronosGroup/NNEF-Tools/tree/main/nnef_tools but get the following error:

Error: In ModelBuilder::translate

Caused by:
    0: Wiring root graph body
    1: Plugging in assignement for "cast1"
    2: Resolving invocation Identifier("cast")
    3: No definition for operator Identifier("cast")

kali commented 8 months ago

Hey man, can you please upload the ONNX model and link it somewhere? It would also help if you could prepare some inputs in .npz form (see https://github.com/sonos/tract/blob/main/doc/cli-recipe.md#running-a-test-case for... inspiration).

We'll get to the tflite version in time, I guess. But ONNX should be easier, as its support in tract is much more mature.

rockwotj commented 8 months ago

I have steps to create the model in the README: https://github.com/rockwotj/tract-tests

But I have uploaded the ONNX model (converted following the steps in the README) and the tflite model (from tfhub) in a release here: https://github.com/rockwotj/tract-tests/releases

I suspect it's more of a problem with the tflite -> onnx conversion, as ONNX Runtime itself (via ort) was giving me the same results as tract.

Is there a way to create an .npz file from Rust? I don't really work much with Python, but running main in that repo gives some sample inputs and prints the outputs, if that's helpful.
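On the .npz question: an .npz file is a zip archive whose members are .npy files, so the array encoding itself is simple enough to produce without Python. The following is a minimal sketch, not a tract API: `npy_bytes_i32` is a hypothetical helper that serializes a row-major `i32` tensor (such as the `1,384,I32` inputs shown in the error above) as NPY version 1.0 bytes, using only the standard library. Packing the resulting members into a zip archive is left out and would need a zip library.

```rust
use std::io;

/// Serialize a row-major i32 tensor as NPY v1.0 bytes (little-endian).
/// Hypothetical helper for illustration; not part of tract.
fn npy_bytes_i32(shape: &[usize], data: &[i32]) -> io::Result<Vec<u8>> {
    let expected: usize = shape.iter().product();
    if expected != data.len() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "shape does not match data length",
        ));
    }

    // The shape is written as a Python tuple literal: "()", "(3,)", "(1, 384)".
    let dims: Vec<String> = shape.iter().map(|d| d.to_string()).collect();
    let shape_str = match dims.len() {
        1 => format!("({},)", dims[0]),
        _ => format!("({})", dims.join(", ")),
    };
    let mut header = format!(
        "{{'descr': '<i4', 'fortran_order': False, 'shape': {}, }}",
        shape_str
    );

    // Pad with spaces so that the 10 prefix bytes plus the header
    // (including its final newline) total a multiple of 64 bytes.
    let unpadded = 10 + header.len() + 1;
    let pad = (64 - unpadded % 64) % 64;
    header.push_str(&" ".repeat(pad));
    header.push('\n');

    let mut out = Vec::with_capacity(10 + header.len() + data.len() * 4);
    out.extend_from_slice(b"\x93NUMPY"); // magic string
    out.extend_from_slice(&[1, 0]); // format version 1.0
    out.extend_from_slice(&(header.len() as u16).to_le_bytes());
    out.extend_from_slice(header.as_bytes());
    for v in data {
        out.extend_from_slice(&v.to_le_bytes());
    }
    Ok(out)
}
```

The returned bytes can be written to disk with `std::fs::write`, one file per input tensor. The member names expected inside the .npz depend on the model's input names, so those are not assumed here.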

rockwotj commented 8 months ago

Actually, I am getting better results with the ONNX model here: https://huggingface.co/csarron/mobilebert-uncased-squad-v2/tree/main than with the one converted from tfhub.

rockwotj commented 8 months ago

So I don't think this is an issue with tract, just with the converted model. I'm going to close this.