sonos / tract

Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference

Some tensorflow extensions for keras layers support #182

Closed CharlieBickerton closed 2 years ago

CharlieBickerton commented 5 years ago

I'm trying to load a model into Rust, and I'm getting an error when I run the model.

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: 
TractError(Msg("Translating #30 \"global_average_pooling1d/Mean\" Unimplemented(Mean)"), 
State { next_error: Some(TractError(Msg("Operator can not be made a TypedOp."), 
State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), 
backtrace: InternalBacktrace { backtrace: None } })

It seems that the mean operation of global_average_pooling1d is not supported. Does anyone know any more about this?

kali commented 5 years ago

Hey, the missing operator name here is "Mean", "global_average_pooling1d" is the node name.

Is this a TF or Onnx network? tract does have Mean in Onnx, but not in TF. Chances are it's easy enough to plug it in, though. I can have a look... is there any chance you can share the model (in an untrained form if you prefer) with me? I'm asking in case there are other ops that are missing.
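
As an aside, a quick way to see which op types a frozen graph actually contains (and so spot missing ones like Mean at a glance) is to parse the GraphDef directly; the file name below is just a placeholder:

    import tensorflow as tf

    # Read the frozen graph and list the distinct op types it uses
    graph_def = tf.compat.v1.GraphDef()
    with open("frozen_model.pb", "rb") as f:
        graph_def.ParseFromString(f.read())
    print(sorted({node.op for node in graph_def.node}))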

CharlieBickerton commented 5 years ago

Hey @kali ,

Thanks for the quick response. Yes, I'm using TensorFlow. I'm building and training the model with Keras, then freezing the trained model as a .pb.

This is the structure of the network:

    ip = tf.keras.layers.Input(shape=(num_timesteps, num_features), name="input_node")
    x = tf.keras.layers.Conv1D(128, 8, padding='same', kernel_initializer='he_uniform')(ip)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Activation('relu')(x)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    output = tf.keras.layers.Dense(2, activation='softmax', name="output_node")(x)
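
For reference, the freezing step mentioned above typically follows the standard Keras-to-frozen-graph recipe sketched below; this is an illustrative TF 2.x version, and the Model construction, helper import and output file name are assumptions rather than the exact code used here:

    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

    # Assemble the Keras model from the layers above
    model = tf.keras.Model(inputs=ip, outputs=output)

    # Wrap it in a concrete function and fold the trained variables into constants
    fn = tf.function(lambda t: model(t))
    concrete = fn.get_concrete_function(
        tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))
    frozen = convert_variables_to_constants_v2(concrete)

    # Serialize the frozen GraphDef to a .pb file
    tf.io.write_graph(frozen.graph.as_graph_def(), ".", "frozen_model.pb", as_text=False)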

Do you mean me to share the frozen untrained model?

kali commented 5 years ago

Notes on Mean.

TF Mean: same interface as Max.

Onnx: the ReduceXXX operators.

core.Reduce matches Onnx.
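
Concretely, the GlobalAveragePooling1D layer above is just a mean reduction over the time axis, which is why it shows up as a Mean node in the frozen graph (and maps to Onnx's ReduceMean). A quick illustrative check, with arbitrary shapes:

    import numpy as np
    import tensorflow as tf

    x = tf.random.normal([4, 16, 128])                 # (batch, timesteps, channels)
    gap = tf.keras.layers.GlobalAveragePooling1D()(x)  # emitted as a Mean node when frozen
    mean = tf.reduce_mean(x, axis=1)                   # the same reduction, written explicitly
    np.testing.assert_allclose(gap.numpy(), mean.numpy(), rtol=1e-5)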

kali commented 5 years ago

@CharlieBickerton yeah, I would appreciate the .pb. I'm not up to speed on training frameworks at all :)

CharlieBickerton commented 5 years ago

@kali What's the best way to share the file with you? GitHub doesn't support .pb uploads.

kali commented 5 years ago

@CharlieBickerton how about https://send.firefox.com/ ?

CharlieBickerton commented 5 years ago

https://send.firefox.com/download/96ec6b2c1e34a7e4/#0U9kTMpoRlXhwFDuy4VIBw @kali

kali commented 5 years ago

@CharlieBickerton Great! I'm on it :)

CharlieBickerton commented 5 years ago

Wicked thanks @kali !!! :)

kali commented 5 years ago

@CharlieBickerton I'm waiting on CI/benches before doing a release, but feel free to give the branch a try if you want.

CharlieBickerton commented 5 years ago

@kali I've just tried it and it works! Thank you so much for your help. I'm loving tract, thanks for open sourcing it!

CharlieBickerton commented 5 years ago

@kali I've been experimenting with my NN architecture and have come across another unsupported op.

TractError(Msg("Translating #47 \"dense/Tensordot/GatherV2\" Unimplemented(GatherV2)"), 
State { next_error: Some(TractError(Msg("Operator can not be made a TypedOp."),

What is your policy / capacity on adding new ops?

kali commented 5 years ago

The policy is vaguely defined. I covered a lot of the Onnx operator set without needing them, but on the TF side things are different because the operator set is so big. Basically, so far, for TF, it's "show me your network, I'll see if I can make it work".

Mean was a no-brainer, as it was very close to the one I had implemented in core to support Onnx. It's likely to be the same for GatherV2; I'll have a look.

There is stuff I don't want to include:

1/ Research-y operators and features. An amazing number of ops in TF are only used by a handful of research papers and not in real production networks, and I clearly do not have the bandwidth of the TensorFlow team.

2/ Ops that break a few assumptions I made that are deeply entrenched in tract core. The biggest such assumption is that the size of all variable tensors can be determined from the model and the input shape. As a matter of fact, this prevents me from fully supporting Onnx.

Hopefully GatherV2 is fine with respect to 2/, and it is definitely not a research operator but a basic building block, so I'm rather inclined to add it.

kali commented 5 years ago

TF GatherV2 takes 3 inputs: input, indices, axis (is it optional?). Onnx Gather takes 2 inputs, input and indices, with axis as an attribute. tract-core matches Onnx.

If the semantics are not crazy, reusing tract-core for TF GatherV2 should be relatively straightforward. Even trivial if axis is not given.
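
To illustrate the shared semantics, whichever way the axis is passed (values and axis below are arbitrary):

    import numpy as np
    import tensorflow as tf

    params = np.arange(12, dtype=np.float32).reshape(3, 4)
    indices = np.array([2, 0])

    # TF GatherV2: axis arrives as a (usually constant) third input
    tf_out = tf.gather(params, indices, axis=1)

    # Onnx Gather / tract-core: axis is a static attribute; same behaviour as np.take
    np_out = np.take(params, indices, axis=1)

    np.testing.assert_array_equal(tf_out.numpy(), np_out)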

kali commented 5 years ago

@CharlieBickerton could you share a model?

CharlieBickerton commented 5 years ago

OK @kali, thanks for the info. I agree it makes sense not to attempt to implement all of the TF ops, given the scale.

I'll share my model; it is an LSTM + CNN for time-series classification. It uses fairly common ops, so I think it makes sense to include them for future TF users of tract, unless they break rule 2.

Here's the model: https://send.firefox.com/download/403e8756fe583d55/#2fIL1k4vcePXwnRAE2YVMA

CharlieBickerton commented 5 years ago

@kali I accidentally sent a 3-layer CNN rather than an LSTM + CNN, but it was the one that produced this error. I'm having trouble freezing the model with an LSTM included.

On another note, I would say that the tensorflow.keras.layers APIs contain the most used and stable TensorFlow ops; they are now pushed as the default way to build networks in TF 2.0.

kali commented 5 years ago

Yeah, I agree that whatever keras.layers generates probably constitutes a sensible subset of tensorflow for tract to target. I'll be having a look at your 3-layer CNN model this morning.

Note that LSTM support in tract is also in its infancy, specifically on the TF side of things. We may need a few more iterations to get it to run :)

kali commented 5 years ago

GatherV2 implemented in #184

kali commented 5 years ago

@CharlieBickerton

Let's re-open this issue, as long as you're playing with tract and finding issues on a regular basis.

Do you need a release with GatherV2, or are you OK using a github dependency for a bit of time?

CharlieBickerton commented 5 years ago

@kali good idea - I'll keep this thread updated with issues I come across.

No rush on the release atm, merge it in when you have time.

kali commented 5 years ago

@CharlieBickerton FYI, 0.5.6 is out

CharlieBickerton commented 4 years ago

@kali Do you know if other users of tract quantise / optimise their networks before using them with tract? Using the TF Lite quantisation options requires converting to the .tflite format, which obviously isn't supported.

kali commented 4 years ago

Well, that's kind of the problem with quantized networks today: it's just a nice big mess.

• I am working on some support for Onnx quantized operators, and it will take a while.

• We have FakeQuantization operators in tensorflow that I think could help you emulate whatever tflite does (see the sketch below).

• I have no plan to accept tflite as an input format.
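
For what it's worth, the FakeQuantization operators mentioned above are presumably the TF FakeQuant* family, which round values to a low-precision grid while the graph itself stays in float; a minimal sketch, with arbitrary min/max/num_bits values:

    import tensorflow as tf

    x = tf.constant([-1.2, 0.0, 0.7, 3.5])
    # Emits a FakeQuantWithMinMaxArgs node: values are clamped to [min, max] and
    # rounded to an 8-bit grid, but the output dtype remains float32
    y = tf.quantization.fake_quant_with_min_max_args(x, min=-1.0, max=1.0, num_bits=8)
    print(y.numpy())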

kali commented 2 years ago

Closing, no activity for a while.