Unity-Technologies / barracuda-release


Support for Conv1D #43

Closed scode-cmd closed 4 years ago

scode-cmd commented 4 years ago

Hi!

We were wondering if the latest version supports 1D convolutions. We couldn't find any info about this type of layer in the docs. Conv1D layers are very useful for sequence / temporal data, e.g. audio, 3D character animations, text sequences, etc.

scode-cmd commented 4 years ago

We are using Keras for development.

AlexRibard commented 4 years ago

Hi! Conv1D is supported; only 3D convolutions are not. As long as the model gets exported correctly to ONNX, it should be good :)

scode-cmd commented 4 years ago

Awesome, thanks! Perhaps this can be added to the "Approximate list of TensorFlow nodes supported by Barracuda script converter" section in the docs?

AlexRibard commented 4 years ago

Yes, it will. We are currently working on improving the docs. Thanks for the feedback!

AlexRibard commented 4 years ago

A clarification on my earlier message: if you convert your Keras model using keras_to_barracuda.py or tensorflow_to_barracuda.py, Conv1D is indeed not supported. However, exporting your model to an .onnx file (with keras2onnx) should work. https://github.com/onnx/keras-onnx

scode-cmd commented 4 years ago

Why not recommend that everyone use keras2onnx instead of maintaining a separate converter inside Barracuda? Or is there a specific need / advantage for that?

AlexRibard commented 4 years ago

Legacy reasons, both internal and from the time when we didn't support ONNX. Going forward, we recommend converting models to .onnx (with tf2onnx or keras2onnx) and importing those into Unity. I recognize the confusion, though; the docs will be updated.

scode-cmd commented 4 years ago

Ok, thanks for the clarification!

marns commented 4 years ago

Hi @AlexRibard, can you clarify availability/usage of Conv1d support? Using the release/1.0.0 branch, my ONNX Conv1D layers are being recognized as Conv2D (which then fail to import properly). I'm not seeing explicit Conv1D types in Model.cs or elsewhere.

AlexRibard commented 4 years ago

Hi @marns. Even though we don't have a specific code path for Conv1D, I assumed they would import correctly as Conv2D (with width = 1). But I recently realized we have some import errors when parsing Conv1D inputs. They will be addressed. In the meantime, I would recommend writing out your Conv1D as a Conv2D (with width = 1).

Sorry for the confusion
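
To illustrate the suggested workaround, here is a minimal pure-Python sketch (not Barracuda code) showing that a 1D convolution gives the same values as a 2D convolution over the same data reshaped to width = 1. The helper names `conv1d` and `conv2d` are hypothetical, and the sketch assumes a single channel, a single filter, and no padding.

```python
def conv1d(seq, kernel, stride=1):
    """Valid (no padding) 1D cross-correlation over a list of floats."""
    k = len(kernel)
    return [
        sum(seq[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(seq) - k + 1, stride)
    ]


def conv2d(image, kernel, stride=(1, 1)):
    """Valid 2D cross-correlation; image and kernel are lists of rows.

    stride is given as (height, width).
    """
    kh, kw = len(kernel), len(kernel[0])
    sh, sw = stride
    out = []
    for r in range(0, len(image) - kh + 1, sh):
        row = []
        for c in range(0, len(image[0]) - kw + 1, sw):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [1.0, 0.0, -1.0]

# Reshape: height = sequence length, width = 1.
image = [[v] for v in seq]
kernel_2d = [[w] for w in kernel]

out_1d = conv1d(seq, kernel, stride=2)
out_2d = conv2d(image, kernel_2d, stride=(2, 1))

# Flatten the single-column 2D result to compare with the 1D result.
flat_2d = [row[0] for row in out_2d]
print(out_1d, flat_2d)
```

Note that the 1D stride ends up on the height axis and the width stride is 1, so the order of the stride pair matters once the model is rewritten this way.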

marns commented 4 years ago

Thanks @AlexRibard - FYI to get this to work, I also had to swap the stride dimensions in the ONNX importer. It's listed as (width, height) but I think it should be (height, width) to match the other kernel dimensions. My model is exported from PyTorch.

AlexRibard commented 4 years ago

I pushed support for it on master. It should be available in the upcoming release. @marns do you have a specific code snippet to illustrate your point?

marns commented 4 years ago

@AlexRibard Here's the hack I used to swap stride dimensions, which works for my (4,1) and (2,1) kernels/strides.

I also tried swapping at runtime instead, which gave correct output dimensions but not correct results, so probably I missed something.

+++ b/Barracuda/Runtime/Core/ModelBuilder.cs
@@ -293,6 +293,9 @@ namespace Unity.Barracuda
         /// </summary>
         public Layer Conv2D(string name, object input, Int32[] stride, Int32[] pad, Tensor kernel, Tensor bias)
         {
+           int temp = stride[0];
+           stride[0] = stride[1];
+           stride[1] = temp;           
             return Conv(name, Layer.Type.Conv2D, input, stride, pad, new int[0], kernel, bias);
         }

@@ -368,6 +371,9 @@ namespace Unity.Barracuda
         /// </summary>
         public Layer MaxPool2D(string name, object input, Int32[] pool, Int32[] stride, Int32[] pad)
         {
+           int temp = stride[0];
+           stride[0] = stride[1];
+           stride[1] = temp;
             return Pool(Layer.Type.MaxPool2D, name, input, pool, stride, pad);
         }

AlexRibard commented 4 years ago

Here is the code in trunk

ONNXModelImporter.cs

Add("Conv", (net, node)     => {
                int[] dilations = new[] { 1, 1 }; // @TODO trap on wrong values
                int[] strides = node.Strides;
                int[] pads = node.Pads;

                node.IgnoredAttribute("kernel_shape", "Kernel shape is derived from K tensor weights instead");
                var kernels = node.Input1Constant(onnxLayout: "KCHW", name: "W");

                var kernelRank = node.Input1Rank;
                if (kernelRank == 3) // Conv1D
                {
                    dilations = node.DilatationsOptional(new[] { 1 }); // @TODO trap on wrong values
                    Debug.Assert(dilations.Length == 1);
                    dilations = new[] { dilations[0], 1 };

                    if (strides.Length == 1)
                        strides = new[] { strides[0], 1 };

                    if (pads.Length == 2)
                        pads = new[] { pads[0], 0, pads[1], 0 };
                }
                else if (kernelRank == 4) // Conv2D
                {
                    dilations = node.DilatationsOptional(new[] { 1, 1 });
                    Debug.Assert(dilations.Length == 2);
                }
                else
                {
                    Warn(net, node, $"Unsuported Conv kernel rank. Conv2D assumes rank 4, Conv1D assumes rank 3, but got {kernelRank}.");
                }

                Debug.Assert(dilations.Length == 2);
                if (dilations[0] != 1 || dilations[1] != 1)
                    kernels = DilateKernel(kernels, dilations); // @TODO inefficient method. Support dilatation in kernel code properly

                var biases = node.Input2ConstantOptional(Bias(kernels.shape), 0.0f, onnxLayout: "C", name: "B");

                if (node.GroupOptional() > 1)
                    net.DepthwiseConv2D(node.Name, node.Input0, strides, pads, kernels, biases);
                else
                    net.Conv2D(node.Name, node.Input0, strides, pads, kernels, biases);

                Output(node, features: kernels.channels);
});

ONNXNodeWrapper.cs

        public int Input1Rank { get { return m_ONNXModelTensors.variables[Input1].rank; } }
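
The rank-based promotion in the importer snippet above can be restated as a small standalone Python sketch. This is only an illustration of the same attribute padding (stride `[s]` to `[s, 1]`, pads `[p0, p1]` to `[p0, 0, p1, 0]`, dilation `[d]` to `[d, 1]`); the function name `promote_conv_attrs` is hypothetical, and unlike the C# code, which only logs a warning for other ranks, this sketch raises an error.

```python
def promote_conv_attrs(kernel_rank, strides, pads, dilations=None):
    """Pad Conv1D attributes to the 2D form used for Conv2D.

    kernel_rank 3 is treated as Conv1D, 4 as Conv2D, mirroring the
    rank check in the importer snippet above.
    """
    strides = list(strides)
    pads = list(pads)

    if kernel_rank == 3:  # Conv1D
        dilations = list(dilations) if dilations is not None else [1]
        if len(dilations) != 1:
            raise ValueError("Conv1D expects exactly one dilation value")
        dilations = [dilations[0], 1]
        if len(strides) == 1:
            strides = [strides[0], 1]
        if len(pads) == 2:
            # (begin, end) becomes (begin, 0, end, 0)
            pads = [pads[0], 0, pads[1], 0]
    elif kernel_rank == 4:  # Conv2D
        dilations = list(dilations) if dilations is not None else [1, 1]
        if len(dilations) != 2:
            raise ValueError("Conv2D expects exactly two dilation values")
    else:
        # Assumption: fail loudly here instead of warning and continuing.
        raise ValueError(f"Unsupported Conv kernel rank: {kernel_rank}")

    return strides, pads, dilations


result_1d = promote_conv_attrs(3, strides=[2], pads=[1, 1])
result_2d = promote_conv_attrs(4, strides=[2, 2], pads=[0, 0, 0, 0], dilations=[1, 2])
print(result_1d)
print(result_2d)
```

For a Conv1D node with stride 2 and padding 1 on each side, this yields strides `[2, 1]`, pads `[1, 0, 1, 0]`, and dilations `[1, 1]`, while Conv2D attributes pass through unchanged.
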

marns commented 4 years ago

Thanks for sharing, looking forward to the full Conv1d support.

Just to clarify, the issue I'm describing is with a Conv2d where e.g. stride height=4, width=1 is treated as width=4, height=1, which produces wrong output sizes/results.
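
As a rough illustration of why the stride order matters, the sketch below applies the standard convolution output-size formula, floor((in + 2*pad - kernel) / stride) + 1, to a (4, 1) kernel and stride. The input size of 64 x 1 and the helper names are assumptions made for the example; they do not come from the model discussed here.

```python
def conv_output_size(in_size, kernel, stride, pad=0):
    """Standard output-size formula for one spatial axis."""
    return (in_size + 2 * pad - kernel) // stride + 1


def output_shape(in_hw, kernel_hw, stride_hw):
    """Output (height, width) for unpadded input, kernel, and stride pairs."""
    return tuple(
        conv_output_size(i, k, s)
        for i, k, s in zip(in_hw, kernel_hw, stride_hw)
    )


in_hw = (64, 1)      # hypothetical input: height 64, width 1
kernel_hw = (4, 1)   # kernel height 4, width 1
stride = [4, 1]      # exported as (height, width)

# Stride read in the intended (height, width) order.
correct = output_shape(in_hw, kernel_hw, (stride[0], stride[1]))

# Stride read as (width, height), i.e. with the two values swapped.
swapped = output_shape(in_hw, kernel_hw, (stride[1], stride[0]))

print(correct, swapped)
```

With these assumed sizes the intended order gives a height of 16, while the swapped order gives a height of 61, which matches the kind of wrong output size described above.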