microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

CNTK C# MaxUnpooling and ConvolutionTranspose examples #2939

Open shilonosov opened 6 years ago

shilonosov commented 6 years ago

I am trying to build an autoencoder network in C# using CNTK. I need the MaxUnpooling and ConvolutionTranspose operators in its decoder part.

I saw an example of autoencoder in Python here: https://docs.microsoft.com/en-us/cognitive-toolkit/Image-Auto-Encoder-Using-Deconvolution-And-Unpooling

Could anyone please provide an example of how to create unpooling and convolution transpose layers in C#?

shilonosov commented 6 years ago

I was trying the following for convolution transpose:

        const double ConvWScale = 0.26;
        var outShape = NDShape.CreateNDShape(new[] { outWidth, outHeight, outFeatureMapCount });

        var convolutionMap = new Parameter(
            new[] { outWidth, outHeight, outFeatureMapCount },
            DataType.Float,
            CNTKLib.GlorotUniformInitializer(ConvWScale, -1, 2),
            device,
            name);

        var result = CNTKLib.ConvolutionTranspose(
            convolutionMap,
            source,
            new[] { hStride, vStride, outFeatureMapCount },
            new BoolVector() { false },
            new BoolVector() { true },
            outShape);

Given a source of shape (7, 7, 64), with outWidth = 14, outHeight = 14, outFeatureMapCount = 32, hStride = 2, vStride = 2.

I got the following exception:

System.ApplicationException: 'Convolution transpose: The shape '[7 x 7 x 64]' of the convolution transpose operand 'Output('Unpooling111_Output_0', [7 x 7 x 64], [*, #])' is different than the resulting shape '[7 x 7 x 1]' from convolving the specified output shape '[14 x 14 x 32]' using the provided options.

What parameters should I use to get the following convolution transpose: (7, 7, 64) -> (14, 14, 32)?
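From the error text it looks like the resulting shape is computed from the kernel's spatial size, so my guess (untested; the 4x4 kernel size and the [kernelW, kernelH, O, I] layout are assumptions on my part) is that the Parameter needs the kernel's dimensions rather than the output's, something like:

        // Untested sketch: kernel shape is [kernelW, kernelH, outChannels, inChannels],
        // not [outWidth, outHeight, outChannels]. The 4x4 window is an assumption.
        const int kernelWidth = 4, kernelHeight = 4;
        const int inFeatureMapCount = 64;          // channels of the (7, 7, 64) source

        var convolutionMap = new Parameter(
            new[] { kernelWidth, kernelHeight, outFeatureMapCount, inFeatureMapCount },
            DataType.Float,
            CNTKLib.GlorotUniformInitializer(ConvWScale, -1, 2),
            device,
            name);

        var result = CNTKLib.ConvolutionTranspose(
            convolutionMap,
            source,
            new[] { hStride, vStride, 1 },          // stride 1 on the channel axis
            new BoolVector() { false },             // sharing
            new BoolVector() { true, true, false }, // pad spatially, not over channels
            NDShape.CreateNDShape(new[] { outWidth, outHeight, outFeatureMapCount }));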

shilonosov commented 6 years ago

Right. So, with the help of this post I came up with the following code:

        var sourceShape = NDShape.CreateNDShape(new [] {2, 2, 1});
        var convolutionMap = new Parameter(
            sourceShape,
            DataType.Float,
            CNTKLib.ConstantInitializer(1d),
            device,
            "conv transpose map");
        return CNTKLib.ConvolutionTranspose(convolutionMap, source, sourceShape);

YanYas commented 6 years ago

This function seems to have some caveats.

I can use it to upsample an image with one channel, for instance, 14x14x1 to 28x28x1.

I can also set the number of filters I want to create, such as 14x14x1 to 28x28x128.

But I cannot convolution-transpose 3D tensors that have more than one channel.

I am trying to convolve transpose from 14x14x8 to 28x28x8: Kernel: 4x4x1, Stride: 2x2x1, OutputShape: 28x28x8, Auto-padding: (True, True, False)

The module compiles, showing that I have the correct shape (which I input), but I get different errors depending on which device I use.

GPU gives me:

cuDNN failure 8: CUDNN_STATUS_EXECUTION_FAILED ; GPU=0 ; hostname=LEO_SAYER ; expr=workspaceSizeFinder()

[CALL STACK]

Microsoft::MSR::CNTK::CudaTimer:: Stop

  • Microsoft::MSR::CNTK::CudaTimer:: Stop (x3)
  • std::enable_shared_from_this:: shared_from_this (x2)
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • std::enable_shared_from_this:: shared_from_this
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • CNTK::Function:: Forward
  • CNTK:: CreateTrainer
  • CNTK::Trainer:: TotalNumberOfUnitsSeen
  • CNTK::Trainer:: TrainMinibatch (x2)
  • CSharp_CNTK_TrainerTrainMinibatchSWIG_0
  • 00007FFDA0368D77 (SymFromAddr() error: The specified module could not be found.)

While CPU gives me:

Mismatching outputShape and mapCount

[CALL STACK]

Microsoft::MSR::CNTK::Matrix:: Resize

  • Microsoft::MSR::CNTK::ConvolutionEngine:: Geometry
  • Microsoft::MSR::CNTK::ConvolutionEngine:: MaxUnpooling
  • Microsoft::MSR::CNTK::ConvolutionEngine:: BackwardData
  • std::enable_shared_from_this:: shared_from_this (x2)
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • std::enable_shared_from_this:: shared_from_this
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • CNTK::Function:: Forward
  • CNTK:: CreateTrainer
  • CNTK::Trainer:: TotalNumberOfUnitsSeen
  • CNTK::Trainer:: TrainMinibatch (x2)
  • CSharp_CNTK_TrainerTrainMinibatchSWIG_0
  • 00007FFDA0368D77 (SymFromAddr() error: The specified module could not be found.)

An earlier error I came across showed up when attempting to train on the CPU, trying to go from 7x7x2 to 7x7x16 (784 values):

New Error: GEMM convolution engine does not support this convolution configuration. It is possible to make GEMM engine work with this configuration by defining input/output/kernel using tensors of higher(+1) dimension. Geometry Input: 7 x 7 x 16, Output: 7 x 7 x 2, Kernel: 4 x 4 x 14, Map: 1 x 1 x 1, Stride: 1 x 1 x 2, Sharing: (1, 1, 1), AutoPad: (1, 1, 0), LowerPad: 0 x 0 x 0, UpperPad: 0 x 0 x 0

This was fixed by using the GPU.

I suspect that if I can make the OutputShape and MapCount equal, that will fix it. But what is the MapCount?

BowenBao commented 6 years ago

Hi @YanYas, for the following example, could you try using Kernel: 4x4x8x8 ([w2, w1, O, I])? The first 8 (O) is the channel axis and should match the channel dimension of the output shape, 28x28x8. The second 8 (I) is the map count, which should match that dimension of the input shape, 14x14x8.

I am trying to convolve transpose from 14x14x8 to 28x28x8: Kernel: 4x4x1, Stride: 2x2x1, OutputShape: 28x28x8, Auto-padding: (True, True, False)

Edit: Updated the description on kernel channel and map count axis.
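Spelled out in C#, the suggestion would look roughly like this (an untested sketch; the variable names and the initializer are placeholders, only the kernel shape comes from the description above):

        // Untested sketch: kernel laid out as [w2, w1, O, I] = [4, 4, 8, 8].
        var kernel = new Parameter(
            new[] { 4, 4, 8, 8 },                   // 4x4 window, O=8 out, I=8 in channels
            DataType.Float,
            CNTKLib.GlorotUniformInitializer(),
            device,
            "convTransposeKernel");

        var upsampled = CNTKLib.ConvolutionTranspose(
            kernel,
            input,                                  // shape (14, 14, 8)
            new[] { 2, 2, 1 },                      // stride stays 1 on the channel axis
            new BoolVector() { false },             // sharing
            new BoolVector() { true, true, false }, // Auto-padding: (True, True, False)
            NDShape.CreateNDShape(new[] { 28, 28, 8 }));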

YanYas commented 6 years ago

Hi @BowenBao,

This works, but now if I try increasing the stride from 2x2x1 to 2x2x2 I expect to get 28x28x16, but instead I get an error:

01:10:23 ERR : Exception occured in TMPluginWrapperNode.Evaluate 01:10:23 ERR : Convolution transpose: The shape '[14 x 14 x 8]' of the convolution transpose operand 'Output('Convolution5795297_Output_0', [14 x 14 x 8], [*, #])' is different than the resulting shape '[14 x 14 x 40]' from convolving the specified output shape '[28 x 28 x 16]' using the provided options.

[CALL STACK]

CNTK::NDMask:: MaskedCount

  • CNTK::NDMask:: MaskedCount
  • CNTK::Function:: ~Function
  • CNTK_ReleaseModel
  • RtlRunOnceExecuteOnce
  • InitOnceExecuteOnce
  • _crtInitOnceExecuteOnce
  • CNTK::Function:: InitOutputs
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • CNTK::Function:: ~Function
  • CNTK_ReleaseModel
  • RtlRunOnceExecuteOnce
  • InitOnceExecuteOnce
  • _crtInitOnceExecuteOnce
  • CNTK::Function:: InitOutputs
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD

Do I need to change the size of the kernel channel? If so, is there a calculation I can apply for these changes?

BowenBao commented 6 years ago

Stride != 1 is not allowed for the channel axis. If you want to use stride 2x2x2 and get 28x28x16, you have to treat that axis as a spatial axis.

You can refer to convolution_transpose for more details.
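As a sanity check, the spatial output size follows the usual transposed-convolution formula (my reading of the docs, worth double-checking): out = (in - 1) * stride + kernel - 2 * pad per spatial axis; with auto-padding on, CNTK targets in * stride.

        // Sketch: per-axis output size of a transposed convolution
        // (double-check against the convolution_transpose docs).
        static int ConvTransposeOutDim(int inDim, int kernel, int stride, int pad)
            => (inDim - 1) * stride + kernel - 2 * pad;

        // 14 -> 28 with a 4x4 kernel, stride 2, padding 1:
        // ConvTransposeOutDim(14, 4, 2, 1) == 28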

FlorisDevreese commented 6 years ago

Hi,

I'm building an upsampling layer to change an image's dimensions from 320x320x3 to 639x639x3.

        var convolutionMap = new Constant(new[] { 3, 3, 3, 3 }, DataType.Float, 1f);
        var model = CNTKLib.ConvolutionTranspose(
            convolutionMap,
            input,
            new[] { 2, 2, 1 },
            new BoolVector() { false },
            new BoolVector() { true, true, false });

When running model.Evaluate(...) I get the following error:

About to throw exception 'Convolution weight matrix Constant1 should have dimension [(filter shape) x (input channels) x (output channels)]' Validating --> Convolution2 = Convolution (Constant1, Input0) : [3 x 3 x 3 x 3], [320 x 320 x 3 x 2] -> [639 x 639 x 3 x 2] FAILED

I can't figure out how to set the arguments of CNTKLib.ConvolutionTranspose(...).

Any help? Thanks in advance!