microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

Trouble Loading in Python trained model into C# #3300

Closed oridayne closed 6 years ago

oridayne commented 6 years ago

I have trained a model in Python and am trying to load it into C#.

I call Function.Load(path, device) and print the inputs and outputs. The loaded model reports two inputs and five outputs, which confuses me, because the model I trained in Python should take only one input (a one-hot vector) and produce one output (also a vector).

When I print various attributes of the model in Python, I get:

    print(model.argument_map)
    Composite(Sequence[Tensor[1]]) -> Tensor[2]
    print(model.arguments)
    (Input('Input1685923', [#, *], [1]),)

I'm really unfamiliar with the notation, but it seems that it's just a single variable with shape=1, which is what I defined it as earlier: X = C.sequence.input_variable(shape=(1))

As a result, I'm really confused why there are two inputs required for the C# loaded version, and why there are 5 required for the outputs.

I tried passing the input and output variables in dictionaries and evaluating the model in C#, but none of the numbers look remotely like what I should be getting, and they are not even in the expected format.

Does anyone have any tips for loading models trained in Python into C#?

Tixxx commented 6 years ago

Hi @oridayne, could you share the model that you are building? I just tried loading a simple Python model in C#, and the inputs and outputs are as expected. Python model:

    import cntk as C
    f = C.layers.BatchNormalization(map_rank=1)
    f.update_signature((3,480,640))
    f

Output of this line:

    Composite(BatchNormalization): Input('x', [#], [3 x 480 x 640]) -> Output('Block58_Output_0', [#], [3 x 480 x 640])

    f.save("test.model")

When I load test.model into C#, it shows 6 inputs and 1 output. The 6 inputs are because the batch normalization is defined to have default parameters like this:

    def BatchNormalization(map_rank=default_override_or(None),
                           init_scale=1,
                           normalization_time_constant=default_override_or(5000),
                           blend_time_constant=0,
                           epsilon=default_override_or(0.00001),
                           use_cntk_engine=default_override_or(False),
                           disable_regularization=default_override_or(False),
                           name=''):

Maybe you can check the functions you are using and see if they contain default parameters?

oridayne commented 6 years ago

Hey Tix, thanks for replying, and that's a pretty good idea. The current model I was testing on was taken from someone else's notebook so I don't have a complete understanding of it.

I have actually tried to test this on another, simpler Python CNTK model: https://github.com/Microsoft/CNTK/blob/v2.5.1/Tutorials/CNTK_101_LogisticRegression.ipynb

I prefer to show this model over mine because it's well documented, and exhibits similar characteristics to the problem I described above. However, I had trouble finding where these default parameters were set, but I did discover some newer things...

The inputs and outputs of the Python model are: Composite(Tensor[2]) -> Tensor[2]. I take the output vector and apply C.argmax([x,y]), which returns 0 or 1 as my prediction result.

The C# implementation also shows 2 inputs, and 5 outputs.

I used the code below, taking just the first input and the first output: `
    model = Function.Load(modelPath, DeviceDescriptor.CPUDevice);
    List<float> vect = vectorizeName(name); // convert to one-hot vector
    IList<Variable> inputs = model.Arguments;
    IList<Variable> output = model.Outputs;

    // specify input shape
    var inputDataMap = new Dictionary<Variable, Value>();
    var inputVal = Value.CreateSequence(inputs[0].Shape, vect, DeviceDescriptor.CPUDevice);
    inputDataMap.Add(inputs[0], inputVal);

    // specify output shape
    var outputDataMap = new Dictionary<Variable, Value>();
    outputDataMap.Add(output[0], null);

    model.Evaluate(inputDataMap, outputDataMap, DeviceDescriptor.CPUDevice);

    var outputVal = outputDataMap[output[0]];
    var outputData = outputVal.GetDenseData<float>(output[0]);
    var result = outputData[0];`

The result is a vector with 2 values. While the 2 values differ from the numbers I get from the Python model, the argmax of the vector gives the expected result (tested 100 times). For example, the Python model would return [-28.50, 2.567] while the C# model would return [-3.56, 3.01], but the argmax of both is 1.
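A matching argmax on its own does not show that the two evaluations agree, so comparing the raw values is a stricter check. The following is a minimal sketch in plain Python using the two example vectors quoted above; the helper names `argmax` and `softmax` are illustrative and not part of CNTK.

```python
import math


def argmax(values):
    """Return the index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])


def softmax(values):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


python_output = [-28.50, 2.567]  # values reported from the Python model
csharp_output = [-3.56, 3.01]    # values reported from the C# evaluation

# The predicted class agrees...
same_class = argmax(python_output) == argmax(csharp_output)

# ...but the raw scores are far apart, which hints that the two
# runs were not fed the same input or not evaluated the same way.
max_abs_diff = max(abs(a - b) for a, b in zip(python_output, csharp_output))

print("same class:", same_class)
print("largest absolute difference:", max_abs_diff)
print("softmax (Python):", softmax(python_output))
print("softmax (C#):", softmax(csharp_output))
```

If the inputs and the evaluation path were identical, the raw values would be expected to match up to floating-point tolerance, not just in their argmax.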

I haven't tried using the two inputs and 5 outputs, and I'm not sure how to even use them in the eval functions. I'll try to search through the logistic regression model for the layer that defines the default parameters.

ke1337 commented 6 years ago

Please note that data in C++/C# are column major, while the data in Python is row-major.

If you are building from source, please comment out this line in the C# SWIG binding and use CNTKLib.SetComputationNetworkTraceLevel(1000000) to dump the output of nodes from C#, then compare with the dumps in Python using set_computation_network_trace_level(1000000).
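To illustrate the row-major versus column-major point, here is a small sketch in plain Python (no CNTK calls; the function names are hypothetical) that flattens the same 2 x 3 matrix in both orders.

```python
def flatten_row_major(matrix):
    """Walk each row left to right, one row after another."""
    return [x for row in matrix for x in row]


def flatten_column_major(matrix):
    """Walk each column top to bottom, one column after another."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


matrix = [[1, 2, 3],
          [4, 5, 6]]

row_major = flatten_row_major(matrix)
column_major = flatten_column_major(matrix)

print(row_major)     # [1, 2, 3, 4, 5, 6]
print(column_major)  # [1, 4, 2, 5, 3, 6]
```

For a one-dimensional sample such as a single one-hot vector, both orders produce the same flat buffer, so the layout only matters once a sample has two or more dimensions.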

oridayne commented 6 years ago

The C++/C# column-major ordering was not the issue.

In the end my numbers were right; I just had to use CreateBatch instead of CreateSequence. The model now evaluates correctly in C#. As long as I take only the first input and the first output from the C# model for evaluation, it works.

Thank you for commenting!
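For readers who hit the same problem, the sketch below models the difference between the two calls with plain Python lists. It is an illustration of the data layout as I understand it, not CNTK code: `as_batch` and `as_sequence` are hypothetical helpers, and the assumption is that CreateBatch treats the flat buffer as independent samples (each a sequence of length 1) while CreateSequence treats it as the consecutive steps of one sequence.

```python
def split_samples(flat_values, sample_size):
    """Cut a flat buffer into consecutive samples of sample_size elements."""
    if sample_size <= 0 or len(flat_values) % sample_size != 0:
        raise ValueError("buffer length must be a multiple of the sample size")
    return [flat_values[i:i + sample_size]
            for i in range(0, len(flat_values), sample_size)]


def as_batch(flat_values, sample_size):
    """Assumed CreateBatch layout: N sequences, each holding a single sample."""
    return [[sample] for sample in split_samples(flat_values, sample_size)]


def as_sequence(flat_values, sample_size):
    """Assumed CreateSequence layout: one sequence holding N consecutive steps."""
    return [split_samples(flat_values, sample_size)]


flat = [1.0, 0.0, 0.0, 1.0]  # two samples of shape [2]

batch = as_batch(flat, 2)        # 2 sequences, each of length 1
sequence = as_sequence(flat, 2)  # 1 sequence of length 2

print(len(batch), [len(seq) for seq in batch])
print(len(sequence), [len(seq) for seq in sequence])
```

Under that assumption, a model whose input has no sequence axis would see the same numbers grouped differently depending on which helper built the value, which is consistent with the fix reported above.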