data61 / MP-SPDZ

Versatile framework for multi-party computation

How to implement a non-sequential model in MP-SPDZ? #1357

Closed winnylyc closed 3 months ago

winnylyc commented 3 months ago

Hello, sorry for disturbing you.

After reviewing the documentation, it appears that MP-SPDZ only supports models in sequential form. However, I would like to test some non-sequential models. Since implementing the training phase would require significant effort, I am primarily interested in implementing the inference phase.

One naive approach is to treat each layer as an individual model and use the output of the previous model as the input to the current one. However, I anticipate a significant challenge with this approach: the auto-optimization in MP-SPDZ, which is designed for a whole model, may not be applicable across separate models.

There are two main questions regarding the situation:

  1. Is there a way to implement non-sequential inference as one whole model?
  2. Does MP-SPDZ primarily optimize across the layers, or does the optimization mainly focus on each individual layer?

Looking forward to your response! Thanks a lot!

mkskeller commented 3 months ago
  1. Yes, layer classes that take more than one input, such as Add or Concat, have an inputs parameter: https://mp-spdz.readthedocs.io/en/latest/Compiler.html#Compiler.ml.Add
  2. Each layer is treated individually, so no optimizations across layers.
winnylyc commented 3 months ago

Greatly appreciate your help!

  1. I am trying to use the layers you mentioned, but I have encountered some difficulties. Could you help me with this?

    from Compiler import ml
    X = sint.Tensor([10, 10])
    X.assign_all(1)
    Y = sint.Tensor([10, 10])
    Y.assign_all(1)
    dense = ml.SGD([ml.Add([ml.Dense(10, 10, 10), ml.Dense(10, 10, 10)]), ml.MultiOutput(10, 10)], n_epochs=1, report_loss=True)
    dense.layers[0].X = X
    dense.layers[1].Y = Y
    dense.reset()
    dense.run()

    Here is the error I got when compiling the code.

    ./compile.py -R 64 test_ML
    Default bit length for compilation: 63
    Default security parameter for compilation: 40
    Compiling file Programs/Source/test_ML.mpc
    Setting learning rate to 0.01
    Using SGD
    Traceback (most recent call last):
    File "/home/ylipf/MPCtest/mp-spdz/./compile.py", line 41, in <module>
    main(compiler)
    File "/home/ylipf/MPCtest/mp-spdz/./compile.py", line 36, in main
    compilation(compiler)
    File "/home/ylipf/MPCtest/mp-spdz/./compile.py", line 19, in compilation
    prog = compiler.compile_file()
    File "/home/ylipf/MPCtest/mp-spdz/Compiler/compilerLib.py", line 446, in compile_file
    exec(compile(infile.read(), infile.name, "exec"), self.VARS)
    File "Programs/Source/test_ML.mpc", line 10, in <module>
    dense.run()
    File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 200, in wrapper
    res = function(*args, **kwargs)
    File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 2446, in run
    N = self.layers[0].N
    AttributeError: 'Add' object has no attribute 'N'

    In addition, the layer classes you mentioned really help me a lot. However, I want to confirm: if the two parallel branches consist of more than one layer each, can that not be expressed in the Compiler.ml module? It seems that Add or Concat can only accept a flat list of layers as input. In other words, ml.Add([[ml.Dense(10, 10, 10), ml.Dense(10, 10, 10)], [ml.Dense(10, 10, 10), ml.Dense(10, 10, 10)]]) may not be accepted, since it uses a list of lists of layers as input.

  2. If each layer is treated individually, can I assume that running each layer as an individual model has the same performance as combining them into one whole model?

Thanks again for your help! And sorry for taking your time.

mkskeller commented 3 months ago

You will need to put all layers into the list given to SGD, including the dense layers. The list also needs to preserve the order, so no later layer can come before an earlier one.

winnylyc commented 3 months ago

Thank you for your reply! I still have no idea how to use ml.Add. Could you give me more hints or a small example?

mkskeller commented 3 months ago

It would be something along the following lines:

dense1 = Dense(...)
dense2 = Dense(...)
add = Add([dense1, dense2])
SGD([..., dense1, ..., dense2, ..., add, ...], ...)
winnylyc commented 3 months ago

Thank you for your patience! Based on your suggestion, I implemented a simple model that runs without error. However, it does not seem to work as expected. To test the support for non-sequential layers, I want to implement a simple model with two parallel dense layers on the same input whose outputs are added.

I implemented the model as follows.

from Compiler import ml
X = sfix.Tensor([1, 10])
X.assign_all(1)
Y = sfix.Tensor([1, 5])
Y.assign_all(1)
dense1 = ml.Dense(1, 10, 5)
dense2 = ml.Dense(1, 10, 5)
add = ml.Add([dense1, dense2])
model = ml.SGD([dense1, dense2, add, ml.MultiOutput(1, 5)], n_epochs=1, report_loss=True)
model.reset()

dense1_w = sfix.Tensor([10, 5])
dense1_w.assign_all(1)
model.layers[0].W = dense1_w

dense2_w = sfix.Tensor([10, 5])
dense2_w.assign_all(1)
model.layers[1].W = dense2_w

res = model.eval(X)
res.print_reveal_nested()

I set the input to an all-one matrix of shape [1, 10], the weights of both dense layers to all-one matrices of shape [10, 5], and the biases of both dense layers to all-zero vectors of shape [5]. I expect the output of each dense layer to be [10, 10, 10, 10, 10], so the final output should be [20, 20, 20, 20, 20] after the addition. However, the model does not show the expected result.
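As a plaintext sanity check (ordinary NumPy, outside MP-SPDZ), this is the arithmetic I expect:

```python
import numpy as np

# Plaintext analogue of the two parallel dense layers: all-one input
# of shape [1, 10], all-one weights [10, 5], zero biases [5].
X = np.ones((1, 10))
W = np.ones((10, 5))
b = np.zeros(5)

dense1_out = X @ W + b            # [[10, 10, 10, 10, 10]]
dense2_out = X @ W + b            # [[10, 10, 10, 10, 10]]
added = dense1_out + dense2_out   # [[20, 20, 20, 20, 20]]
print(added)
```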

Here is the output.

(python3_9) ./compile.py -R 64 test_ML
Default bit length for compilation: 63
Default security parameter for compilation: 40
Compiling file Programs/Source/test_ML.mpc
Setting learning rate to 0.01
Using SGD
Initializing dense weights in [-0.632456,0.632456]
Writing to Programs/Bytecode/test_ML-multithread-1.bc
Initializing dense weights in [-0.632456,0.632456]
Writing to Programs/Bytecode/test_ML-multithread-3.bc
Writing to Programs/Bytecode/test_ML-multithread-4.bc
Writing to Programs/Schedules/test_ML.sch
Writing to Programs/Bytecode/test_ML-0.bc
Hash: cf9f8377c015a0335c7a4cc1af2f57fc92a10b36614682e50edd955c4981b38d
Program requires at most:
         459 integer opens
       18673 integer bits
        2118 integer triples
           2 matrix multiplications (1x10 * 10x5)
          22 virtual machine rounds
(python3_9) ./Scripts/semi2k.sh test_ML
Running /home/ylipf/MPCtest/mp-spdz/Scripts/../semi2k-party.x 0 test_ML -pn 17476 -h localhost -N 2
Running /home/ylipf/MPCtest/mp-spdz/Scripts/../semi2k-party.x 1 test_ML -pn 17476 -h localhost -N 2
Using statistical security parameter 40
Trying to run 64-bit computation
Using SGD
dense X [[[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]]
dense W [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
dense b [0, 0, 0, 0, 0]
dense Y [[[10, 10, 10, 10, 10]]]
dense X [[[10, 10, 10, 10, 10, 0, 0, 0, 0, 0]]]
dense W [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
dense b [0, 0, 0, 0, 0]
dense Y [[[50, 50, 50, 50, 50]]]
[[0.200012, 0.199997, 0.199997, 0.199997, 0.199997]]
Significant amount of unused triples of Z2^64 distorting the benchmark. For more accurate benchmarks, consider reducing the batch size with --batch-size.
Significant amount of unused dabits of Z2^64 distorting the benchmark. For more accurate benchmarks, consider reducing the batch size with --batch-size.
The following benchmarks are including preprocessing (offline phase).
Time = 0.0621324 seconds 
Data sent = 3.18327 MB in ~338 rounds (party 0 only; use '-v' for more details)
Global data sent = 6.60001 MB (all parties)
This program might benefit from some protocol options.
Consider adding the following at the beginning of your code:
        program.use_trunc_pr = True
        program.use_split(2)

I set debug_output in the layer class to True to see what happens. https://github.com/data61/MP-SPDZ/blob/177153531171b779df01244914b19a47be0a5763/Compiler/ml.py#L236 It seems that the two dense layers still run sequentially. Could you give me more hints on implementing the non-sequential model?

I greatly appreciate your help! Sorry for taking your time.

mkskeller commented 3 months ago

This isn't implemented because the models used so far, like ResNet, only have one layer taking the input, not two.

winnylyc commented 3 months ago

Again, thank you for your patience! Based on your suggestion, only one layer now takes the input instead of two: a shared dense layer (dense0) is intended to feed both parallel dense layers.

I implemented the model as follows.

from Compiler import ml
X = sfix.Tensor([1, 10])
X.assign_all(1)
Y = sfix.Tensor([1, 5])
Y.assign_all(1)
dense0 = ml.Dense(1, 10, 10)
dense1 = ml.Dense(1, 10, 5)
dense2 = ml.Dense(1, 10, 5)
add = ml.Add([dense1, dense2])
model = ml.SGD([dense0, dense1, dense2, add, ml.MultiOutput(1, 5)], n_epochs=1, report_loss=True)
model.reset()

dense0_w = sfix.Tensor([10, 10])
dense0_w.assign_all(1)
model.layers[0].W = dense0_w

dense1_w = sfix.Tensor([10, 5])
dense1_w.assign_all(1)
model.layers[1].W = dense1_w

dense2_w = sfix.Tensor([10, 5])
dense2_w.assign_all(1)
model.layers[2].W = dense2_w

res = model.eval(X)
res.print_reveal_nested()

However, I still did not get the correct answer.

(python3_9) ./Scripts/semi2k.sh test_ML
Running /home/ylipf/MPCtest/mp-spdz/Scripts/../semi2k-party.x 0 test_ML -pn 18858 -h localhost -N 2
Running /home/ylipf/MPCtest/mp-spdz/Scripts/../semi2k-party.x 1 test_ML -pn 18858 -h localhost -N 2
Using statistical security parameter 40
Trying to run 64-bit computation
Using SGD
dense X [[[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]]
dense W [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
dense b [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
dense Y [[[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]]]
dense X [[[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]]]
dense W [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
dense b [0, 0, 0, 0, 0]
dense Y [[[100, 100, 100, 100, 100]]]
dense X [[[100, 100, 100, 100, 100, 0, 0, 0, 0, 0]]]
dense W [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
dense b [0, 0, 0, 0, 0]
dense Y [[[500, 500, 500, 500, 500]]]
[[0.199997, 0.199997, 0.199997, 0.199997, 0.199997]]
Significant amount of unused triples of Z2^64 distorting the benchmark. For more accurate benchmarks, consider reducing the batch size with --batch-size.
The following benchmarks are including preprocessing (offline phase).
Time = 0.0804187 seconds 
Data sent = 3.51812 MB in ~366 rounds (party 0 only; use '-v' for more details)
Global data sent = 7.35572 MB (all parties)
This program might benefit from some protocol options.
Consider adding the following at the beginning of your code:
        program.use_trunc_pr = True
        program.use_split(2)

I still have not found the proper way to use ml.Add. Is there an example of using it?

mkskeller commented 3 months ago

The missing piece is redirecting the input to dense2, because the default is to take the output of the previous layer: dense2.inputs = [dense0]. Apologies for missing this earlier.
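In plaintext terms, the dataflow this wiring is meant to produce can be sketched with NumPy (all-one weights and shapes taken from the code above):

```python
import numpy as np

X = np.ones((1, 10))
W0 = np.ones((10, 10))   # dense0
W1 = np.ones((10, 5))    # dense1, fed by dense0 as the previous layer
W2 = np.ones((10, 5))    # dense2, fed by dense0 via dense2.inputs

h0 = X @ W0              # dense0 output: all 10s
h1 = h0 @ W1             # all 100s
h2 = h0 @ W2             # all 100s, once dense2 reads from dense0
out = h1 + h2            # Add layer: all 200s
print(out)
```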

winnylyc commented 3 months ago

I greatly appreciate your help! Now I understand how to use ml.Add to implement non-sequential models. Based on your helpful suggestions, I have also verified that MP-SPDZ can support a somewhat more complex structure, like the following.

The implementation is shown below.

from Compiler import ml
X = sfix.Tensor([1, 10])
X.assign_all(1)
Y = sfix.Tensor([1, 5])
Y.assign_all(1)
dense0 = ml.Dense(1, 10, 10)
dense1_1 = ml.Dense(1, 10, 7)
dense1_2 = ml.Dense(1, 7, 5)
dense2_1 = ml.Dense(1, 10, 7)
dense2_2 = ml.Dense(1, 7, 5)
dense2_1.inputs = [dense0]
add = ml.Add([dense1_2, dense2_2])
model = ml.SGD([dense0, dense1_1, dense1_2, dense2_1, dense2_2, add, ml.MultiOutput(1, 5)], n_epochs=1, report_loss=True)
model.reset()

dense0_w = sfix.Tensor([10, 10])
dense0_w.assign_all(1)
model.layers[0].W = dense0_w

dense1_1_w = sfix.Tensor([10, 7])
dense1_1_w.assign_all(1)
model.layers[1].W = dense1_1_w

dense1_2_w = sfix.Tensor([7, 5])
dense1_2_w.assign_all(1)
model.layers[2].W = dense1_2_w

dense2_1_w = sfix.Tensor([10, 7])
dense2_1_w.assign_all(1)
model.layers[3].W = dense2_1_w

dense2_2_w = sfix.Tensor([7, 5])
dense2_2_w.assign_all(1)
model.layers[4].W = dense2_2_w

res = model.eval(X)
res.print_reveal_nested()

I think MP-SPDZ can indeed implement most of the DNN structures.
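For reference, a plaintext NumPy check of this two-branch dataflow (with the all-one weights and zero biases assigned above) gives:

```python
import numpy as np

X = np.ones((1, 10))
h0 = X @ np.ones((10, 10))                        # dense0: all 10s
b1 = (h0 @ np.ones((10, 7))) @ np.ones((7, 5))    # branch 1: all 700s
b2 = (h0 @ np.ones((10, 7))) @ np.ones((7, 5))    # branch 2: all 700s
out = b1 + b2                                     # Add layer: all 1400s
print(out)
```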

Now I have a follow-up question. Sorry for taking your time.

How do I use ml.Concat? I expected to use ml.Concat the same way as ml.Add, but I get a compilation error. In the following code, I just modified the code above by changing ml.Add to ml.Concat and changing the output dimension from 5 to 10.

from Compiler import ml
X = sfix.Tensor([1, 10])
X.assign_all(1)
Y = sfix.Tensor([1, 5])
Y.assign_all(1)
dense0 = ml.Dense(1, 10, 10)
dense1_1 = ml.Dense(1, 10, 7)
dense1_2 = ml.Dense(1, 7, 5)
dense2_1 = ml.Dense(1, 10, 7)
dense2_2 = ml.Dense(1, 7, 5)
dense2_1.inputs = [dense0]
# add = ml.Add([dense1_2, dense2_2])
concat = ml.Concat([dense1_2, dense2_2], dimension = 3)
model = ml.SGD([dense0, dense1_1, dense1_2, dense2_1, dense2_2, concat, ml.MultiOutput(1, 10)], n_epochs=1, report_loss=True)
model.reset()

dense0_w = sfix.Tensor([10, 10])
dense0_w.assign_all(1)
model.layers[0].W = dense0_w

dense1_1_w = sfix.Tensor([10, 7])
dense1_1_w.assign_all(1)
model.layers[1].W = dense1_1_w

dense1_2_w = sfix.Tensor([7, 5])
dense1_2_w.assign_all(1)
model.layers[2].W = dense1_2_w

dense2_1_w = sfix.Tensor([10, 7])
dense2_1_w.assign_all(1)
model.layers[3].W = dense2_1_w

dense2_2_w = sfix.Tensor([7, 5])
dense2_2_w.assign_all(1)
model.layers[4].W = dense2_2_w

res = model.eval(X)
res.print_reveal_nested()

Here is the compilation error.

(python3_9) ./compile.py -R 64 test_ML
Default bit length for compilation: 63
Default security parameter for compilation: 40
Compiling file Programs/Source/test_ML.mpc
Setting learning rate to 0.01
Using SGD
Initializing dense weights in [-0.547723,0.547723]
Writing to Programs/Bytecode/test_ML-multithread-1.bc
Initializing dense weights in [-0.594089,0.594089]
Writing to Programs/Bytecode/test_ML-multithread-3.bc
Initializing dense weights in [-0.707107,0.707107]
Writing to Programs/Bytecode/test_ML-multithread-5.bc
Initializing dense weights in [-0.594089,0.594089]
Writing to Programs/Bytecode/test_ML-multithread-7.bc
Initializing dense weights in [-0.707107,0.707107]
Writing to Programs/Bytecode/test_ML-multithread-8.bc
Traceback (most recent call last):
  File "/home/ylipf/MPCtest/mp-spdz/./compile.py", line 41, in <module>
    main(compiler)
  File "/home/ylipf/MPCtest/mp-spdz/./compile.py", line 36, in main
    compilation(compiler)
  File "/home/ylipf/MPCtest/mp-spdz/./compile.py", line 19, in compilation
    prog = compiler.compile_file()
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/compilerLib.py", line 446, in compile_file
    exec(compile(infile.read(), infile.name, "exec"), self.VARS)
  File "Programs/Source/test_ML.mpc", line 37, in <module>
    res = model.eval(X)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 200, in wrapper
    res = function(*args, **kwargs)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 2323, in eval
    self.run_in_batches(f, data, batch_size or len(self.layers[1].X))
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 2576, in run_in_batches
    def _(i):
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 704, in decorator
    range_loop(loop_body, start, stop, step)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 671, in range_loop
    while_loop(loop_fn, condition, start, g=loop_body.__globals__)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1264, in while_loop
    if_statement(pre_condition, lambda: do_while(loop_fn, g=g))
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1400, in if_statement
    if_fn()
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1264, in <lambda>
    if_statement(pre_condition, lambda: do_while(loop_fn, g=g))
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1325, in do_while
    condition = _run_and_link(loop_fn, g)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1292, in _run_and_link
    res = function()
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1258, in loop_fn
    result = loop_body(arg)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 661, in loop_fn
    res = loop_body(i)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 2578, in _
    f(start, batch_size, batch)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 2320, in f
    self.forward(batch=batch, run_last=False)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 200, in wrapper
    res = function(*args, **kwargs)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 2276, in forward
    layer.forward(batch=self.batch_for(layer, batch),
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 265, in forward
    self._forward(batch)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 1269, in _forward
    def _(i, j):
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1024, in <lambda>
    return lambda loop_body: new_dec(decorator(loop_body))
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1081, in decorator
    tape = prog.new_tape(f, (0,), 'multithread')
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/program.py", line 315, in new_tape
    function(*args)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1070, in f
    def f(i):
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 821, in decorator
    def f(i):
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 704, in decorator
    range_loop(loop_body, start, stop, step)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 671, in range_loop
    while_loop(loop_fn, condition, start, g=loop_body.__globals__)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1264, in while_loop
    if_statement(pre_condition, lambda: do_while(loop_fn, g=g))
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1400, in if_statement
    if_fn()
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1264, in <lambda>
    if_statement(pre_condition, lambda: do_while(loop_fn, g=g))
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1325, in do_while
    condition = _run_and_link(loop_fn, g)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1292, in _run_and_link
    res = function()
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1258, in loop_fn
    result = loop_body(arg)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 661, in loop_fn
    res = loop_body(i)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 827, in f
    state = reducer(tuplify(loop_body(j)), state)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1074, in f
    return loop_body(base + i)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/library.py", line 1021, in new_body
    return loop_body(*indices)
  File "/home/ylipf/MPCtest/mp-spdz/Compiler/ml.py", line 1271, in _
    self.Y[batch[0]][i][j].assign_vector(X[0][i][j].get_vector())
AttributeError: 'sfix' object has no attribute 'assign_vector'

Could you give me some hints on using it?

mkskeller commented 3 months ago

The code assumes that the input has four dimensions (but doesn't check), because this is the case for ImageNet networks like DenseNet or SqueezeNet.
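In plaintext terms, concatenating two 4-D activations in NHWC layout along the channel dimension (dimension 3) behaves like this NumPy sketch (the shapes here are purely illustrative):

```python
import numpy as np

# Two hypothetical conv outputs: batch 1, 3x3 spatial, 9 channels each
a = np.ones((1, 3, 3, 9))
b = np.ones((1, 3, 3, 9))

# Concatenation along dimension 3 stacks the channels: 9 + 9 = 18
merged = np.concatenate([a, b], axis=3)
print(merged.shape)   # (1, 3, 3, 18)
```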

winnylyc commented 3 months ago

Thank you for your hint! Now I can use ml.Concat as expected.

from Compiler import ml
X = sfix.Tensor([1, 12, 12, 3])
X.assign_all(1)
Y = sfix.Tensor([1, 10])
Y.assign_all(1)
conv0 = ml.easyConv2d([1, 12, 12, 3], 1, 6, [2, 2], 2, 1)
conv1_1 = ml.easyConv2d([1, 6, 6, 6], 1, 9, [2, 2], 2, 1)
conv1_2 = ml.easyConv2d([1, 6, 6, 6], 1, 9, [2, 2], 2, 1)
conv1_2.inputs = [conv0]
concat = ml.Concat([conv1_1, conv1_2], dimension = 3)
dense1 = ml.Dense(1, 162, 10)
model = ml.SGD([conv0, conv1_1, conv1_2, concat, dense1, ml.MultiOutput(1, 10)], n_epochs=1, report_loss=True)
model.reset()

res = model.eval(X)
res.print_reveal_nested()

I greatly appreciate your patience!