apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

if operation created by "add_branch()" cannot give right result if else_branch is called the first time #2391

Closed xueyingxin closed 1 week ago

xueyingxin commented 1 week ago

🐞Describing the bug

To Reproduce

from coremltools.models import datatypes, MLModel
from coremltools.models.neural_network import NeuralNetworkBuilder

import numpy as np

""" Test a simple if-else branch network """

input_features = [("data", datatypes.Array(3)), ("cond", datatypes.Array(1))]
output_features = [("output", None)]

builder_top = NeuralNetworkBuilder(
    input_features, output_features, disable_rank5_shape_mapping=True
)
layer = builder_top.add_branch("branch_layer", "cond")

builder_ifbranch = NeuralNetworkBuilder(
    input_features=None,
    output_features=None,
    spec=None,
    nn_spec=layer.branch.ifBranch,
)
builder_ifbranch.add_elementwise(
    "mult_layer",
    input_names=["data"],
    output_name="output",
    mode="MULTIPLY",
    alpha=10,
)
builder_elsebranch = NeuralNetworkBuilder(
    input_features=None,
    output_features=None,
    spec=None,
    nn_spec=layer.branch.elseBranch,
)
builder_elsebranch.add_elementwise(
    "add_layer",
    input_names=["data"],
    output_name="output",
    mode="ADD",
    alpha=10,
)

mlmodel = MLModel(builder_top.spec)

# True branch case
input_dict = {
    "data": np.array(range(1, 4), dtype="float"),
    "cond": np.array([0], dtype="float"),
}
preds = mlmodel.predict(input_dict)
print(preds)
print("------")

input_dict = {
    "data": np.array(range(1, 4), dtype="float"),
    "cond": np.array([1], dtype="float"),
}
preds = mlmodel.predict(input_dict)
print(preds)
print("------")

The expectation is simple: the if-branch should multiply by 10, and the else-branch should add 10.
But the result on my side is:

{'output': array([10., 10., 10.])}

{'output': array([10., 20., 30.])}

I don't know where that comes from.
But if I set "cond" to 1 first and then to 0, the result is as expected:

{'output': array([10., 20., 30.])}

{'output': array([11., 12., 13.])}



## System environment (please complete the following information):
 - coremltools version: 7.2
 - OS (e.g. MacOS version or Linux type): Sonoma 14.6.1
 - Any other relevant version information (e.g. PyTorch or TensorFlow version): torch 1.13.1,  python 3.10.13

TobyRoseman commented 1 week ago

Looking at your code, in the first mlmodel.predict call, cond is set to [0]. So the else-branch should be taken. The else-branch adds 10.

In the second mlmodel.predict call, cond is set to [1]. So the if-branch should be taken. The if-branch multiplies by 10.

So to clarify, the correct output would actually be:

{'output': array([11., 12., 13.])}
------
{'output': array([10., 20., 30.])}
------

(You swapped those two lines in your expected output).

Using the tip of main and macOS 15, those are exactly the results I'm getting.

Please try using the most recent version of coremltools.
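The branch semantics described in this reply can be written down as a small pure-Python reference, which is handy for checking what a given `cond` should produce. This is only a sketch: the function name `expected_output` is hypothetical, and it assumes that any nonzero `cond` selects the if-branch (the thread itself only exercises 0 and 1).

```python
def expected_output(data, cond):
    """Pure-Python reference for the branch network in this issue.

    Assumption: a nonzero cond selects the if-branch (multiply by 10),
    and cond == 0 selects the else-branch (add 10).
    """
    if cond != 0:
        return [x * 10.0 for x in data]   # if-branch: "mult_layer"
    return [x + 10.0 for x in data]       # else-branch: "add_layer"


data = [1.0, 2.0, 3.0]
print(expected_output(data, 0))  # [11.0, 12.0, 13.0]
print(expected_output(data, 1))  # [10.0, 20.0, 30.0]
```

The values match the corrected expected output above; only the formatting differs, since this returns plain lists rather than NumPy arrays.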

xueyingxin commented 1 week ago

@TobyRoseman, please don't close my issue so quickly. What I mean is that when I set "cond" to 0 (+10) and then to 1 (*10), the outcome should be:

{'output': array([11., 12., 13.])}
{'output': array([10., 20., 30.])}

However, my result was:

{'output': array([10., 10., 10.])}
{'output': array([10., 20., 30.])}

Can you see the difference? I don't know what is going wrong here. I have also updated coremltools to the latest version, 8.0, and the problem persists.
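Since the reported symptom depends on which `cond` value is evaluated first, one way to pin it down is to run both call orders against a freshly built model and compare with the expected values stated above. The following is a minimal sketch, not part of coremltools: `check_call_orders`, `make_predict`, and `make_reference_predict` are hypothetical names, and the stand-in predictor is plain Python, so the snippet runs without Core ML.

```python
# Expected outputs for data = [1, 2, 3], as stated in this thread.
EXPECTED = {0: [11.0, 12.0, 13.0], 1: [10.0, 20.0, 30.0]}


def check_call_orders(make_predict, data=(1.0, 2.0, 3.0)):
    """Run cond in the orders (0, 1) and (1, 0) and collect mismatches.

    make_predict is a hypothetical factory returning a fresh
    predict(data, cond) callable, so nothing carries over between orders.
    Returns a list of (order, cond, got) tuples; empty means all correct.
    """
    mismatches = []
    for order in ((0, 1), (1, 0)):
        predict = make_predict()
        for cond in order:
            got = [float(v) for v in predict(list(data), cond)]
            if got != EXPECTED[cond]:
                mismatches.append((order, cond, got))
    return mismatches


def make_reference_predict():
    """Stand-in for a real model: a stateless pure-Python predictor."""
    def predict(data, cond):
        if cond != 0:
            return [x * 10.0 for x in data]
        return [x + 10.0 for x in data]
    return predict


print(check_call_orders(make_reference_predict))  # []
```

To exercise the real model, `make_predict` would rebuild the `MLModel` from the reproduction script and return a function that wraps `mlmodel.predict(...)["output"]`. A small tolerance may be preferable to exact equality if the runtime computes in reduced precision.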

TobyRoseman commented 1 week ago

With coremltools 8.0 and macOS 15, I get the following when running your code:

{'output': array([11., 12., 13.])}
------
{'output': array([10., 20., 30.])}

It sounds like you now agree that is the correct answer.

If you're getting a different answer with coremltools 8.0, then this has to be a bug in the Core ML Framework (i.e. part of the Operating System). You said you were using macOS 14.6.1. I'm using macOS 15.0. So it's possible this was a Framework bug that was fixed between macOS 14.6.1 and macOS 15.0. In which case you should update your macOS.
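Because the discussion hinges on exact coremltools and macOS versions, it can help to print them from the same interpreter that runs the reproduction script. The helper below is a sketch that uses only the standard library; the name `environment_report` and the default package list are assumptions made for illustration.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("coremltools", "numpy")):
    """Collect version strings relevant to comparing results across machines."""
    report = {
        "python": sys.version.split()[0],
        # mac_ver() returns an empty release string on non-macOS systems.
        "macos": platform.mac_ver()[0] or "not macOS",
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Reading the installed version through `importlib.metadata` avoids importing the package itself, so the report still works when an import would fail.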

xueyingxin commented 1 week ago

I updated macOS to 15.1 and still get the wrong result. But since you ran the same code and got the right result, I assume there must be something wrong on my side. Thanks for replying.