apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License
4.44k stars 641 forks

Add an extra condition for softmaxND #1417

Closed abhimanyuiris96 closed 2 years ago

abhimanyuiris96 commented 2 years ago

https://github.com/apple/coremltools/blob/e3032bf28a5b46e398f1f99b9d4ae9e57a08ffdc/coremltools/models/neural_network/builder.py#L982

Recent versions of coremltools seem to reference a softmax layer as softmaxND, at least in models converted from PyTorch. The condition linked above should therefore be updated to accept softmaxND as well; that would make adding a loss layer straightforward instead of requiring fancy workarounds.

TobyRoseman commented 2 years ago

I don't understand. @abhimanyuiris96 - please provide more context.

abhimanyuiris96 commented 2 years ago

@TobyRoseman

If you see in the screenshot attached, the last layer is being called "SoftmaxND" by the latest coremltools. When I try to add a crossentropy layer in front of it, it errors out saying that CrossEntropy layer needs input from softmax and not softmaxND.

[Screenshot: Screen Shot 2022-03-03 at 8 31 48 PM]

abhimanyuiris96 commented 2 years ago

@TobyRoseman Any luck on this one? This bug has stopped all my progress. Or am I mistaken and there is a difference between "softmax" and "softmax_nd"?

TobyRoseman commented 2 years ago

softmax and softmax_nd are not the same. softmax is applied along axis = -3 or N-3 (where N is the rank of the input). With softmax_nd you specify the axis.
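To make the axis difference concrete, here is a minimal pure-Python sketch. It does not use coremltools; `softmax_1d` and `softmax_rank3` are hypothetical helpers written only for this illustration, and the input shape is an arbitrary choice. It applies softmax to the same rank-3 input once along axis -3 (the fixed axis described above) and once along axis -1 (one axis a caller might pick for softmax_nd).

```python
import math

def softmax_1d(values):
    """Numerically stable softmax of a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_rank3(tensor, axis):
    """Softmax along one axis of a rank-3 nested list (negative axes allowed)."""
    shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
    axis %= 3
    others = [d for d in range(3) if d != axis]
    out = [[[0.0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
    for a in range(shape[others[0]]):
        for b in range(shape[others[1]]):
            # Collect the indices of one slice running along `axis`.
            slice_idx = []
            for k in range(shape[axis]):
                idx = [0, 0, 0]
                idx[axis], idx[others[0]], idx[others[1]] = k, a, b
                slice_idx.append(tuple(idx))
            probs = softmax_1d([tensor[i][j][l] for i, j, l in slice_idx])
            for (i, j, l), p in zip(slice_idx, probs):
                out[i][j][l] = p
    return out

# Rank-3 input of shape (2, 1, 3); the values are arbitrary.
logits = [[[1.0, 2.0, 3.0]], [[1.0, 2.0, 3.0]]]

fixed_axis = softmax_rank3(logits, axis=-3)   # normalizes across the first dimension
chosen_axis = softmax_rank3(logits, axis=-1)  # normalizes across the last dimension
```

Along axis -3 each pair of identical entries is normalized against each other, so every value is 0.5. Along axis -1 each row of three logits becomes a probability distribution. The two layers produce the same numbers only when the chosen axis happens to coincide with -3.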

abhimanyuiris96 commented 2 years ago

But even after softmax_nd, we should be able to add a 'cross entropy' layer, right? To calculate the loss. This condition is preventing that.

TobyRoseman commented 2 years ago

> But even after softmax_nd, we should be able to add a 'cross entropy' layer, right? To calculate the loss. This condition is preventing that.

I don't understand what you are trying to do. I'm going to need more context here to be helpful. Are you using the NeuralNetworkBuilder class directly? Can you share the code you are using?

abhimanyuiris96 commented 2 years ago

I can't share the complete code, but the gist of what I am doing is as follows: I load the model, then make the last few layers updatable.

```python
import coremltools as ct

spec = ct.utils.load_spec(mlmodel_path)
builder = ct.models.neural_network.NeuralNetworkBuilder(spec=spec)

# num_layers is not defined in the snippet as posted; presumably it holds the model's layers.
for n in num_layers[592:]:
    info = ct.models.neural_network.builder._summarize_network_layer_info(n)
    layer_type, name = info[0], info[1]
    if layer_type in ['innerProduct', 'convolution']:
        builder.make_updatable([name])

builder.set_categorical_cross_entropy_loss(name='lossLayer', input='classLabel')
```

The last line is where it fails: as per the source code, it expects a softmax layer and rejects anything else, in this case a softmax_nd layer.

TobyRoseman commented 2 years ago

What's the exact error you get? Can you give me a stand alone example to reproduce the issue?

It's probably going to be easier to add a cross entropy layer to your PyTorch or TensorFlow model. Then reconvert to Core ML.

TobyRoseman commented 2 years ago

Since we have not received the requested information, I'm going to close this issue. If we get steps to reproduce the problem, I will reopen the issue.

tdomhan commented 1 year ago

The documentation at https://coremltools.readme.io/docs/updatable-neural-network-classifier-on-mnist-dataset says to use builder.set_categorical_cross_entropy_loss in order to create an updatable model. @TobyRoseman do you have an example of creating an updatable PyTorch model without calling set_categorical_cross_entropy_loss?

Will any layer work as long as it's the final layer (i.e. adding the cross entropy loss from PyTorch)? What's the convention for the target labels? Can they just be a 'normal' input to the network? The builder seems to have special logic to distinguish loss layers from other layers (i.e. inspect_loss_layers). I assume this logic wouldn't work out of the box when adding a PyTorch loss layer, would it?

rromanchuk commented 1 week ago

@TobyRoseman people are trying to create updatable on-device classifiers.

Unfortunately there are no dog-fooding examples that use anything besides https://ml-assets.apple.com/coreml/models/Image/DrawingClassification/UpdatableDrawingClassifier/UpdatableDrawingClassifier.mlmodel. You would think there would be an end-to-end example that starts from a vanilla ML Program exported from CreateML/Swift.

Are there any test cases or pseudocode for making a CNN updatable that don't use the MNIST dataset with Keras? What about an updatable model that comes from CreateML and the Apple toolchain? The example project is particularly annoying, because it quietly ignores the entire workflow required for munging inputs and outputs and then just magically includes a CNN + KNN pipeline mlmodel.

[Id: 9], Name: dense_2__activation__ (Type: softmax)
          Updatable: False
          Input blobs: [u'dense_2_output']
          Output blobs: [u'digitProbabilities']
[Id: 8], Name: dense_2 (Type: innerProduct)
          Updatable: False
          Input blobs: [u'dense_1__activation___output']
          Output blobs: [u'dense_2_output']
[Id: 7], Name: dense_1__activation__ (Type: activation)
          Updatable: False
          Input blobs: [u'dense_1_output']
          Output blobs: [u'dense_1__activation___output']
[Id: 143], Name: 564 (Type: softmaxND)
          Updatable: False
          Input blobs: ['x']
          Output blobs: ['var_564']
[Id: 142], Name: x (Type: innerProduct)
          Updatable: True
          Input blobs: ['input.143']
          Output blobs: ['x']
[Id: 141], Name: input.143 (Type: reshapeStatic)
          Updatable: False
          Input blobs: ['558']
          Output blobs: ['input.143']

https://apple.github.io/coremltools/docs-guides/source/updatable-neural-network-classifier-on-mnist-dataset.html

People are here because they aren't using MNIST and are trying to bumble around.

builder.set_categorical_cross_entropy_loss(name="lossLayer", input="var_564")
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/neural_network/builder.py", line 997, in set_categorical_cross_entropy_loss
    raise ValueError(
ValueError: Categorical Cross Entropy loss layer input (var_564) must be a softmax layer output.
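
The error is consistent with the check described earlier in the thread: the loss input must be an output blob of a layer whose type is exactly softmax. The sketch below is a simplified stand-in built on plain dicts, not the coremltools implementation. `validate_loss_input` and its `accepted_types` parameter are hypothetical names. The layer data mirrors the three PyTorch-converted layers in the summary above.

```python
def validate_loss_input(layers, loss_input, accepted_types=("softmax",)):
    """Return True if `loss_input` is an output of a layer with an accepted type."""
    for layer in reversed(layers):
        if loss_input in layer["outputs"] and layer["type"] in accepted_types:
            return True
    return False

# Layer names, types, and output blobs copied from the summary posted above.
layers = [
    {"name": "input.143", "type": "reshapeStatic", "outputs": ["input.143"]},
    {"name": "x", "type": "innerProduct", "outputs": ["x"]},
    {"name": "564", "type": "softmaxND", "outputs": ["var_564"]},
]

# Strict check as described in the thread: only "softmax" is accepted.
strict = validate_loss_input(layers, "var_564")

# The relaxation this issue asks for (an assumption, not current library behavior).
relaxed = validate_loss_input(
    layers, "var_564", accepted_types=("softmax", "softmaxND")
)

# A name that no layer in this list produces fails under either rule.
wrong_name = validate_loss_input(
    layers, "classLabel", accepted_types=("softmax", "softmaxND")
)
```

Under the strict rule `var_564` is rejected because its producing layer is softmaxND. The relaxed rule is only what the issue title proposes. Whether Core ML would then train correctly with a softmaxND input to the loss is not established anywhere in this thread.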

(ML) Fix softmax after converting CoreML model becomes softmaxND

https://developer.apple.com/documentation/coreml/personalizing-a-model-with-on-device-updates

https://forums.developer.apple.com/forums/thread/727757

https://www.netguru.com/blog/on-device-training-with-core-ml-make-your-pancakes-healthy-again

https://forums.developer.apple.com/forums/thread/709704

https://github.com/apple/coremltools/issues/1714

https://stackoverflow.com/questions/56966477/ios-core-ml-updatetable-model-on-device-learning