Closed abhimanyuiris96 closed 2 years ago
I don't understand. @abhimanyuiris96 - please provide more context.
@TobyRoseman
@TobyRoseman
As you can see in the attached screenshot, the last layer is being called "SoftmaxND" by the latest coremltools. When I try to add a cross entropy layer after it, it errors out saying that the CrossEntropy layer needs its input from a softmax layer, not a softmaxND layer.
@TobyRoseman Any luck on this one? This bug has stopped all my progress. Or am I mistaken and there is a difference between "softmax" and "softmax_nd"?
`softmax` and `softmax_nd` are not the same. `softmax` is applied along axis = -3 (i.e. N-3, where N is the rank of the input). With `softmax_nd` you specify the axis.
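The axis semantics can be sketched in plain NumPy (a toy illustration of the two layer behaviors, not coremltools code):

```python
import numpy as np

def softmax(x, axis):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((2, 3, 4, 5))  # rank-4 input

# "softmax" semantics: fixed axis -3 (the channel axis for rank-4 input)
fixed = softmax(x, axis=-3)

# "softmax_nd" semantics: any axis you choose, e.g. the last one
nd = softmax(x, axis=-1)

# each slice along the normalized axis sums to 1
print(np.allclose(fixed.sum(axis=-3), 1.0), np.allclose(nd.sum(axis=-1), 1.0))
# → True True
```

So the two layers only compute the same thing when the chosen `softmax_nd` axis happens to coincide with axis -3 of the input.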
But even after softmax_nd, we should be able to add a 'cross entropy' layer, right? To calculate the loss. This condition is preventing that.
I don't understand what you are trying to do. I'm going to need more context here to be helpful. Are you using the `NeuralNetworkBuilder` class directly? Can you share the code you are using?
I can't share the complete code, but the gist of what I am doing is as follows: I load the model, then make the last few layers updatable:

```python
spec = ct.utils.load_spec(mlmodel_path)
builder = ct.models.neural_network.NeuralNetworkBuilder(spec=spec)

for n in num_layers[592:]:
    layer_type = ct.models.neural_network.builder._summarize_network_layer_info(n)[0]
    if layer_type in ['innerProduct', 'convolution']:
        name = ct.models.neural_network.builder._summarize_network_layer_info(n)[1]
        builder.make_updatable([name])

builder.set_categorical_cross_entropy_loss(name='lossLayer', input='classLabel')
```
The last line is where it fails, because as per the source code it expects a `softmax` layer and rejects anything else, in this case a `softmax_nd` layer.
What's the exact error you get? Can you give me a stand alone example to reproduce the issue?
It's probably going to be easier to add a cross entropy layer to your PyTorch or TensorFlow model. Then reconvert to Core ML.
Since we have not received the requested information, I'm going to close this issue. If we get steps to reproduce the problem, I will reopen the issue.
The documentation at https://coremltools.readme.io/docs/updatable-neural-network-classifier-on-mnist-dataset says to use `builder.set_categorical_cross_entropy_loss` in order to create an updatable model.
@TobyRoseman do you have an example of creating an updatable PyTorch model without calling set_categorical_cross_entropy_loss?
Will any layer work as long as it's the final layer (i.e. adding the cross entropy loss from PyTorch)? What's the convention for the target labels — can they just be a 'normal' input to the network? The builder seems to have special logic to distinguish loss layers from other layers (e.g. `inspect_loss_layers`). I assume this logic wouldn't work out of the box when adding a PyTorch loss layer, would it?
@TobyRoseman people are trying to create updatable on-device classifiers.
Unfortunately there are no dog-fooding examples that use anything besides https://ml-assets.apple.com/coreml/models/Image/DrawingClassification/UpdatableDrawingClassifier/UpdatableDrawingClassifier.mlmodel. You would think there would be an end-to-end example that starts from a vanilla ML Program exported from CreateML/Swift.
Are there any test cases or pseudo code for making a CNN updatable that don't use the MNIST dataset with Keras? What about an updatable model that comes from CreateML and the Apple toolchain? The example project is particularly annoying, because it quietly ignores the entire workflow required for munging inputs and outputs around and then just magically includes a CNN + KNN pipeline mlmodel.
The Keras/MNIST tutorial model ends in a plain `softmax`:

```
[Id: 9], Name: dense_2__activation__ (Type: softmax)
Updatable: False
Input blobs: [u'dense_2_output']
Output blobs: [u'digitProbabilities']
[Id: 8], Name: dense_2 (Type: innerProduct)
Updatable: False
Input blobs: [u'dense_1__activation___output']
Output blobs: [u'dense_2_output']
[Id: 7], Name: dense_1__activation__ (Type: activation)
Updatable: False
Input blobs: [u'dense_1_output']
Output blobs: [u'dense_1__activation___output']
```

whereas a PyTorch-converted model ends in a `softmaxND`:

```
[Id: 143], Name: 564 (Type: softmaxND)
Updatable: False
Input blobs: ['x']
Output blobs: ['var_564']
[Id: 142], Name: x (Type: innerProduct)
Updatable: True
Input blobs: ['input.143']
Output blobs: ['x']
[Id: 141], Name: input.143 (Type: reshapeStatic)
Updatable: False
Input blobs: ['558']
Output blobs: ['input.143']
```
People are here because they aren't using MNIST and are trying to bumble around:

```python
>>> builder.set_categorical_cross_entropy_loss(name="lossLayer", input="var_564")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/neural_network/builder.py", line 997, in set_categorical_cross_entropy_loss
    raise ValueError(
ValueError: Categorical Cross Entropy loss layer input (var_564) must be a softmax layer output.
```
(ML) Fix softmax after converting CoreML model becomes softmaxND
https://developer.apple.com/documentation/coreml/personalizing-a-model-with-on-device-updates
https://forums.developer.apple.com/forums/thread/727757
https://www.netguru.com/blog/on-device-training-with-core-ml-make-your-pancakes-healthy-again
https://forums.developer.apple.com/forums/thread/709704
https://github.com/apple/coremltools/issues/1714
https://stackoverflow.com/questions/56966477/ios-core-ml-updatetable-model-on-device-learning
https://github.com/apple/coremltools/blob/e3032bf28a5b46e398f1f99b9d4ae9e57a08ffdc/coremltools/models/neural_network/builder.py#L982
Recent versions of coremltools represent the softmax layer as softmaxND, at least in PyTorch-converted models. This condition should therefore be updated to accept softmaxND as well, which would make adding a loss layer straightforward instead of requiring fancy workarounds.