apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

CoreML Tools doesn't convert the model properly. #1595

Closed. iulianflst closed this issue 2 years ago.

iulianflst commented 2 years ago

❓Question

I just converted my TensorFlow model using coremltools and all the predictions are wrong. Here is some info:

[Screenshot: Screen Shot 2022-09-09 at 11 22 38 AM]

```python
# Set model author name
mlmodel.author = 'My author name'

# Set the license of the model
mlmodel.license = "some URL"

# Set a short description for the Xcode UI
mlmodel.short_description = "some description"

# Set a version for the model
mlmodel.version = "1.0"

mlmodel.save("mymodel.mlpackage")
```



I'm also getting this error:

`UnboundLocalError: local variable '_shutil' referenced before assignment`

But since it happens during cleanup, I can ignore it.

I've been trying to make this work for more than a week and I can't get it to work; I have the impression that I'm doing something wrong.
![Screen Shot 2022-09-09 at 11 29 49 AM](https://user-images.githubusercontent.com/1937701/189307497-83e40c98-0d79-46fd-92c0-c50464ddc6ff.png)

I also opened a question on Stack Overflow:
https://stackoverflow.com/questions/73639510/after-converting-tensorflow-model-to-coreml-the-model-doesnt-predict-correct

and a thread on the Apple Developer Forums:
https://developer.apple.com/forums/thread/714096

TobyRoseman commented 2 years ago

Don't worry about the `_shutil` issue. That can safely be ignored and will be fixed in our next release.

When comparing model outputs, it's essential that the inputs are exactly the same. Any sort of image preprocessing (such as resizing) will change the output. I suggest doing this comparison in Python rather than with Swift and Xcode, to eliminate one possible source of preprocessing differences.

If you're certain the inputs are the same and you're still seeing differences in the output, try removing layers from your TensorFlow model, then convert the updated model and see if the predictions match. With this approach you can hopefully isolate the problem, as in the sketch below.
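A minimal, untested sketch of that bisection idea, assuming your `Sequential` model is in a variable called `model`:

```python
import coremltools as ct
import numpy as np
from tensorflow import keras

# One fixed input, reused for every comparison.
x = np.random.rand(1, 192, 192, 3).astype(np.float32)

# Convert progressively longer prefixes of the model; the first prefix whose
# outputs diverge points at the problematic layer.
for n in range(1, len(model.layers) + 1):
    prefix = keras.Sequential(model.layers[:n])
    prefix.build(input_shape=(None, 192, 192, 3))

    mlmodel = ct.convert(prefix, inputs=[ct.TensorType(shape=x.shape)])
    in_name = mlmodel.get_spec().description.input[0].name

    y_tf = prefix(x).numpy()
    y_cm = list(mlmodel.predict({in_name: x}).values())[0]
    print(n, np.abs(y_tf - y_cm).max())
```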

iulianflst commented 2 years ago

I don't understand the part about resizing.

When I load the images, I resize all of them to 192x192:

```python
image_size = (192, 192)
batch_size = 32

train_ds = tf.keras.utils.image_dataset_from_directory(
    "/images",
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=image_size,
    batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
    "timbre/images",
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=image_size,
    batch_size=batch_size)
```

The normalization layer is defined like this. I believe this is the scaling parameter; I tried setting it to 1./255 and got no good results:

```python
normalization_layer = layers.Rescaling(1./255)
```

I also resize the image after taking it from the camera in Swift, so the input is 192x192 on the Swift side as well.

These are my layers:

```python
model = Sequential([
  layers.Rescaling(1./255, input_shape=(image_size[1], image_size[0], 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(256, activation='relu'),
  layers.Dense(512, activation='relu'),
  layers.Dense(num_classes)
])
```

So you're saying I should remove all of the layers and then add them back one by one until I find the layer with the problem?

I'm just following this tutorial: https://www.tensorflow.org/tutorials/images/classification and it's very frustrating that after more than a week I can't make it work properly on iOS.

manhnguyen92 commented 2 years ago

I think you are using the conversion flow for a PyTorch model. TensorFlow models are converted in a different way.

iulianflst commented 2 years ago

It's the same code; the additional parameters aren't tied to PyTorch. This is a generic API as far as I can see. I tested the converted mlmodel with Python and most of the time it works the same; on iOS it always fails.
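For reference, here is a minimal sketch (with toy models, just to illustrate the point) of how the same `coremltools.convert` entry point handles both frameworks:

```python
import coremltools as ct
import torch
from tensorflow.keras import Sequential, layers

# TensorFlow/Keras: the in-memory model converts directly.
tf_model = Sequential([layers.Dense(4, input_shape=(8,))])
mlmodel_tf = ct.convert(tf_model, inputs=[ct.TensorType(shape=(1, 8))])

# PyTorch: the model has to be traced (or scripted) first.
pt_model = torch.nn.Linear(8, 4).eval()
traced = torch.jit.trace(pt_model, torch.rand(1, 8))
mlmodel_pt = ct.convert(traced, inputs=[ct.TensorType(shape=(1, 8))])
```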

TobyRoseman commented 2 years ago

Using your model, the output matches on the Python side:

```python
import coremltools as cmt
import numpy as np
from tensorflow.keras import Sequential, layers

image_size = (192, 192)
num_classes = 2

model_tf = Sequential([
  layers.Rescaling(1./255, input_shape=(image_size[1], image_size[0], 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(256, activation='relu'),
  layers.Dense(512, activation='relu'),
  layers.Dense(num_classes)
])

np.random.seed(123)
x = np.random.rand(1, *image_size, 3)

model_cm = cmt.convert(model_tf, inputs=[cmt.TensorType(shape=x.shape)])

y_cm = model_cm.predict({"rescaling_input": x})["Identity"]
y_tf = model_tf(x).numpy()

print(y_cm)
print(y_tf)
```

`y_cm` and `y_tf` have the same values.

If you're seeing something different when using the model from Swift, the input to the model must be different. Perhaps Swift is doing the resizing differently. I suggest discussing this in the Apple Developer Forums.

How large are the differences that you're seeing?
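One way to quantify it, continuing from the snippet above (so `np`, `y_cm`, and `y_tf` are already defined):

```python
# Tiny numerical differences between backends are normal; large ones usually
# point to an input mismatch rather than a broken conversion.
abs_diff = np.abs(y_cm - y_tf)
print("max abs diff:", abs_diff.max())
print("allclose:", np.allclose(y_cm, y_tf, rtol=1e-3, atol=1e-4))
```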

iulianflst commented 2 years ago

Yes, but I'm also using AUTOTUNE:

```python
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

normalization_layer = layers.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
niter = iter(normalized_ds)
image_batch, labels_batch = next(niter)
first_image = image_batch[0]
# Notice the pixel values are now in `[0, 1]`.
print(np.min(first_image), np.max(first_image))
```

When I comment out this part in Python I get roughly good results, and I believe this is because of the normalization: `normalization_layer = layers.Rescaling(1./255)`
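If so, that would be consistent with double scaling: the model's first layer is already a `Rescaling(1./255)`, so images that were normalized to [0, 1] beforehand get divided by 255 a second time. A tiny illustration with made-up pixel values:

```python
import numpy as np
from tensorflow.keras import layers

rescale = layers.Rescaling(1. / 255)

raw = np.array([[255.0, 128.0]])  # raw pixels, what the in-model layer expects
pre = raw / 255.0                 # input that was already normalized once

print(rescale(raw).numpy())  # ~[[1.0, 0.5]]      -> sensible
print(rescale(pre).numpy())  # ~[[0.004, 0.002]]  -> squashed toward zero
```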

Now, for Swift, I think I understand what is going on there. This is the buffer function I'm using:

```swift
func buffer(with size: CGSize) -> CVPixelBuffer? {
    if let image = self.cgImage {
        let frameSize = size
        var pixelBuffer: CVPixelBuffer? = nil
        let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(frameSize.width), Int(frameSize.height), kCVPixelFormatType_24ARGB, nil, &pixelBuffer)
        if status != kCVReturnSuccess {
            return nil
        }
        CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
        let data = CVPixelBufferGetBaseAddress(pixelBuffer!)
        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        let bitmapInfo = CGBitmapInfo(rawValue: CGBitmapInfo.byteOrder32Little.rawValue | CGImageAlphaInfo.premultipliedFirst.rawValue)
        let context = CGContext(data: data, width: Int(frameSize.width), height: Int(frameSize.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: bitmapInfo.rawValue)
        context?.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
        CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))

        return pixelBuffer
    } else {
        return nil
    }
}
```

And I think it's because of the way the buffer is created:

`kCVPixelFormatType_24RGBA`

If I print the mlmodel, I get:


```
input {
  name: "sequential_1_input"
  type {
    imageType {
      width: 64
      height: 64
      colorSpace: RGB
    }
  }
}
output { ...
```

So this means the input must be RGB, but it doesn't say whether it's 24-bit or 32-bit.

If I drag and drop images into the model previewer, it works perfectly with the code commented out (see above).
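One way to take manual scaling out of the Swift side entirely would be to remove the `Rescaling` layer from the Keras model and let the converter bake the preprocessing into the image input instead. A sketch, where `model_without_rescaling` is a placeholder for that modified model:

```python
import coremltools as ct

# With an ImageType input, Core ML applies the scale before the first layer,
# so the Swift code can pass the CVPixelBuffer straight in.
mlmodel = ct.convert(
    model_without_rescaling,
    inputs=[ct.ImageType(shape=(1, 192, 192, 3), scale=1.0 / 255.0)],
)
mlmodel.save("mymodel.mlpackage")
```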

iulianflst commented 2 years ago

I finally figured it out. I was using the latest version of sci-lab; I downgraded to the requested version and it actually works (the AUTOTUNE part).

As for Core ML, the issue was the buffer, so: garbage in, garbage out; good stuff in, good stuff out :) Always make sure you don't feed it garbage :)

Thank you all, I finally made it work. I'm so happy!