apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io

Segmentation Fault when calling model.predict #1059

Open ben-xD opened 3 years ago

ben-xD commented 3 years ago

Hello fellow developers 👋

🐞Describe the bug

Trace

No trace, just [1] 80595 segmentation fault python3 python_file_name.py
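Since the crash happens in native code, Python prints no traceback. One way to get at least the Python-side stack at the moment of the fault (a debugging sketch using only the standard library) is faulthandler:

import faulthandler
faulthandler.enable()  # dump the Python traceback of all threads if the process gets SIGSEGV

# ... then run the reproduction below in the same process ...

Equivalently, run the script with python3 -X faulthandler python_file_name.py.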

To Reproduce

First install dependencies: pip install tensorflow numpy keras-vggface coremltools keras_applications

import coremltools as ct
from keras_vggface import VGGFace
import numpy as np
from tensorflow.keras.preprocessing import image
from keras_vggface import utils

def create_core_ml_model():
    # NHWC input matching the Keras model's expected input shape
    input_type = ct.TensorType(shape=(1, 224, 224, 3))  # renamed so it doesn't shadow the builtin
    keras_model = VGGFace(model="senet50", pooling="avg", include_top=False, input_shape=(224, 224, 3))
    coreml_model = ct.convert(keras_model, inputs=[input_type])
    coreml_model.save("model.mlmodel")

create_core_ml_model()

# Download a random image
image_path = "https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftinyjpg.com%2Fimages%2Fsocial%2Fwebsite.jpg&f=1&nofb=1"

import urllib.request

r = urllib.request.urlopen(image_path)
with open("image.jpg", "wb") as f:
    f.write(r.read())
img = image.load_img('image.jpg', target_size=(224, 224))

x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = utils.preprocess_input(x, version=2)

coreml_model = ct.models.MLModel("model.mlmodel")
output_dictionary = coreml_model.predict({"input_1": x})  # <---- THIS IS WHERE IT SIGSEGVs, with no other warnings
output = output_dictionary["Identity"][0]
print("output: ", output)

System environment (please complete the following information):

ben-xD commented 3 years ago

Here is the converted model's layer distribution from Xcode too; maybe using TensorType inputs along with some of these layers causes an issue?

(Two screenshots of the Xcode layer distribution attached, 2021-01-14.)
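If the suspicion is that TensorType inputs interact badly with some of these layers, one cross-check (a sketch only; the scale value is illustrative and does not reproduce keras_vggface's version=2 preprocessing) is to convert with an ImageType input instead:

# Sketch: same model, ImageType input instead of TensorType.
image_input = ct.ImageType(shape=(1, 224, 224, 3), scale=1 / 255.0)
coreml_model = ct.convert(keras_model, inputs=[image_input])
coreml_model.save("model_image_input.mlmodel")

predict would then take a PIL.Image rather than a NumPy array, so the test script would need a small change.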

TobyRoseman commented 3 years ago

I can reproduce this issue.

I suspect this is an overflow issue. The neural network is quite deep (320 layers), and x contains values as large as 163.

The following code works fine.

for _ in range(100):
    z = np.random.rand(1, 224, 224, 3)
    output_dictionary = coreml_model.predict({"input_1": z})

np.random.rand produces values between 0 and 1.
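If overflow is the cause, the crash should track input magnitude rather than content. A quick check along those lines (a sketch; x is the preprocessed array from the reproduction above):

# Same image, same pipeline, but rescaled into [0, 1].
# If this predicts fine while the raw-magnitude x crashes, the failure
# correlates with magnitude, consistent with an overflow deep in the network.
x_small = (x - x.min()) / (x.max() - x.min())
output_dictionary = coreml_model.predict({"input_1": x_small})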

ben-xD commented 3 years ago

Interesting. I have to admit the model's output doesn't look well formed: some values are 1000+, others 0.001. Let me know if there's anything I can help with if you want to dig into this further.

piraka9011 commented 1 year ago

I also hit this on the latest main @ 3569369 when my model has too many dynamic shapes. Tested on Python 3.8 and 3.9, macOS 12.3.1 (MBP M1 Pro), Xcode 13.4.1 (build 13F100).

For me specifically, I have an acoustic model with two inputs: audio_signal with shape (Batch, Features, SequenceLength) and length with shape (Batch,), where Batch and SequenceLength are dynamic (modeled using ct.RangeDim()). If Batch is fixed to 1, however, I can run predict with no SIGSEGV. (A bounded-shape workaround sketch follows the example below.)

Example

import coremltools as ct
from nemo.collections.asr.models import EncDecCTCModelBPE
import torchaudio
import torch

pre_trained_model_name = "stt_en_citrinet_256"
model = EncDecCTCModelBPE.from_pretrained(pre_trained_model_name, map_location='cpu')
model.eval()
input_example = model.encoder.input_example()
example_input = input_example[0]
example_input_len = input_example[1]

# Does not segfault if shape=(1, example_input.shape[1], ct.RangeDim())
audio_signal_shape = ct.Shape(shape=(ct.RangeDim(), example_input.shape[1], ct.RangeDim()))
# Does not segfault if shape=(1,)
length_shape = ct.Shape(shape=(ct.RangeDim(),))

# NeMo Export
export_output_path = f"/tmp/{pre_trained_model_name}.ts"
model.export(
    export_output_path,
    check_trace=True,
    input_example=(example_input, example_input_len)
)

# CoreML Convert
scripted_model = torch.jit.load(export_output_path)
ct_model = ct.convert(
    scripted_model,
    convert_to="mlprogram",
    inputs=[
        ct.TensorType(name="audio_signal", shape=audio_signal_shape),
        ct.TensorType(name="length", shape=length_shape)
    ],
    outputs=[ct.TensorType(name="log_probs")],
    compute_units=ct.ComputeUnit.ALL,
)
ct_model_output_path = f"/tmp/{pre_trained_model_name}.mlpackage"
ct_model.save(ct_model_output_path)

# Testing
example_wav_file = "/path/to/audio.wav"
input_signal, sr = torchaudio.load(example_wav_file)
input_signal_shape = torch.tensor([input_signal.shape[1]])
processed_signal, processed_signal_length = model.preprocessor(
    input_signal=input_signal, length=input_signal_shape
)
# Or just use `example_input` and `example_input_len` instead of the audio file.
coreml_inputs = {
    "audio_signal": processed_signal.to(torch.int32).numpy(),
    "length": processed_signal_length.to(torch.int32).numpy(),
}
coreml_outputs = ct_model.predict(coreml_inputs)
log_probs = coreml_outputs['log_probs']

You can try it on any audio file from https://huggingface.co/datasets/librispeech_asr; you might need to convert to wav first with ffmpeg -i audio.mp3 -ar 16000 -ac 1 audio.wav.
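Since Batch = 1 works, one workaround worth trying (a sketch; the upper bounds are made-up values I have not verified) is to give the RangeDims explicit bounds instead of leaving them unbounded:

# Bounded dynamic dimensions; ct.RangeDim takes (lower_bound, upper_bound).
audio_signal_shape = ct.Shape(
    shape=(ct.RangeDim(1, 8), example_input.shape[1], ct.RangeDim(1, 4096))
)
length_shape = ct.Shape(shape=(ct.RangeDim(1, 8),))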

@TobyRoseman, is there a way we can debug this to figure out the root cause? It seems to be an issue with Core ML according to the crash report / stack trace the OS generated, available here (also reported to Apple, FWIW 🤷).