microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[iOS] Output of type sequence<map<int64,float32>> causes crash on iOS #19867

Open · adam-fallon opened 7 months ago

adam-fallon commented 7 months ago

Describe the issue

I am using a model that is an SKLearn TreeEnsembleClassifier with two outputs: one of type int64, which works fine, and another of type sequence<map<int64,float32>>, which causes a crash on macOS and iOS.

[Screenshot: Netron view of the model's two outputs]

I am using the Swift package that wraps onnxruntime (https://github.com/microsoft/onnxruntime-swift-package-manager). I tried both versions 1.16.0 and 1.17.0, and both have the same crash.

It seems like this should be supported - it works fine on Android - but I suspect it doesn't work because the iOS implementation may be using the C ONNX types under the hood, which might not support this complex type. That is just a guess, though.

Here is the crash inside ort_value.m

[Screenshot: crash location inside ort_value.m]

To reproduce

Here is the code. I can't give you the model, but the Netron screenshot above should illustrate how to recreate the crash - you can make a simple TreeEnsembleClassifier - though I don't think you need to in order to explain why this crashes.

//
//  Predictor.swift
//  ONNX Test
//
//  Created by Adam Fallon on 07/03/2024.
//

import OnnxRuntimeBindings
import OnnxRuntimeExtensions

enum TrainerError: Error {
    case modelFileNotFound
    case modelInferenceFailed(String)
    case inferenceFailure(String)
}

public struct Predictor {
    public init() {}

    func predict(for inputArray: [[Float]]) throws -> [Int] {

        guard let modelPath = Bundle.main.path(
            forResource: "sklearn_model",
            ofType: "onnx"
        ) else {
            throw TrainerError.modelFileNotFound
        }

        let ortEnv = try ORTEnv(
            loggingLevel: ORTLoggingLevel.warning
        )
        let ortSession = try ORTSession(
            env: ortEnv,
            modelPath: modelPath,
            sessionOptions: nil
        )

        let inputData = inputArray
            .flatMap { $0 }
            .withUnsafeBufferPointer {
                Data(buffer: $0)
            }

        let numSamples = 1
        let numFeatures = 4

        let inputShape: [NSNumber] = [
            NSNumber(value: numSamples),
            NSNumber(value: numFeatures)
        ]

        guard let inputName = try ortSession.inputNames().first else {
            fatalError("Failed to get input node name")
        }

        let outputNames = try Set(ortSession.outputNames())

        let input = try ORTValue(
            tensorData: NSMutableData(
                data: inputData
            ),
            elementType: ORTTensorElementDataType.float,
            shape: inputShape
        )

        let outputs = try ortSession.run(
            withInputs: [inputName: input],
            outputNames: outputNames,
            runOptions: nil
        )

        guard
            let output = outputs["output_label"],
            let outputData = try? output.tensorData() as Data
        else {
            throw TrainerError.modelInferenceFailed("Failed to get model output from model.")
        }

        let output_label_predictions = outputData.withUnsafeBytes { (pointer: UnsafeRawBufferPointer) -> [Int] in
            let bufferPointer = pointer.bindMemory(to: Int.self)
            return Array(bufferPointer)
        }

        guard let max = output_label_predictions.max() else {
            throw TrainerError.inferenceFailure("Failed to get prediction from output label")
        }

        let output_label = [max]

        // CRASHES HERE!
        guard
            let shapeInfo = try outputs["output_probability"]?.tensorTypeAndShapeInfo(),
            let output = try outputs["output_probability"]?.tensorData()
        else {
            throw TrainerError.modelInferenceFailed("Failed to get model output from model.")
        }

        print(shapeInfo)
        print(output)

        return output_label
    }
}

Urgency

No - not urgent.

Platform

iOS

OS Version

iOS 17.0.1

ONNX Runtime Installation

Released Package

Compiler Version (if 'Built from Source')

-

Package Name (if 'Released Package')

onnxruntime-objc/onnxruntime-c

ONNX Runtime Version or Commit ID

1.16.0

ONNX Runtime API

Objective-C/Swift

Architecture

ARM64

Execution Provider

Default CPU, CoreML

Execution Provider Library Version

-

edgchen1 commented 7 months ago

Unfortunately, the Objective-C API doesn't support non-tensor types yet. The tensorTypeAndShapeInfo API assumes that the ORTValue is a tensor.

We can look into adding support for more types.
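
In the meantime, one possible workaround (an untested sketch against your repro code above) is to request only the tensor-typed output from run(), so the sequence<map<int64,float32>> value is never returned or touched on the Swift side:

// Sketch only: fetch just "output_label" and omit "output_probability",
// so the unsupported sequence<map<int64,float32>> output is never
// materialized as an ORTValue in Swift.
let outputs = try ortSession.run(
    withInputs: [inputName: input],
    outputNames: ["output_label"],
    runOptions: nil
)

// "output_label" is a plain int64 tensor, so tensorData() works as before.
guard
    let labelValue = outputs["output_label"],
    let labelData = try? labelValue.tensorData() as Data
else {
    throw TrainerError.modelInferenceFailed("Failed to get output_label.")
}
// ... decode labelData into Int values as in the original code ...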

adam-fallon commented 7 months ago

Hey - that would be great, but really what I was looking for by opening this issue was confirmation that I wasn't missing something obvious - which you've given here, much appreciated!

I know it is hard to answer, but it would be good to get a sense of how much work this would be. Is it an astronomical amount, or is it a matter of updating a few enums and adding a few mappers to bridge the types?

I'm also curious why it works on Android - was that a heavy lift, or was there some reason it was easier to do on that platform?

edgchen1 commented 7 months ago

Is it an astronomical amount or is it updating a few enums and adding a few mappers to bridge the types?

I think it would be closer to the latter. The ORT C/C++ API (that the Objective-C API uses) should already have the necessary support for these types.

Also curious to know why it works on Android - was that a heavy lift or was there some reason it was easier to do on that platform?

The Java API happens to be more complete than the Objective-C API now. We've added to the Objective-C API as needed.

It's good to get feedback about desired functionality that's not supported yet though. Thanks. Will bring it up with the team.

edgchen1 commented 7 months ago

Do you have a production use case for a model with these unsupported types? Knowing this would help us prioritize the work for adding support.

adam-fallon commented 7 months ago

I will ask the MLEs I am working with tomorrow (UK time) if they can provide a helpful answer here. For context, I am an iOS developer exploring the possibility of running some of our existing and future models on device, so I am evaluating a few model runtimes. I will hopefully come back with solid model use cases tomorrow.

Really appreciate your time answering questions here, and hope you have a nice evening!

adam-fallon commented 7 months ago

Hey @edgchen1 - just to close the loop on this, and apologies for the delay - we're talking internally about whether we could modify the sklearn -> onnx conversion to sidestep this problem. Not sure yet if that is possible, but we're going to try it.
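
For reference, if we can change the conversion so that the probabilities come out as a plain float32 tensor (skl2onnx's ZipMap post-processing is what produces the sequence<map<int64,float32>> form, so exporting with that step disabled should give a [numSamples, numClasses] tensor), then reading them on the Swift side looks just like the label path that already works. A rough sketch, assuming the re-exported output is a float32 tensor with the hypothetical name "probabilities":

// Rough sketch: assumes the model was re-exported so the probabilities are a
// plain float32 tensor output (hypothetical name "probabilities") with shape
// [numSamples, numClasses], instead of sequence<map<int64,float32>>.
guard
    let probValue = outputs["probabilities"],
    let probData = try? probValue.tensorData() as Data
else {
    throw TrainerError.modelInferenceFailed("Failed to get probabilities output.")
}

let probabilities = probData.withUnsafeBytes { (pointer: UnsafeRawBufferPointer) -> [Float] in
    Array(pointer.bindMemory(to: Float.self))
}
// probabilities[i * numClasses + j] is the score for class j of sample i.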

iHarnastaeu commented 7 months ago

Hello! I've faced the same problem in my project. Can you please inform me about the timeline for a new release with support for other output types? Also, if there are any temporary solutions or workarounds, I'd appreciate your advice. Thank you!