pbanavara opened 8 months ago
Open the model in Netron and see what the input name/s are. Every model is different and inputs are matched based on name not order (i.e. you must know the exact name).
Most likely you'll need to use onnxruntime-c and onnxruntime-objc packages as those contain a full build of ONNX Runtime (supports most recent ONNX opsets and all operators).
You may also want to consider adding pre/post processing to the model. There's an end-to-end tutorial for yolov8-pose here: https://onnxruntime.ai/docs/tutorials/mobile/pose-detection.html.
Simply resizing the image isn't sufficient to provide input to the ONNX model. You need to convert from the original image format to RGB, the channels need to come first, the data needs to be converted to float and normalized, and a batch dimension needs to be added so the input is 4D. A 640x640 image would become a 4D input of float data with shape {1, 3, 640, 640}, 3 being the channels in RGB order (not BGR, which is the default ordering some image conversions produce).
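As a sketch of those steps in numpy terms (assuming the image is already resized to 640x640 and decoded to RGB in HWC uint8 layout; the normalization to [0, 1] matches the usual YOLOv8 convention, but check your model's expectations):

```python
import numpy as np

def preprocess(img_hwc_uint8):
    """HWC uint8 RGB image (640x640) -> float32 NCHW tensor {1, 3, 640, 640}."""
    x = img_hwc_uint8.astype(np.float32) / 255.0  # bytes -> float, normalized to [0, 1]
    x = np.transpose(x, (2, 0, 1))                # HWC -> CHW: channels first
    return np.expand_dims(x, axis=0)              # add batch dim -> {1, 3, 640, 640}

dummy = np.zeros((640, 640, 3), dtype=np.uint8)   # stand-in for a real decoded image
print(preprocess(dummy).shape)  # (1, 3, 640, 640)
```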
@skottmckay Thank you. The input name should be "images" as per the visualization in Netron.
Now I am working on changing the dimensions of the image.
I did see the tutorial. In fact I just copied the Android steps to iOS :). Maybe I missed something. Perhaps the rawImageBytes is already transformed to a 4D input here:
val shape = longArrayOf(rawImageBytes.size.toLong())
Converting an image to the required rank is turning out to be more complicated than I expected.
The iOS data structure that supports this kind of ranked structure (somewhat like numpy) is MLMultiArray, and converting the image data to an MLMultiArray involves using pointers that I am not familiar with.
Even if I somehow manage to convert the image to an MLMultiArray, the ONNX API expects an NSMutableData. I have absolutely no idea how to convert the MLMultiArray to NSMutableData, and Stack Overflow and the general web are of no help.
So I tried the ONNX model that already has pre- and post-processing included, and set the input and output names as per the Netron visualization.
However, I get this message at the load-model line:
let ortSession = try ORTSession(env: ortEnv, modelPath: modelPath, sessionOptions: nil)
Load model from /Users/pbanavara/Library/Developer/CoreSimulator/Devices/955A6321-4431-448B-9405-3B46B1EC4440/data/Containers/Bundle/Application/D9B01188-77C5-4DD4-8346-F015E1DF83C4/onnx.app/yolov8n-pose-pre.onnx failed: Fatal error: com.microsoft.extensions:DecodeImage(-1) is not a registered function/op
EDIT: I installed the onnxruntime-extensions-c pod and built the pre-post-process onnx file as per this link
Still same error.
Greatly appreciate any help in resolving this.
To use onnxruntime-extensions-c, you'll need to register its custom ops with the session options object. Here's an example: https://github.com/microsoft/onnxruntime-extensions/blob/62bbcb59a22fdf45b40d45d3245224684c6a8cba/test/ios/OrtExtensionsUsage/OrtClient/OrtSwiftClient.swift#L16-L24
As for going from MLMultiArray to NSMutableData, you can get a void* from MLMultiArray and pass it to one of the NSMutableData initializers. E.g.:
From MLMultiArray:
https://developer.apple.com/documentation/coreml/mlmultiarray/3929555-getmutablebyteswithhandler?language=objc
To NSMutableData: https://developer.apple.com/documentation/foundation/nsdata/1547231-datawithbytes?language=objc
@edgchen1 Thank you. Registering the custom ops works. Appreciate the help.
Another question, if someone can help out or give some pointers. I have used the pre- and post-processing pose model as per this link.
I have fed the raw image to this model as per the above Swift code.
The raw image size is
Optional(4284.0) Optional(5712.0)
The output tensor data as an array gives this:
[0, 53, 17, 69, 81, 117, 93, 69, 224, 188, 38, 69, 76, 188, 87, 69, 118, 114, 90, 63, 0, 0, 0, 0, 163, 142, 188, 68, 167, 106, 45, 69, 18, 103, 50, 63, 176, 93, 202, 68, 105, 65, 38, 69, 196, 0, 9, 63, 30, 106, 179, 68, 137, 126, 41, 69, 108, 43, 7, 63, 162, 97, 231, 68, 90, 176, 20, 69, 8, 101, 121, 62, 41, 41, 168, 68, 42, 235, 26, 69, 248, 202, 195, 61, 106, 69, 16, 69, 117, 230, 4, 69, 64, 66, 64, 63, 164, 62, 171, 68, 26, 119, 24, 69, 82, 242, 82, 63, 96, 103, 79, 69, 140, 161, 16, 69, 4, 173, 65, 63, 201, 24, 185, 68, 17, 166, 40, 69, 88, 134, 105, 63, 47, 8, 78, 69, 100, 6, 67, 69, 224, 85, 76, 63, 69, 163, 182, 68, 223, 123, 66, 69, 244, 2, 101, 63, 141, 115, 31, 69, 139, 233, 49, 69, 8, 151, 126, 63, 218, 239, 234, 68, 122, 138, 61, 69, 158, 24, 127, 63, 253, 94, 44, 69, 70, 94, 78, 69, 220, 84, 126, 63, 19, 156, 176, 68, 57, 253, 117, 69, 231, 41, 127, 63, 79, 33, 62, 69, 48, 148, 99, 69, 223, 0, 124, 63, 101, 110, 192, 68, 164, 212, 145, 69, 132, 44, 125, 63]
Based on this line in yolov8_pose_e2e.py
(box, score, _, keypoints) = np.split(result, (4, 5, 6))
Per np.split(result, (4, 5, 6)), the first 4 entries (indices 0-3) are the bounding box, index 4 is the score, index 5 is the class, and indices 6 to the end are the keypoints.
The shape of the output tensor is (1, 57): 51 entries for the keypoints (17 keypoints x (x, y, confidence)), 4 entries for the bounding box, one for the score and one for the class, which is perfect.
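For illustration, here is that split applied to a dummy 57-element row. np.split with boundaries (4, 5, 6) produces four pieces; the interpretation is the one used in yolov8_pose_e2e.py:

```python
import numpy as np

# Dummy single-detection row: 4 box + 1 score + 1 class + 51 keypoint values = 57.
result = np.arange(57, dtype=np.float32)

box, score, cls, keypoints = np.split(result, (4, 5, 6))
print(box.shape, score.shape, cls.shape, keypoints.shape)  # (4,) (1,) (1,) (51,)

keypoints = keypoints.reshape(17, 3)  # 17 rows of (x, y, confidence)
```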
However, the length of the data array above is 228, and 228 - 6 = 222. I don't know how to interpret this length of 222; it should be 51 for the 17 keypoints. Can someone please explain?
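One hedged guess, since the Swift code that prints the array isn't shown: 228 = 57 x 4, so the array may be the raw little-endian bytes of 57 float32 values rather than the floats themselves. Reinterpreting the first 8 bytes of the dump above is consistent with that:

```python
import numpy as np

# First 8 bytes copied verbatim from the printed array above.
raw = bytes([0, 53, 17, 69, 81, 117, 93, 69])

floats = np.frombuffer(raw, dtype="<f4")  # explicit little-endian float32
print(floats)    # roughly 2323.3 and 3543.3 -- plausible pixel coordinates
print(228 // 4)  # 57: one float per output-tensor entry
```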
Next, I am not sure if these are scaled values, because if I just plot these points on the image they are way off, like the first 4 points for the bbox.
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.
Describe the issue
Trying out the yolov8 pose models for iOS using ONNX. I have exported the model to ONNX using the following code.
I am trying to run this model on an image in iOS. Here is the crux of the code for running ONNX on the image. I resized the image to 640x640.
I get the following error at the ortSession.run line
Error Domain=onnxruntime Code=2 "Invalid Feed Input Name:input" UserInfo={NSLocalizedDescription=Invalid Feed Input Name:input}
No idea what this means. Can anyone please help? I tried renaming the string "input" to "image" etc., but get the same error.
Environment
Xcode 15.2, MacBook Pro M3 Max chip, iPhone 15 simulator, min iOS version 17.2
To reproduce
Clone the repo, set your developer account info, and run.
Urgency
No response
Platform
iOS
OS Version
17.2
ONNX Runtime Installation
Released Package
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
onnxruntime-mobile-objc/onnxruntime-mobile-c
ONNX Runtime Version or Commit ID
1.16.0
ONNX Runtime API
Objective-C/Swift
Architecture
ARM64
Execution Provider
CoreML
Execution Provider Library Version
No response