Improve flexibility of input types for image classification

mg-yolo-enterprises commented 1 year ago

Is your feature request related to a problem? Please describe.

The image classifier in ML.NET currently accepts input in the form of a byte array (only certain file formats, e.g. PNG, JPEG, GIF), which is flexible enough for many scenarios. However, the source image sometimes originates in a System.Drawing.Bitmap, an OpenCV Mat, or another managed or unmanaged memory block that is not in PNG/JPEG/GIF format. To run a prediction on this kind of data, it must first be converted, which takes time and requires allocations for data that already exists.

In a production computer vision environment where image data is streamed in as System.Drawing.Bitmap objects whose RawFormat is MemoryBmp, it is undesirable to spend 25-30 ms per image encoding each Bitmap as JPEG into a managed byte[] (80 ms for PNG) when this could be avoided by allowing more direct consumption of other formats and data types.

Describe the solution you'd like

Direct consumption of System.Drawing.Bitmap (RawFormat = MemoryBmp) by ML.NET, in a way that avoids managed allocations and format conversion, would be ideal. Accepting an IntPtr to unmanaged pixel data would be fine as well, with support for the BMP format.

Describe alternatives you've considered

Various methods of converting Bitmap data into byte[] were tested and are functional, but none achieves the desired performance.

Additional context

The general use case is processing images received from a camera that streams them in as Bitmap objects. Having to convert each one to managed memory in one of several encoded formats is a bottleneck that the solution described above would ideally avoid.

michaelgsharp commented 10 months ago

@luisquintanilla thoughts on this?

luisquintanilla commented 10 months ago

@michaelgsharp this issue is mainly related to image classification. If we update to using the new model, maybe this issue is no longer relevant and it can be addressed then. Let's add it to Future.

dotnet / machinelearning

Improve flexibility of input types for image classification #6590