webonnx / wonnx

A WebGPU-accelerated ONNX inference runtime written 100% in Rust, ready for native and the web

Can't execute OpenNSFW model: "only length-1 arrays can be converted to Python scalars" #179

Closed redthing1 closed 1 year ago

redthing1 commented 1 year ago

Describe the bug After converting the OpenNSFW ONNX model and fixing its input shape to be the constant (1, 224, 224, 3), attempt to execute it in WONNX.

This exact same model works fine in onnxruntime.

To Reproduce

Here is the model opennsfw_wonnx.zip

Steps to reproduce the behavior:

  1. Follow the instructions from the readme to install wonnx-py
  2. Inference like this:
    def preprocess_image_v2(self, raw_img):
        # image here was loaded by cv2.imread, so it's already a numpy array

        # resize to 256x256
        img = cv2.resize(raw_img, (256, 256))

        # crop the center 224x224 (take 16-240 on each axis)
        img = img[16:240, 16:240]

        # convert RGB to BGR
        img = img[:, :, ::-1]

        # subtract dataset mean
        img = img.astype(np.float32)
        img -= np.array([123.68, 116.779, 103.939], dtype=np.float32)

        # add a leading batch dimension -> (1, 224, 224, 3)
        img = np.expand_dims(img, axis=0)

        return img

    def infer_v2(self, image_data: bytes) -> float:
        """Classifies the the specified image.

        Arguments:
            image_data:
                JPEG/PNG/BMP/WEBP data to classify.

        Returns:
            A score indicating how likely it is that
            this model contains NSFW content.
        """

        img = self.preprocess_image_v2(image_data)

        print(f"opennsfw inference input img shape: {img.shape}")

        inference_inputs = {"input": img}
        inference_output_names = ["output"]
        inference_outputs = self.session.run(
            inference_inputs
        )

        print("opennsfw inference_outputs shape: ", inference_outputs.shape)
        print("opennsfw inference_outputs: ", inference_outputs)

        return inference_outputs[0][1]
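As a sanity check on the preprocessing shapes, the steps above can be sketched with plain numpy (the cv2.resize step is simulated here with an array already at the target 256x256 size, and zeros stand in for real pixel data):

```python
import numpy as np

# stand-in for the result of cv2.resize(raw_img, (256, 256))
img = np.zeros((256, 256, 3), dtype=np.float32)

img = img[16:240, 16:240]                 # center 224x224 crop
img = img[:, :, ::-1]                     # RGB -> BGR channel flip
img -= np.array([123.68, 116.779, 103.939], dtype=np.float32)
img = np.expand_dims(img, axis=0)         # add leading batch dimension

print(img.shape)  # (1, 224, 224, 3)
```

This matches the (1, 224, 224, 3) shape the fixed model expects.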

Expected behavior

I expect it to produce the same output as onnxruntime, namely a (1, 2) tensor.


redthing1 commented 1 year ago

From what I can tell, the cause is this code: https://github.com/webonnx/wonnx/blob/master/wonnx-py/src/lib.rs#L41

It seems to expect a Vec<f32>. But what about inputs with a more complex shape that aren't just a flat vector, like a (1, 224, 224, 3) tensor?

redthing1 commented 1 year ago

I figured this out: you have to reshape the input to a flat array before passing it to wonnx. Leaving this here for posterity.
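For anyone hitting the same error, a minimal sketch of the workaround (the input name "input" is taken from the model above; random data stands in for a real preprocessed image, and the session.run call is left commented out since it needs a loaded wonnx session):

```python
import numpy as np

# preprocessed image as produced by preprocess_image_v2: (1, 224, 224, 3)
img = np.random.rand(1, 224, 224, 3).astype(np.float32)

# wonnx-py expects each input as a flat sequence of f32 values,
# so collapse the tensor to 1-D before building the input dict
flat = img.reshape(-1)  # shape (150528,)

inference_inputs = {"input": flat}
# inference_outputs = self.session.run(inference_inputs)
```

The model's declared input shape tells wonnx how to reinterpret the flat buffer, so no information is lost by flattening.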