apple / coremltools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
https://coremltools.readme.io
BSD 3-Clause "New" or "Revised" License

Unrecognized sequence type in KNearestNeighbors #1206

Open smrfeld opened 3 years ago

smrfeld commented 3 years ago

🐞Describe the bug

I am trying to test a KNearestNeighbors classifier. The classifier is made using KNearestNeighborsClassifierBuilder. When I load and test the mlmodel file, I encounter the error "RuntimeError: Error: Unrecognized sequence type."

Trace

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-48-e33f53ea43fd> in <module>
      1 test_input = np.random.rand(20)
----> 2 model.predict({'my_in': test_input})

~/opt/anaconda3/lib/python3.7/site-packages/coremltools/models/model.py in predict(self, data, useCPUOnly, **kwargs)
    326 
    327         if self.__proxy__:
--> 328             return self.__proxy__.predict(data, useCPUOnly)
    329         else:
    330             if _macos_version() < (10, 13):

RuntimeError: Error: Unrecognized sequence type.

To Reproduce

Here is a minimal example: first create the classifier:

import numpy as np
import coremltools
from coremltools.models.nearest_neighbors import KNearestNeighborsClassifierBuilder
from coremltools.models.utils import save_spec

my_inputs = np.random.rand(10,20)
my_outputs = (5* np.random.rand(10)).astype(int)

builder = KNearestNeighborsClassifierBuilder(
    input_name='my_in',
    output_name='my_out',
    number_of_dimensions=my_inputs.shape[1],
    default_class_label=0,
    number_of_neighbors=20
    )

save_spec(builder.spec, "test.mlmodel")

Then load and test it:

model = coremltools.models.MLModel('test.mlmodel')
test_input = np.random.rand(20)
model.predict({'my_in': test_input}) # Here the error is thrown

System environment (please complete the following information):

Related

Related: #898 was closed without explanation.

TobyRoseman commented 3 years ago

Thanks for the minimal example. Taking a look at print(model), I see that the only input to the model (my_in) is a multiArray of FLOAT32 with shape 20. However, np.random.rand(20) returns a tuple with just one element. That one element is an array of length 20, but it's float64, not float32.

So test_input should be np.random.rand(20).astype('float32')[0] rather than np.random.rand(20).

smrfeld commented 3 years ago

Thanks for the response, but it did not resolve the issue:

The input np.random.rand(20).astype('float32')[0] would be a single number, which doesn't make sense: the input should be an array of length 20.

~/opt/anaconda3/lib/python3.7/site-packages/coremltools/models/model.py in predict(self, data, useCPUOnly, **kwargs)
    326 
    327         if self.__proxy__:
--> 328             return self.__proxy__.predict(data, useCPUOnly)
    329         else:
    330             if _macos_version() < (10, 13):

RuntimeError: value type not convertible
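
As a quick check of the point about the [0] subscript, the following minimal sketch uses only NumPy (no Core ML model is involved) to show what each expression from the thread evaluates to. The variable names are illustrative and do not come from the original posts.

```python
import numpy as np

# np.random.rand(20) returns a single ndarray of 20 values, not a tuple.
full = np.random.rand(20)
print(type(full).__name__, full.shape, full.dtype)

# Casting keeps the shape and only changes the element type.
as_f32 = full.astype('float32')
print(as_f32.shape, as_f32.dtype)

# Indexing with [0] selects the first element: a NumPy scalar with zero dimensions.
first = as_f32[0]
print(type(first).__name__, np.ndim(first))
```

So the subscripted expression hands predict a single float32 value rather than a length-20 array.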


Finally, if I use the correct input and explicitly set the type to float32:

test_input = np.random.rand(20).astype('float32')
model.predict({'my_in': test_input})

I still get the original "Error: Unrecognized sequence type." error.

Thanks

smrfeld commented 3 years ago

@TobyRoseman can you re-open this issue? Thanks

TobyRoseman commented 3 years ago

I’m seeing different behavior from numpy. What version of numpy are you using?

smrfeld commented 3 years ago

I'm on numpy 1.19.5

TobyRoseman commented 3 years ago

I can reproduce this issue with numpy 1.19.5 and coremltools 4.1 on macOS 11.

However, this problem seems to have already been fixed. If you upgrade to our latest pre-release, 5.0b1, and to the latest numpy, you no longer get that error. Note: you will need to regenerate the mlmodel. Also, with the latest version of numpy, the input to predict should be {'my_in': np.random.rand(20)[0].astype('float32')} rather than {'my_in': np.random.rand(20).astype('float32')}.

However, using the example code, the call to predict just hangs. I'm guessing this is because the model isn't really defined, and the behavior will be different on a model that is defined.

smrfeld commented 3 years ago

First: thanks for all your help, I really appreciate it.

Somehow, we're still getting different results, even after I updated numpy to 1.20.3 (which looks to be the latest) and coremltools to 5.0b1. Here is a complete example I tried to run.

from coremltools.models.nearest_neighbors import KNearestNeighborsClassifierBuilder
from coremltools.models.utils import save_spec
import numpy as np
import coremltools

# Spit out the versions
print(np.__version__) # output: 1.20.3
print(coremltools.__version__) # output: 5.0b1

# Make the model
my_inputs = np.random.rand(10,20).astype('float32')
my_outputs = (5* np.random.rand(10)).astype(int)

builder = KNearestNeighborsClassifierBuilder(
    input_name='my_in',
    output_name='my_out',
    number_of_dimensions=my_inputs.shape[1],
    default_class_label=0,
    number_of_neighbors=20
    )

# Add data
builder.add_samples(
    data_points=my_inputs,
    labels=my_outputs
)

# Save the model
save_spec(builder.spec, "test.mlmodel")

# Load the model
model = coremltools.models.MLModel('test.mlmodel')

# Make the prediction
model.predict({'my_in': np.random.rand(20)[0].astype('float32')}) # Error here

Please note that I added the data to the model with builder.add_samples. Here is the error that I get from the last line - first it prints:

Error: value type not convertible:
0.38510618

which looks like a good sign, but I also get a big red error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-27-9bb443041eca> in <module>
----> 1 model.predict({'my_in': np.random.rand(20)[0].astype('float32')})

~/opt/anaconda3/lib/python3.7/site-packages/coremltools/models/model.py in predict(self, data, useCPUOnly, **kwargs)
    384 
    385         if self.__proxy__:
--> 386             return self.__proxy__.predict(data, useCPUOnly)
    387         else:
    388             if _macos_version() < (10, 13):

RuntimeError: value type not convertible

I still don't understand the [0] subscript. When I print out the model, I see:

input {
  name: "my_in"
  type {
    multiArrayType {
      shape: 20
      dataType: FLOAT32
    }
  }
}

So it expects an array of size 20 as input. When I run np.random.rand(20).astype('float32') I get:

array([0.01490434, 0.81673574, 0.66146064, 0.78613   , 0.34884384,
       0.24923398, 0.75416636, 0.9488416 , 0.74992746, 0.92198753,
       0.60052043, 0.53834635, 0.83138   , 0.79459745, 0.0137482 ,
       0.40938067, 0.7572621 , 0.75679594, 0.45067838, 0.20595431],
      dtype=float32)

which seems correct because it is an array of size 20, and it matches the documentation under v1.20. When I use this array as input I get the original error:

model.predict({'my_in': np.random.rand(20).astype('float32')})

gives:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-34-f0521e7e15c0> in <module>
----> 1 model.predict({'my_in': np.random.rand(20).astype('float32')})

~/opt/anaconda3/lib/python3.7/site-packages/coremltools/models/model.py in predict(self, data, useCPUOnly, **kwargs)
    384 
    385         if self.__proxy__:
--> 386             return self.__proxy__.predict(data, useCPUOnly)
    387         else:
    388             if _macos_version() < (10, 13):

RuntimeError: Error: Unrecognized sequence type.
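
To compare a candidate input against the declared input printed above (multiArrayType, shape 20, dataType FLOAT32), a small pre-flight check can help rule out shape and dtype mismatches. This is only a sketch: check_multiarray_input is a hypothetical helper, not part of coremltools, and the expected shape and dtype are copied by hand from the printed description.

```python
import numpy as np

def check_multiarray_input(value, expected_shape=(20,), expected_dtype=np.float32):
    """Hypothetical helper: list mismatches between value and the declared input."""
    problems = []
    if not isinstance(value, np.ndarray):
        # NumPy scalars such as np.float32 are not ndarrays.
        problems.append(f"expected numpy.ndarray, got {type(value).__name__}")
        return problems
    if value.shape != expected_shape:
        problems.append(f"expected shape {expected_shape}, got {value.shape}")
    if value.dtype != expected_dtype:
        problems.append(
            f"expected dtype {np.dtype(expected_dtype).name}, got {value.dtype.name}"
        )
    return problems

# The three inputs tried in this thread.
print(check_multiarray_input(np.random.rand(20).astype('float32')))
print(check_multiarray_input(np.random.rand(20)))
print(check_multiarray_input(np.random.rand(20)[0].astype('float32')))
```

Passing this check does not guarantee that predict succeeds: the float32 array of length 20 has no mismatches here, yet it is the input that produces the "Unrecognized sequence type" error above.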

Sorry for all the headaches; we are just trying to do some testing for models intended for production on iOS.

benjaminkech commented 1 month ago

I'm having the same issue with coremltools 8.0, macOS 15, and numpy 2.1.2.