JamesRitchie / scikit-rvm

Relevance Vector Machine implementation using the scikit-learn API.

Found array with 0 sample #9

Closed: siavashserver closed this issue 6 years ago

siavashserver commented 7 years ago

Hi. Why am I getting this error:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    print(clf.predict([[1, 1]]))
  File "/usr/lib/python3.5/site-packages/skrvm/rvm.py", line 201, in predict
    phi = self._apply_kernel(X, self.relevance_)
  File "/usr/lib/python3.5/site-packages/skrvm/rvm.py", line 82, in _apply_kernel
    phi = linear_kernel(x, y)
  File "/usr/lib/python3.5/site-packages/sklearn/metrics/pairwise.py", line 729, in linear_kernel
    X, Y = check_pairwise_arrays(X, Y)
  File "/usr/lib/python3.5/site-packages/sklearn/metrics/pairwise.py", line 111, in check_pairwise_arrays
    warn_on_dtype=warn_on_dtype, estimator=estimator)
  File "/usr/lib/python3.5/site-packages/sklearn/utils/validation.py", line 416, in check_array
    context))
ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required by check_pairwise_arrays.

with the following code:

from skrvm import RVR

X = [[2.1, 4], [2, 2]]
y = [0.5, 2.5]

clf = RVR(kernel='linear')
clf.fit(X, y)

print(clf.predict([[1, 1]]))

Am I using it correctly?

Alpus commented 7 years ago

@siavashserver, you can pass bias_used=False when initializing the class. It fixed the same exception for me when I used RVC. Unfortunately, I can't explain why it helps or what it changes in the algorithm; I just read the sources and found the cause of the exception.

So, this should work:

from skrvm import RVR

X = [[2.1, 4], [2, 2]]
y = [0.5, 2.5]

clf = RVR(kernel='linear', bias_used=False)
clf.fit(X, y)

print(clf.predict([[1, 1]]))

siavashserver commented 7 years ago

@Alpus Thank you very much for the help; that indeed gets rid of the nasty error! :)

I just gave the code sample on the home page a try with bias_used=False, and it returns 1.20 instead of 1.49. I have no idea about its effect on more complex data and other kernels; I should give it a try later.
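For context, a minimal sketch of that comparison; the X and y values here are assumed from the project's two-point linear-kernel example and may not match the README exactly:

from skrvm import RVR

# Assumed README-style toy data; the exact values may differ.
X = [[0, 0], [2, 2]]
y = [0.5, 2.5]

# With the default bias, the home page reports a prediction near 1.49 for
# [[1, 1]]; with bias_used=False, the same call returned about 1.20.
clf = RVR(kernel='linear', bias_used=False)
clf.fit(X, y)
print(clf.predict([[1, 1]]))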

For future Googlers, there is also https://github.com/AmazaspShumik/sklearn-bayes as an actively maintained alternative.

woctezuma commented 5 years ago

As for sklearn-bayes, the code is not compatible with Python 3.7.


In my experience, the error only arises if kernel='linear' and the feature dimension is 1. As mentioned above, the error is not triggered when bias_used is set to False. Another solution is to add a dummy dimension to the features. Both fixes are shown below.

For instance, the following code:

import matplotlib.pyplot as plt
import numpy as np
from skrvm import RVR

# parameters
n = 500

# generate data set
np.random.seed(0)
X = np.ones([n, 1])
X[:, 0] = np.linspace(-5, 5, n)
y = 10 * np.sinc(X[:, 0]) + np.random.normal(0, 1, n)

# train rvr
rvm = RVR()
rvm.fit(X, y)
y_hat = rvm.predict(X)

# plot test vs predicted data
plt.figure()
plt.plot(X[:, 0], y, "b+", markersize=3, label="test data")
plt.plot(X[:, 0], y_hat, "rD", markersize=3, label="mean of predictive distribution")
plt.show()

returns: [plot: the sinc data fitted with the default non-linear kernel]

However, with a linear kernel:

import matplotlib.pyplot as plt
import numpy as np
from skrvm import RVR

# parameters
n = 500

# generate data set
np.random.seed(0)
X = np.ones([n, 1])
X[:, 0] = np.linspace(-5, 5, n)
y = 10 * np.sinc(X[:, 0]) + np.random.normal(0, 1, n)

# train rvr
rvm = RVR(kernel='linear')
rvm.fit(X, y)
y_hat = rvm.predict(X)

# plot test vs predicted data
plt.figure()
plt.plot(X[:, 0], y, "b+", markersize=3, label="test data")
plt.plot(X[:, 0], y_hat, "rD", markersize=3, label="mean of predictive distribution")
plt.show()

returns:

ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required by check_pairwise_arrays.


First solution, with bias_used=False:

import matplotlib.pyplot as plt
import numpy as np
from skrvm import RVR

# parameters
n = 500

# generate data set
np.random.seed(0)
X = np.ones([n, 1])
X[:, 0] = np.linspace(-5, 5, n)
y = 10 * np.sinc(X[:, 0]) + np.random.normal(0, 1, n)

# train rvr
rvm = RVR(kernel='linear', bias_used=False)
rvm.fit(X, y)
y_hat = rvm.predict(X)

# plot test vs predicted data
plt.figure()
plt.plot(X[:, 0], y, "b+", markersize=3, label="test data")
plt.plot(X[:, 0], y_hat, "rD", markersize=3, label="mean of predictive distribution")
plt.show()

returns: [plot: the same fit, without the bias term]


Second solution, with a dummy feature dimension:

import matplotlib.pyplot as plt
import numpy as np
from skrvm import RVR

# parameters
n = 500

# generate data set
np.random.seed(0)
X = np.ones([n, 2]) # <--- I have added a dummy second feature dimension.
X[:, 0] = np.linspace(-5, 5, n)
y = 10 * np.sinc(X[:, 0]) + np.random.normal(0, 1, n)

# train rvr
rvm = RVR(kernel='linear') # <--- I have removed bias_used=False
rvm.fit(X, y)
y_hat = rvm.predict(X)

# plot test vs predicted data
plt.figure()
plt.plot(X[:, 0], y, "b+", markersize=3, label="test data")
plt.plot(X[:, 0], y_hat, "rD", markersize=3, label="mean of predictive distribution")
plt.show()

returns: [plot: the fit with the dummy feature dimension]

Notice the slight slope of the line, likely due to the bias.
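To quantify that slope, one quick (hypothetical) check is to fit a degree-1 polynomial to the predictions from the snippet above:

import numpy as np

# Assumes X and y_hat from the second solution above are still in scope.
slope, intercept = np.polyfit(X[:, 0], y_hat, deg=1)
print(slope, intercept)  # a small non-zero slope, consistent with a bias term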

woctezuma commented 5 years ago

The error arises because of this check in scikit-learn.

As mentioned in the documentation, arrays are expected to be at least 2-dimensional.

def check_pairwise_arrays(X, Y, precomputed=False, dtype=None):
    """ Set X and Y appropriately and checks inputs

    Specifically, this function first ensures that both X and Y are arrays,
    then checks that they are at least two dimensional while ensuring that
    their elements are floats (or dtype if provided). Finally, the function
    checks that the size of the second dimension of the two arrays is equal, or
    the equivalent check for a precomputed distance matrix.
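To see the check fail in isolation, here is a minimal sketch that reproduces the ValueError directly; the empty (0, 2) array stands in for the empty relevance set that skrvm ends up passing to the kernel in the traceback above:

import numpy as np
from sklearn.metrics.pairwise import linear_kernel

relevance = np.empty((0, 2))   # zero samples, matching the traceback
X = np.array([[1.0, 1.0]])

# Raises: ValueError: Found array with 0 sample(s) (shape=(0, 2)) ...
linear_kernel(X, relevance)
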
siavashserver commented 5 years ago

@woctezuma I ended up writing my own implementation for my master's thesis: neonrvm (shameless self-advertisement).

It's written in the C programming language, with Python bindings. To speed up the learning process, training data can be fed incrementally.

One major problem with these methods (SVM/RVM) is dealing with singular matrices during factorization. There are also hyperparameters with big search spaces to tune, where a change as small as 1e-3 can make a big difference in model performance.
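As an aside, a common workaround for the singular-matrix problem (a generic technique, not necessarily what neonrvm does) is to add a small jitter term to the diagonal before factorizing. A minimal sketch:

import numpy as np

def stable_cholesky(K, jitter=1e-6):
    # Add a small diagonal jitter so a near-singular (PSD) kernel matrix
    # stays positive definite; the 1e-6 default is illustrative only.
    return np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))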

While comparing RVMs against other machine learning methods, I found gradient-boosted decision trees (XGBoost, LightGBM, ...) to be more reliable and easier to use.
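For comparison, a minimal gradient-boosted-trees sketch on the same sinc data as the earlier examples (LightGBM here, but any GBDT library would do):

import lightgbm as lgb
import numpy as np

# Same toy data as in the plots above.
np.random.seed(0)
n = 500
X = np.linspace(-5, 5, n).reshape(-1, 1)
y = 10 * np.sinc(X[:, 0]) + np.random.normal(0, 1, n)

# Defaults work out of the box; no kernel or bias settings to worry about.
model = lgb.LGBMRegressor()
model.fit(X, y)
y_hat = model.predict(X)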

woctezuma commented 5 years ago

Thanks for the links!