AlexanderFabisch / gmr

Gaussian Mixture Regression
https://alexanderfabisch.github.io/gmr/
BSD 3-Clause "New" or "Revised" License

AttributeError: 'list' object has no attribute 'shape' #34

Closed raul-parada closed 3 years ago

raul-parada commented 3 years ago

I'm trying to predict the next position in a track, using latitude and longitude as both the input features and the target. I've tried the following:

pred = gmm.predict(len(X)+i, np.array([X[(num-1)+i]]))

where the first argument is 10 and the second is array([[41.4051453, 2.1776344]]), with shape (1, 2).

However, I get this error:

AttributeError: 'list' object has no attribute 'shape'

What am I doing wrong?

AlexanderFabisch commented 3 years ago

This is not the error that I would expect. The first argument should be a list or an array though. I'll look into it.

AlexanderFabisch commented 3 years ago

I suspect that gmm.means is a list. Could you post a full example to reproduce this bug?

raul-parada commented 3 years ago

Please, find here a full example to reproduce:

import numpy as np
from gmr import GMM

X=np.array([[41.4049364,  2.177356 ],
       [41.4049656,  2.1773926],
       [41.4049938,  2.1774287],
       [41.4050204,  2.1774638],
       [41.4050453,  2.1774975],
       [41.4050682,  2.1775296],
       [41.4050895,  2.1775597],
       [41.4051093,  2.1775874],
       [41.4051278,  2.1776125],
       [41.4051453,  2.1776344]])

gmm = GMM(n_components=3, random_state=0)
gmm.from_samples(X)
X_sampled = gmm.sample(100)
pred = gmm.predict(len(X), np.array([X[(len(X)-1)]]))

AlexanderFabisch commented 3 years ago

pred = gmm.predict(len(X), np.array([X[(len(X)-1)]]))

What do you want to do here exactly? GMM.predict takes these two arguments:

        indices : array-like, shape (n_features_1,)
            Indices of dimensions that we want to condition.

        X : array-like, shape (n_samples, n_features_1)
            Values of the features that we know.

Suppose X[:, 0] are all values of the input feature from your training set and X[:, 1] are all values of the output from your training set. Then indices should be np.array([0]). The argument X should have the shape (1, 1) in this case, not (1, 2).

AlexanderFabisch commented 3 years ago

Example: y = gmm.predict(np.array([0]), np.array([[X[0, 0]]]))

AlexanderFabisch commented 3 years ago

I wrote down the mathematical background on page two here: https://github.com/openjournals/joss-papers/blob/joss.03054/joss.03054/10.21105.joss.03054.pdf
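For intuition, the conditioning step can be sketched for a single Gaussian component using NumPy only. This is not gmr's implementation: the helper name `condition_gaussian` and the toy mean and covariance are assumptions made up for illustration. A mixture would repeat this step for every component and combine the results with input-dependent weights.

```python
import numpy as np


def condition_gaussian(mean, cov, indices, x):
    """Condition a joint Gaussian on the dimensions listed in `indices`.

    Hypothetical helper for illustration only; it is not part of gmr.
    Returns the mean and covariance of the remaining dimensions.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    i_in = np.asarray(indices, dtype=int)
    # Every dimension that is not conditioned on is treated as an output.
    i_out = np.setdiff1d(np.arange(mean.shape[0]), i_in)

    cov_in = cov[np.ix_(i_in, i_in)]
    cov_out_in = cov[np.ix_(i_out, i_in)]
    cov_out = cov[np.ix_(i_out, i_out)]

    # gain = cov_out_in @ inv(cov_in), computed without an explicit inverse.
    gain = np.linalg.solve(cov_in, cov_out_in.T).T
    cond_mean = mean[i_out] + gain @ (np.asarray(x, dtype=float) - mean[i_in])
    cond_cov = cov_out - gain @ cov_out_in.T
    return cond_mean, cond_cov


# Toy joint distribution over (input, output); the numbers are arbitrary.
mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])

# Condition on dimension 0 taking the value 2.0.
cond_mean, cond_cov = condition_gaussian(mean, cov, indices=[0], x=[2.0])
```

Note that `indices` names feature dimensions (columns of the joint distribution), which mirrors the meaning of the first argument of `predict` quoted above.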

raul-parada commented 3 years ago

I want to predict the next position from X based on the latest sample of X. Should I call gmm.predict twice, once for latitude and once for longitude?

raul-parada commented 3 years ago

I've tried this:

pred = gmm.predict(len(X), np.array([X[(len(X)-1),0]]))
pred2 = gmm.predict(len(X), np.array([X[(len(X)-1),1]]))

However, I get this error:


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-53-205b0ad5e44e> in <module>
     17 
     18 X_sampled = gmm.sample(100)
---> 19 pred = gmm.predict(len(X), np.array([X[(len(X)-1),0]]))
     20 pred2 = gmm.predict(len(X), np.array([X[(len(X)-1),1]]))
     21 

~\anaconda3\lib\site-packages\gmr\gmm.py in predict(self, indices, X)
    415         self._check_initialized()
    416 
--> 417         n_samples, n_features_1 = X.shape
    418         n_features_2 = self.means.shape[1] - n_features_1
    419         Y = np.empty((n_samples, n_features_2))

ValueError: not enough values to unpack (expected 2, got 1)

AlexanderFabisch commented 3 years ago

So I guess the previous sample is X[i] and the next one is X[i+1]. Then you are not using Gaussian mixture regression in the way that you want.

Think about how you would do this in a normal regression approach. You would use X[i] as an input and X[i+1] as an output. sklearn, for example, has two separate variables for that in its fit functions. A GMM is trained differently. We concatenate inputs and outputs along the 2nd axis to derive a joint distribution of inputs and outputs. In your case: X = np.hstack((X[:-1], X[1:])).

Next you have to condition the joint distribution on the input features (indices 0 and 1). The argument indices must be a list indicating the input features, not the sample index. An example call to predict: gmm.predict(np.array([0, 1]), X[0, :2])

edit: btw. it looks like you are trying to do extrapolation. I'd assume that with nonlinear dynamics that wouldn't be possible at all.

edit2: here is a similar example that doesn't predict the next sample, but the difference between the current and the next sample (set sampling_dt=1): https://github.com/AlexanderFabisch/gmr/blob/master/examples/plot_time_invariant_trajectories.py
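The concatenation of inputs and outputs described above can be sketched with NumPy alone. The first four positions from this thread are reused; the variable names `state_tuples`, `state_deltas`, and `input_indices` are only illustrative. The second array shows the variant that stores the difference to the next sample instead of the next sample itself, assuming a unit time step.

```python
import numpy as np

states = np.array([[41.4049364, 2.177356],
                   [41.4049656, 2.1773926],
                   [41.4049938, 2.1774287],
                   [41.4050204, 2.1774638]])

# Each row is [lat_t, lon_t, lat_{t+1}, lon_{t+1}]:
# current state (input) followed by next state (output).
state_tuples = np.hstack((states[:-1], states[1:]))

# Variant: each row is [lat_t, lon_t, dlat, dlon], where the output is
# the difference between the next and the current state.
state_deltas = np.hstack((states[:-1], np.diff(states, axis=0)))

# Columns 0 and 1 hold the input features in both layouts, so these are
# the feature indices to condition on (not sample indices).
input_indices = np.array([0, 1])

print(state_tuples.shape)  # (n_samples - 1, 4)
```

With N recorded positions this yields N - 1 training rows, because the last position has no successor.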

raul-parada commented 3 years ago

About this example you've put:

gmm.predict(np.array([0, 1]), X[0, :2])

That's exactly what I wanted. I get this error:

ValueError: not enough values to unpack (expected 2, got 1)

AlexanderFabisch commented 3 years ago

That's because the predict function expects an array of shape (n_samples, n_features), but the shape is (n_features,). This version works for me:

    import numpy as np
    from numpy.testing import assert_array_almost_equal
    from gmr import GMM

    states = np.array([[41.4049364, 2.177356],
                       [41.4049656, 2.1773926],
                       [41.4049938, 2.1774287],
                       [41.4050204, 2.1774638],
                       [41.4050453, 2.1774975],
                       [41.4050682, 2.1775296],
                       [41.4050895, 2.1775597],
                       [41.4051093, 2.1775874],
                       [41.4051278, 2.1776125],
                       [41.4051453, 2.1776344]])
    state_tuples = np.hstack((states[:-1], states[1:]))

    gmm = GMM(n_components=3, random_state=0)
    gmm.from_samples(state_tuples)
    next_state = gmm.predict(np.array([0, 1]), [states[-1]])
    assert_array_almost_equal(next_state, [[41.405162, 2.177657]])

This works on the current master branch; with older versions you have to cast all lists to numpy arrays.
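The shape mismatch behind the "not enough values to unpack" error can be reproduced without gmr. The sketch below uses NumPy only and shows a few equivalent ways to turn a single sample into a 2-D array of shape (n_samples, n_features); the variable names are illustrative.

```python
import numpy as np

states = np.array([[41.4051278, 2.1776125],
                   [41.4051453, 2.1776344]])

last_1d = states[-1]                   # shape (2,): one axis only
last_2d_a = states[-1:]                # shape (1, 2): slicing keeps the row axis
last_2d_b = states[-1][np.newaxis, :]  # shape (1, 2): explicit new axis
last_2d_c = np.atleast_2d(states[-1])  # shape (1, 2)
last_2d_d = np.asarray([states[-1]])   # shape (1, 2): list cast to an array

# Unpacking a 1-D shape into two names fails in the same way as the
# traceback posted earlier in this thread.
try:
    n_samples, n_features = last_1d.shape
except ValueError as exc:
    unpack_error = str(exc)

n_samples, n_features = last_2d_a.shape  # works: 1 sample, 2 features
```

Casting with `np.asarray` also avoids passing a plain Python list, which has no `shape` attribute.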

raul-parada commented 3 years ago

It works!