Closed (salpers closed this 4 months ago)
I experimented with changing the data; however, the issue is also reproducible with small random data:
import numpy as np
from pomegranate.markov_chain import MarkovChain
np.random.seed(137)
seq_data = np.random.randint(0, 10, (1,10,1))
model = MarkovChain(k = 1)
model.fit(seq_data)
throws
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[99], line 5
2 seq_data = np.random.randint(0, 10, (1,6,1))
4 model = MarkovChain(k = 1)
----> 5 model.fit(seq_data)
File /opt/conda/lib/python3.10/site-packages/pomegranate/markov_chain.py:216, in MarkovChain.fit(self, X, sample_weight)
193 def fit(self, X, sample_weight=None):
194 """Fit the model to optionally weighted examples.
195
196 This method will fit the provided distributions given the data and
(...)
213 self
214 """
--> 216 self.summarize(X, sample_weight=sample_weight)
217 self.from_summaries()
218 return self
File /opt/conda/lib/python3.10/site-packages/pomegranate/markov_chain.py:276, in MarkovChain.summarize(self, X, sample_weight)
274 for i in range(X.shape[1] - self.k):
275 j = i + self.k + 1
--> 276 distribution.summarize(X[:, i:j], sample_weight=sample_weight)
File /opt/conda/lib/python3.10/site-packages/pomegranate/distributions/conditional_categorical.py:168, in ConditionalCategorical.summarize(self, X, sample_weight)
165 strides = torch.tensor(self._xw_sum[j].stride(), device=X.device)
166 X_ = torch.sum(X[:, :, j] * strides, dim=-1)
--> 168 self._xw_sum[j].view(-1).scatter_add_(0, X_, sample_weight[:,j])
169 self._w_sum[j][:] = self._xw_sum[j].sum(dim=-1)
RuntimeError: index 42 is out of bounds for dimension 0 with size 28
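For readers following the traceback: lines 165 to 168 of conditional_categorical.py turn each window of labels into a single flat index (labels multiplied by the count table's strides, then summed) and scatter into the flattened table. The sketch below is only an illustration of how that kind of indexing can go out of bounds when a label is larger than the table was sized for. It is not a diagnosis of the actual pomegranate bug, and `flat_index` and `first_out_of_bounds` are hypothetical names, not library functions.

```python
def flat_index(prev, cur, n_categories):
    # Row-major strides of an (n_categories, n_categories) table are
    # (n_categories, 1), so the flat position is prev * n_categories + cur.
    return prev * n_categories + cur


def first_out_of_bounds(sequence, n_categories):
    """Return (index, table_size) for the first transition that does not
    fit into a flattened first-order count table, or None if all fit."""
    size = n_categories * n_categories
    for prev, cur in zip(sequence, sequence[1:]):
        idx = flat_index(prev, cur, n_categories)
        if idx >= size:
            return idx, size
    return None


# Table sized for labels 0..4, but the label 5 appears in the data.
print(first_out_of_bounds([1, 5, 3], n_categories=5))
# A sequence whose labels all fit.
print(first_out_of_bounds([0, 4, 2], n_categories=5))
```

Note that in the first sequence the transition (1, 5) still lands inside the table, just in the wrong cell; only (5, 3) overflows. An undersized table can therefore miscount silently before it raises.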
Hi,
I got the same error. Have you been able to fix it in the meantime? Does anyone else have a suggestion?
I would really appreciate any help on this.
Thank you!
This should be fixed in v1.0.4. Please let me know if you encounter any other issues. In the future, if you run into challenges you can pass in n_categories to the MarkovChain or make the list of distributions (one Categorical and then a series of k ConditionalCategorical objects) yourself.
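A minimal sketch of the first suggestion, for anyone landing here later. The reply above does not spell out the expected format of n_categories; this assumes it takes one entry per feature, so please check that against the version you have installed. `categories_per_feature` and `fit_markov_chain` are hypothetical helper names, not part of pomegranate.

```python
def categories_per_feature(X):
    """Return (largest label + 1) for each feature of X.

    X is indexed as X[sequence][position][feature] and holds
    non-negative integer labels (nested lists or an array both work).
    """
    n_features = len(X[0][0])
    counts = [0] * n_features
    for sequence in X:
        for step in sequence:
            for f, label in enumerate(step):
                counts[f] = max(counts[f], int(label) + 1)
    return counts


def fit_markov_chain(X, k=1):
    """Fit a MarkovChain with explicitly supplied category counts."""
    # Imported lazily so the helper above works without pomegranate installed.
    from pomegranate.markov_chain import MarkovChain

    # Assumption: n_categories is a list with one entry per feature.
    model = MarkovChain(k=k, n_categories=categories_per_feature(X))
    model.fit(X)
    return model


# Usage, with X as the array or tensor you would normally pass to fit:
#   model = fit_markov_chain(seq_data, k=1)
```

If the full label set is known in advance (for example from the label encoder), passing those sizes directly is probably safer than deriving them from the training data, since a label that only shows up later would otherwise not fit.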
Hey there,
I am trying to fit a Markov chain model on multivariate, categorical sequential data. After label encoding my sequences to integers, I pad them with 0 so they all have the same length. The resulting tensor has shape (932, 132, 3): 932 observations of length 132 (zero-padded), with 3 features per element.
However, I get an index-out-of-bounds error when I try to fit the model.
I would appreciate it if you could help me with the issue or point out any mistakes in my approach.
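For reference, the preprocessing described above can be sketched roughly as follows. This is not the original poster's code. It assumes each sequence is a list of per-step feature tuples, and it starts label codes at 1 so that the padding value 0 never collides with a real category (the post does not say whether that was the case). `encode_and_pad` is a hypothetical helper name.

```python
def encode_and_pad(sequences, pad_value=0):
    """Label-encode each feature and right-pad all sequences to equal length.

    Returns the encoded data as nested lists indexed
    [sequence][position][feature], plus one vocabulary dict per feature.
    """
    n_features = len(sequences[0][0])
    # One vocabulary per feature; codes start at 1 so pad_value=0 stays unambiguous.
    vocabularies = [{} for _ in range(n_features)]
    max_len = max(len(seq) for seq in sequences)

    encoded = []
    for seq in sequences:
        rows = []
        for step in seq:
            row = []
            for f, value in enumerate(step):
                vocab = vocabularies[f]
                if value not in vocab:
                    vocab[value] = len(vocab) + 1
                row.append(vocab[value])
            rows.append(row)
        # Right-pad with all-pad rows up to the longest sequence.
        rows.extend([pad_value] * n_features for _ in range(max_len - len(rows)))
        encoded.append(rows)
    return encoded, vocabularies


raw = [[("a", "x"), ("b", "y")], [("b", "x")]]
encoded, vocabularies = encode_and_pad(raw)
print(encoded)
```

With this encoding, the number of categories for feature f is len(vocabularies[f]) + 1, counting the padding value as its own category.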