jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License

[BUG] Error when trying to fit DenseHMM model using GPU #1117

Closed paoloart closed 2 weeks ago

paoloart commented 2 months ago

I have a dataset consisting of a sequence of symbols indicating which neuron discharged in each bin of a given trial. I'm trying to run an HMM analysis to evaluate which hidden states characterize each time bin of each trial. The analysis works fine on the CPU with your package, but when I try to run it on the GPU it returns the following error:

Traceback (most recent call last):
  File "y:\Paolo\script utili\GPU_pomegranate\prova_hmm_gpu.py", line 29, in <module>
    model.fit([X1, X2])
  File "C:\Users\labrozzi\anaconda3\envs\hmm\Lib\site-packages\pomegranate\hmm\_base.py", line 630, in fit
    logp += self.summarize(X_, sample_weight=w_, priors=p_).sum()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\labrozzi\anaconda3\envs\hmm\Lib\site-packages\pomegranate\hmm\dense_hmm.py", line 608, in summarize
    X, emissions, sample_weight = super().summarize(X,
                                  ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\labrozzi\anaconda3\envs\hmm\Lib\site-packages\pomegranate\hmm\_base.py", line 707, in summarize
    emissions = _check_inputs(self, X, emissions, priors)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\labrozzi\anaconda3\envs\hmm\Lib\site-packages\pomegranate\hmm\_base.py", line 28, in _check_inputs
    emissions = model._emission_matrix(X, priors=priors)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\labrozzi\anaconda3\envs\hmm\Lib\site-packages\pomegranate\hmm\_base.py", line 298, in _emission_matrix
    logp = node.log_probability(X)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\labrozzi\anaconda3\envs\hmm\Lib\site-packages\pomegranate\distributions\categorical.py", line 192, in log_probability
    logps += self._log_probs[i][X[:, i]]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

This error seems strange because, as far as I can tell, everything is already loaded on the GPU. See the code example below.

I followed your guide to set up the script (shared below), and I believe all the tensors I pass to the model are actually on the GPU, contrary to what the error above suggests. I have torch 2.4.0 installed and CUDA toolkit build "cuda_11.8.r11.8/compiler.31833905_0".

Below is a simplified version of the script I'm using, with an example of all the inputs I give to the model:

import numpy as np
import torch
from pomegranate.distributions import Categorical
from pomegranate.hmm import DenseHMM

### Probabilities for two categorical distributions
prob1 = torch.tensor([[1.0/9] * 9], device='cuda')  # Uniform distribution for the first state on GPU
prob2 = torch.tensor([[0.05, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.25, 0.10]], device='cuda')  # Custom distribution for the second state on GPU

### Create categorical distributions on GPU
d1 = Categorical(prob1).cuda()
d2 = Categorical(prob2).cuda()

### Transition probabilities, initial and final probabilities on GPU
edges = torch.tensor([[0.89, 0.1], [0.1, 0.9]], device='cuda')
starts = torch.tensor([0.5, 0.5], device='cuda')
ends = torch.tensor([0.01, 0.0], device='cuda')

### Load the sequence of symbols and convert it to PyTorch tensors on GPU
X = np.load('X_fit_array.npy')
n = len(X) // 2 
X1 = torch.tensor(X[:n], dtype=torch.int64, device='cuda')
X2 = torch.tensor(X[n:], dtype=torch.int64, device='cuda')

### Create the model on GPU
model = DenseHMM([d1, d2], edges=edges, starts=starts, ends=ends, verbose=True).cuda()
print(X, X.shape)

### Train the model on GPU
model.fit([X1, X2])

The "X" dataset that I'm using instead is built like this: [[0] [0] [0] ... [6] [0] [0]]

with shape (21908, 1)
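
Since I can't attach X_fit_array.npy here, a hypothetical stand-in with the same shape and symbol range (purely for reproduction, not my actual recording) can be generated like this:

import numpy as np

# Hypothetical stand-in for X_fit_array.npy: 21908 bins, each containing one
# of 9 possible symbols (0-8), with the same (21908, 1) shape as the real data.
rng = np.random.default_rng(0)
np.save('X_fit_array.npy', rng.integers(low=0, high=9, size=(21908, 1)))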

What could be the problem here? Could the error be linked to the fact that I'm using Categorical to build the distributions?
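
In case it helps, this is the kind of check I'd run to confirm where everything actually lives; it assumes the model and distributions are torch.nn.Module subclasses (which is what I understand makes .cuda() available on them):

# Rough device check (assumes DenseHMM / Categorical expose the standard
# torch.nn.Module introspection methods).
for name, p in model.named_parameters():
    print('param ', name, p.device)
for name, b in model.named_buffers():
    print('buffer', name, b.device)
print('X1', X1.device, '| X2', X2.device)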

paoloart commented 1 month ago

Any suggestions?

jmschrei commented 2 weeks ago

Oops. I didn't create probs on the same device as the object, just on the CPU. https://github.com/jmschrei/pomegranate/blob/master/pomegranate/distributions/categorical.py#L185
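
For anyone stuck on an older version, the pattern was essentially an accumulator allocated with torch.zeros on the default (CPU) device and then combined with the CUDA-resident log probabilities. A minimal standalone sketch of the problem and the fix (not the exact library code) looks like this:

import torch

# Log probabilities and data live on the GPU, mirroring the report above.
log_probs = torch.log(torch.full((9,), 1.0 / 9, device='cuda'))
X = torch.zeros(5, 1, dtype=torch.int64, device='cuda')

# Buggy pattern: the accumulator defaults to the CPU, so adding the
# CUDA-resident lookup result raises the "same device" RuntimeError.
# logps = torch.zeros(X.shape[0])
# logps += log_probs[X[:, 0]]

# Fixed pattern: allocate the accumulator on the same device as the parameters.
logps = torch.zeros(X.shape[0], device=log_probs.device)
logps += log_probs[X[:, 0]]
print(logps)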

This should be fixed in v1.1.1, which you can get from pip now. Sorry for the delay. Please re-open if issues persist.