dorienh / jesse

RuntimeError: CUDA error: device-side assert triggered #4

Closed: dorienh closed this issue 3 years ago

dorienh commented 3 years ago

On GPU with BCELoss (and also with focal loss):

criterion = torch.nn.BCELoss()

Training With 192 Features
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-12-20bc00643d0f> in <module>()
      5     optimizer=optimizer,
      6     criterion=criterion,
----> 7     checkpointdir=checkpointdir,
      8 )

6 frames
/content/drive/My Drive/herremans_data/src/wavenetmodel.py in train(self, epochs, dataloader, optimizer, criterion, checkpointdir)
     75     ):
     76         valid_loss_min = np.Inf
---> 77         self.model.to(self.device)
     78         self.model.float()
     79         for epoch in range(1, epochs + 1):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in to(self, *args, **kwargs)
    671             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    672 
--> 673         return self._apply(convert)
    674 
    675     def register_backward_hook(

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388 
    389         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388 
    389         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388 
    389         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    407                 # `with torch.no_grad():`
    408                 with torch.no_grad():
--> 409                     param_applied = fn(param)
    410                 should_use_set_data = compute_should_use_set_data(param, param_applied)
    411                 if should_use_set_data:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in convert(t)
    669                 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    670                             non_blocking, memory_format=convert_to_format)
--> 671             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    672 
    673         return self._apply(convert)

RuntimeError: CUDA error: device-side assert triggered
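
The thread does not pin down the root cause. For context, `BCELoss` expects predictions and targets in [0, 1]; on CPU a violation raises a readable error, while on CUDA it generally surfaces as a device-side assert. Because CUDA errors are reported asynchronously, the traceback can point at an unrelated later call such as `model.to(device)`. Below is a minimal sketch of a range check to run before the loss call. The helper `check_bce_inputs` and the tensors `logits` and `target` are hypothetical and do not come from `wavenetmodel.py`:

```python
import torch


def check_bce_inputs(pred: torch.Tensor, target: torch.Tensor) -> None:
    """Raise ValueError if pred or target falls outside BCELoss's [0, 1] range."""
    for name, t in (("pred", pred), ("target", target)):
        if not torch.isfinite(t).all():
            raise ValueError(f"{name} contains NaN or Inf")
        lo, hi = t.min().item(), t.max().item()
        if lo < 0.0 or hi > 1.0:
            raise ValueError(f"{name} outside [0, 1]: min={lo}, max={hi}")


criterion = torch.nn.BCELoss()

# Hypothetical batch: raw logits that were never passed through a sigmoid.
logits = torch.tensor([[2.5, -1.0], [0.3, 4.0]])
target = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

try:
    check_bce_inputs(logits, target)
except ValueError as err:
    print("caught before the loss call:", err)

# Squashing to (0, 1) first makes the same batch valid for BCELoss.
probs = torch.sigmoid(logits)
check_bce_inputs(probs, target)
loss = criterion(probs, target)
print("loss:", loss.item())
```

Running the failing step once on CPU, or setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before starting Python, usually makes the reported location match the operation that actually failed. After a device-side assert, the process typically has to be restarted (in Colab, a runtime restart) before CUDA calls work again.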

Luckygyana commented 3 years ago

I am looking into this error. Tonight I will try to fix it by running the code in Colab.

dorienh commented 3 years ago

Thanks!

Could you check some of the smaller items on the project board too? For example, that the loss adjustment works for all models.

On Fri, 21 May 2021 at 20:59, Gyanendra Das wrote:

I am looking for this error. Tonight I will solve it by uploading it into colab

Luckygyana commented 3 years ago

I checked in Colab with a sample dataset and it runs perfectly fine.

Open In Colab

Sample Data Used

Please check
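
The linked notebook is not reproduced in this thread. As a rough sketch of a comparable smoke test, the snippet below trains a few steps on synthetic data with `BCELoss`. The tiny model, the random data, and the hyperparameters are all assumptions for illustration; only the 192-feature input size is taken from the log above.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Use the GPU when one is present, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the real model: 192 input features, one sigmoid output.
model = nn.Sequential(
    nn.Linear(192, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
    nn.Sigmoid(),  # keeps predictions inside (0, 1), as BCELoss requires
)
model.to(device).float()

criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Synthetic sample data: 32 rows with binary labels stored as floats.
x = torch.randn(32, 192, device=device)
y = torch.randint(0, 2, (32, 1)).float().to(device)

losses = []
for step in range(5):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(device, losses)
```

If a loop like this runs cleanly on the GPU but the real training run does not, the difference is more likely in the data or labels than in the CUDA setup itself.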

dorienh commented 3 years ago

Dunno what you did, but CUDA is working for me now!

Luckygyana commented 3 years ago

Great :)