mperezcarrasco / Pytorch-VaDE

My implementation of Variational Deep Embedding (VaDE) using Pytorch

reduce failed to synchronize: cudaErrorAssert: device-side assert triggered #1

Open jbmaxwell opened 4 years ago

jbmaxwell commented 4 years ago

I'm seeing the above error trying to run the basic MNIST training. I've noticed the same thing with multiple PyTorch implementations of VaDE and I'm wondering if you've seen this problem and whether you might have any ideas about a fix? In all cases it's related to the binary_cross_entropy() loss.

mperezcarrasco commented 4 years ago

Hi,

Which PyTorch version are you using? When computing p_c_given_z, try a larger value for epsilon. A very small epsilon may cause precision problems that turn the output into NaN.

Best,

On 2020-05-02 13:20, jbmaxwell wrote:

I'm seeing the above error trying to run the basic MNIST training. I've noticed the same thing with multiple PyTorch implementations of VaDE and I'm wondering if you've seen this problem and whether you might have any ideas about a fix?

-- Manuel Pérez-Carrasco MSc. Computer Science Student, University of Concepción. Concepción, Chile
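For readers hitting the same NaN: an alternative to tuning epsilon is to normalise p_c_given_z in log space with the log-sum-exp trick. The sketch below is not the code from this repository. It is a minimal pure-Python illustration, and it assumes the inputs are the per-cluster values log p(c) + log p(z|c); the function name `responsibilities` is hypothetical.

```python
import math

def responsibilities(log_joint):
    """Normalise per-cluster log p(c) + log p(z|c) into p(c|z).

    Uses the log-sum-exp trick: subtracting the maximum before
    exponentiating keeps at least one term equal to exp(0) = 1,
    so the normaliser can never underflow to zero.
    """
    m = max(log_joint)
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint))
    return [math.exp(v - log_norm) for v in log_joint]

# Hypothetical log-joint values for three clusters. Exponentiating them
# directly underflows to 0.0 even in float64, so a naive
# "exp, then divide by the sum" would need an epsilon to avoid 0/0.
log_joint = [-800.0, -801.0, -805.0]
print(math.exp(log_joint[0]))       # underflows to 0.0
print(responsibilities(log_joint))  # still a valid distribution
```

The example runs in Python's double precision; float32 tensors underflow sooner, so the same concern applies there with less extreme values.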

jbmaxwell commented 4 years ago

Okay, thanks. Yes, it runs after increasing from 1e-9 to 1e-7 (1e-8 still hit the assert). I'm on PyTorch 1.4.0.
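One possible reason the threshold falls between 1e-8 and 1e-7 is float32 resolution: near 1.0, adjacent float32 values are about 1.19e-7 apart, so a smaller epsilon added to a value close to 1 is rounded away. This is only an illustration of that rounding effect, not a confirmed diagnosis of this issue; an epsilon added to a value near 0 does not suffer from it. The sketch uses the standard library to emulate float32 rounding, and the helper names are hypothetical.

```python
import struct

def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def survives_addition(eps, base=1.0):
    """Return True if base + eps is still distinguishable from base in float32."""
    return to_float32(base + eps) != to_float32(base)

for eps in (1e-9, 1e-8, 1e-7):
    print(eps, survives_addition(eps))
# 1e-09 False
# 1e-08 False
# 1e-07 True
```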