ctr4si / A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling

PyTorch Implementation of "A Hierarchical Latent Structure for Variational Conversation Modeling" (NAACL 2018 Oral)
MIT License
174 stars 45 forks

RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58 #4

Open cumthxy opened 5 years ago

cumthxy commented 5 years ago

How can I solve this error? I ran this command:

```
python train.py --data=ubuntu --model=VHRED --batch_size=40 --word_drop=0.25 --kl_annealing_iter=250000
```

My computer has 64M of memory, and the GPU reports:

```
NVIDIA-SMI 390.48                 Driver Version: 390.48
+-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   1  Tesla K40c           Off | 00000000:02:00.0 Off |                    0 |
| 23%   34C    P8    23W / 235W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```

cumthxy commented 5 years ago

And I have also reduced the batch size to 16, but the same problem remains:

```
53%|#####################9 | 131/245 [01:21<01:10, 1.61it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train.py", line 53, in <module>
    solver.train()
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/utils/time_track.py", line 18, in timed
    result = method(*args, **kwargs)
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/solver.py", line 536, in train
    self.validation_loss = self.evaluate()
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/solver.py", line 624, in evaluate
    target_sentence_length)
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/layers/loss.py", line 27, in masked_cross_entropy
    log_probs_flat = F.log_softmax(logits_flat, dim=1)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 944, in log_softmax
    return torch._C._nn.log_softmax(input, dim)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
```
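One thing worth noting from the traceback: the OOM is raised inside `evaluate()`, not the training step. If the repo's `evaluate()` runs the forward pass with autograd enabled, activations are kept alive for a backward pass that never happens. A minimal sketch of the usual fix (wrapping evaluation in `torch.no_grad()`; the tiny `nn.Linear` here is just a stand-in for the real model, not the repo's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the VHRED decoder: any module whose outputs feed log_softmax.
model = nn.Linear(16, 8)
inputs = torch.randn(4, 16)

# Without no_grad(), evaluation builds the autograd graph and retains
# intermediate activations, which can push a nearly-full GPU over the edge.
with torch.no_grad():
    log_probs = F.log_softmax(model(inputs), dim=1)

# The result carries no graph, so no extra memory is held for backward.
assert log_probs.requires_grad is False
```

If the repo's `solver.py` does not already do this in `evaluate()`, adding it there may let the original batch size fit.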

xiaoweiweixiao commented 3 years ago

The problem may be caused by `max_conversation_len` or `n_context`. I changed `max_conversation_len=30` to `max_conversation_len=20` and the problem was solved. Alternatively, you can run your program on a GPU with more memory than the 11441MiB shown in your nvidia-smi output.
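Before settling on a value for `max_conversation_len`, it can help to confirm how much memory the visible GPU actually has from inside PyTorch (a small sketch using the standard `torch.cuda` API; prints nothing on a CPU-only machine):

```python
import torch

# Query the current CUDA device's total memory, if one is available.
# On the K40c above this should report roughly the 11441 MiB nvidia-smi shows.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**2:.0f} MiB total")
else:
    print("No CUDA device visible")
```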