ctr4si / A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling

PyTorch Implementation of "A Hierarchical Latent Structure for Variational Conversation Modeling" (NAACL 2018 Oral)
MIT License
174 stars 45 forks

RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58 #4

Open cumthxy opened 5 years ago

cumthxy commented 5 years ago

How can I solve this error? I ran this command:

```
python train.py --data=ubuntu --model=VHRED --batch_size=40 --word_drop=0.25 --kl_annealing_iter=250000
```

My computer has 64M of memory, and the GPU reports:

```
NVIDIA-SMI 390.48                 Driver Version: 390.48
+-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   1  Tesla K40c           Off | 00000000:02:00.0 Off |                    0 |
| 23%   34C    P8    23W / 235W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```

cumthxy commented 5 years ago

And I have also reduced the batch size to 16, but the same problem remains:

```
53%|#####################9 | 131/245 [01:21<01:10, 1.61it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train.py", line 53, in <module>
    solver.train()
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/utils/time_track.py", line 18, in timed
    result = method(*args, **kwargs)
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/solver.py", line 536, in train
    self.validation_loss = self.evaluate()
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/solver.py", line 624, in evaluate
    target_sentence_length)
  File "/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling/model/layers/loss.py", line 27, in masked_cross_entropy
    log_probs_flat = F.log_softmax(logits_flat, dim=1)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 944, in log_softmax
    return torch._C._nn.log_softmax(input, dim)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
```
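One thing worth noting from the traceback: the OOM is raised inside `evaluate()`, not the training step. If the repo's `evaluate()` runs the forward pass with autograd enabled, activations are kept alive for a backward pass that never happens. A minimal sketch of the usual fix (wrapping evaluation in `torch.no_grad()`; the tiny `nn.Linear` here is just a stand-in for the real model, not the repo's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the VHRED decoder: any module whose outputs feed log_softmax.
model = nn.Linear(16, 8)
inputs = torch.randn(4, 16)

# Without no_grad(), evaluation builds the autograd graph and retains
# intermediate activations, which can push a nearly-full GPU over the edge.
with torch.no_grad():
    log_probs = F.log_softmax(model(inputs), dim=1)

# The result carries no graph, so no extra memory is held for backward.
assert log_probs.requires_grad is False
```

If the repo's `solver.py` does not already do this in `evaluate()`, adding it there may let the original batch size fit.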

xiaoweiweixiao commented 3 years ago

The problem may be caused by `max_conversation_len` or `n_context`. I changed `max_conversation_len=30` to `max_conversation_len=20` and the problem was solved. Alternatively, you can run your program on a GPU with more memory than the 11441MiB shown in your nvidia-smi output.
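Before settling on a value for `max_conversation_len`, it can help to confirm how much memory the visible GPU actually has from inside PyTorch (a small sketch using the standard `torch.cuda` API; prints nothing on a CPU-only machine):

```python
import torch

# Query the current CUDA device's total memory, if one is available.
# On the K40c above this should report roughly the 11441 MiB nvidia-smi shows.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**2:.0f} MiB total")
else:
    print("No CUDA device visible")
```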