Closed amiltonwong closed 5 years ago
Hi, @WangYueFt ,
I can run DCP-v1 without any issues. However, when I run
python main.py --exp_name=dcp_v2 --model=dcp --emb_nn=dgcnn --pointer=transformer --head=svd
, I got the following GPU memory error:

File "/root/anaconda3/envs/pytorch1.0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
File "/data/code9/dcp/model.py", line 232, in forward
    dropout=self.dropout)
File "/data/code9/dcp/model.py", line 27, in attention
    scores = torch.matmul(query, key.transpose(-2, -1).contiguous()) / math.sqrt(d_k)
RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 11.91 GiB total capacity; 10.80 GiB already allocated; 5.44 MiB free; 64.97 MiB cached)
How much GPU memory is required to train dcp_v2? According to your paper, a GTX 1070 GPU (8 GB) was used, but my system has a Titan XP (12 GB).
THX!
Hi,
DCP-v2 was trained on two Tesla P100s, while the inference time was tested on a single GTX 1070. You can reduce the batch size, or use DCP-v1 on a single Titan XP.
Best, Yue
Thanks, Yue,
I changed it to --batch_size=8 (which consumes around 7.3 GB of GPU memory), and the training can proceed.
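For what it's worth, the 512 MiB allocation in the traceback is consistent with a single attention score tensor of shape (batch, heads, N, N). The numbers below (batch size 32, 4 heads, 1024 points per cloud, fp32) are my assumptions for illustration, not values confirmed from the repo:

```python
def attention_scores_bytes(batch_size, num_heads=4, num_points=1024, bytes_per_el=4):
    """Rough size of one (batch, heads, N, N) attention score tensor,
    i.e. the result of torch.matmul(query, key.transpose(-2, -1))."""
    return batch_size * num_heads * num_points * num_points * bytes_per_el

for bs in (32, 8):
    mib = attention_scores_bytes(bs) / 2**20
    print(f"batch_size={bs}: ~{mib:.0f} MiB per score tensor")
```

Under these assumptions, batch size 32 gives exactly 512 MiB per score tensor (matching the failed allocation), while batch size 8 gives 128 MiB, which is why shrinking the batch helps so much here.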