Open sparkydogX opened 5 years ago
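When the GPU has a very large amount of memory, initializing the variable takes a long time. Consider splitting the variable into several smaller blocks to reduce the chance of something going wrong.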
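A minimal sketch of that chunking idea, assuming the same unit as the script in this thread (a 256 x 1024 float32 tensor is 1 MiB). The names `plan_chunks` and `occupy_in_chunks` and the 1024 MiB default chunk size are made up for illustration and are not part of the original gist.

```python
def plan_chunks(block_mem_mib, chunk_mib=1024):
    """Split a request of block_mem_mib MiB into chunks of at most chunk_mib MiB."""
    if block_mem_mib <= 0:
        return []
    full, rest = divmod(block_mem_mib, chunk_mib)
    return [chunk_mib] * full + ([rest] if rest else [])


def occupy_in_chunks(block_mem_mib, device=0, chunk_mib=1024):
    """Allocate the requested memory chunk by chunk; return the MiB actually reserved."""
    import torch  # imported lazily so plan_chunks can be used without a GPU

    chunks = []
    for size in plan_chunks(block_mem_mib, chunk_mib):
        try:
            # 256 * 1024 float32 values = 1 MiB, so this tensor takes `size` MiB.
            chunks.append(torch.empty((256, 1024, size), dtype=torch.float32,
                                      device=f"cuda:{device}"))
        except RuntimeError:
            # Out of memory: keep what has been reserved so far and stop.
            break
    reserved = sum(c.shape[2] for c in chunks)
    del chunks  # the blocks go back to PyTorch's caching allocator, not to the driver
    return reserved
```

With this layout a failed allocation only costs the last chunk rather than the whole request, which seems to be the point of the suggestion above.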
Where is the code?
@sgKevin1 Where is the code?
If the code does not display properly, you can go straight to https://gist.github.com/sparkydogX/845b658e3e6cef58a7bf706a9f43d7bf or use the following code:
import os
import time

import torch
from tqdm import tqdm

# Declare which GPU device to use
cuda_device = '0'


def check_mem(cuda_device):
    # nvidia-smi prints one "total, used" line (in MiB) per GPU
    devices_info = os.popen('"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split("\n")
    total, used = devices_info[int(cuda_device)].split(',')
    return total, used


def occumpy_mem(cuda_device):
    total, used = check_mem(cuda_device)
    total = int(total)
    used = int(used)
    # Aim for 90% of total memory, minus what is already in use
    max_mem = int(total * 0.9)
    block_mem = max_mem - used
    # 256 * 1024 float32 values = 1 MiB, so this tensor takes block_mem MiB
    x = torch.cuda.FloatTensor(256, 1024, block_mem)
    # After del, the block stays in PyTorch's cache, so the process keeps the memory
    del x


if __name__ == '__main__':
    os.environ["CUDA_VISIBLE_DEVICES"] = cuda_device
    occumpy_mem(cuda_device)
    for _ in tqdm(range(60)):
        time.sleep(1)
    print('Done')
Thanks!
Here is my modified version for multiple GPUs:
import argparse
import os
import time

import torch
from tqdm import tqdm


def check_mem(cuda_device):
    # nvidia-smi prints one "total, used" line (in MiB) per GPU
    devices_info = os.popen(
        '"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader'
    ).read().strip().split("\n")
    total, used = devices_info[cuda_device].split(',')
    return total, used


def occumpy_mem(cuda_device):
    total, used = check_mem(cuda_device)
    total = int(total)
    used = int(used)
    # Aim for 90% of total memory, minus what is already in use
    max_mem = int(total * 0.9)
    block_mem = max_mem - used
    # x = torch.cuda.FloatTensor(256, 1024, block_mem)
    # Note: this creates the tensor on the CPU first, then copies it to the chosen GPU
    x = torch.FloatTensor(256, 1024, block_mem).cuda(cuda_device)
    del x


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--device_ids', help='device_ids', type=int, nargs="+",
                        default=list(range(torch.cuda.device_count())))
    parser.add_argument('--time', help='occupy time (s)', type=int, default=1000000)
    args = parser.parse_args()
    # Reserve memory on every requested GPU, then hold it for the given time
    for cuda_device in args.device_ids:
        occumpy_mem(cuda_device)
    for _ in tqdm(range(args.time)):
        time.sleep(1)
    print('Done')
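As a possible alternative to parsing `nvidia-smi` output, recent PyTorch versions provide `torch.cuda.mem_get_info`, which returns the free and total memory of a device in bytes and uses PyTorch's own device numbering. The sketch below assumes that function is available in your PyTorch build; `mib_to_reserve` and `occupy_with_mem_get_info` are hypothetical names, and the 0.9 fraction simply mirrors the scripts in this thread.

```python
MIB = 1024 * 1024


def mib_to_reserve(free_bytes, total_bytes, fraction=0.9):
    """MiB to allocate so that overall usage reaches `fraction` of total memory."""
    used = total_bytes - free_bytes
    target = int(total_bytes * fraction)
    # Never return a negative size when the GPU is already past the target.
    return max(0, (target - used) // MIB)


def occupy_with_mem_get_info(device=0, fraction=0.9):
    """Reserve memory on `device`; return the number of MiB requested."""
    import torch  # imported lazily so mib_to_reserve can be used without a GPU

    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    n = mib_to_reserve(free_bytes, total_bytes, fraction)
    if n > 0:
        # 256 * 1024 float32 values = 1 MiB; allocate directly on the GPU.
        x = torch.empty((256, 1024, n), dtype=torch.float32, device=f"cuda:{device}")
        del x  # the block stays in PyTorch's caching allocator
    return n
```

The `max(0, ...)` guard covers the case where the card is already more than 90% full, in which the scripts in this thread would request a negative tensor dimension and fail.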
https://sparkydogx.github.io/2019/03/16/occupy-gpu-memory-in-advance/#more
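One difference between PyTorch and TensorFlow at run time is that TensorFlow automatically claims all available GPU memory as soon as the program starts, while PyTorch adjusts its GPU memory usage on the fly according to current needs. When several people share a GPU to train neural networks, the following often happens: a PyTorch program runs for a few epochs and then fails with an out of memory error, meaning it has been squeezed out by someone else's job. This post describes a rather hacky way to deal with that situation. First, it should be noted that this is aimed at special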