sparkydogX / sparkydogx_blog_comment

Comments for https://sparkydogx.github.io

Pre-allocating GPU memory when using PyTorch | SparkydogX Blog #122

Open sparkydogX opened 5 years ago

sparkydogX commented 5 years ago

https://sparkydogx.github.io/2019/03/16/occupy-gpu-memory-in-advance/#more

One difference between PyTorch and TensorFlow at runtime: TensorFlow automatically grabs all available GPU memory as soon as the program starts, while PyTorch adjusts its memory footprint on the fly according to current demand. When several people share a GPU for training neural networks, this situation comes up often: a PyTorch job runs for a number of epochs and then dies with an out-of-memory error, i.e. it has been crowded out by someone else's process. This post describes a fairly hacky way of dealing with that. To be clear up front, this is for special…
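
For context, a minimal sketch of the behavior described above (assuming a CUDA-capable machine and PyTorch ≥ 1.4, where torch.cuda.memory_reserved replaced the older memory_cached): tensors claim GPU memory on demand, and a deleted tensor's block stays in PyTorch's caching allocator rather than being returned to the driver, which is exactly the mechanism the hack below relies on:

import torch

# allocated on demand: 1024 * 1024 floats * 4 bytes ≈ 4 MiB
x = torch.zeros(1024, 1024, device='cuda:0')
print(torch.cuda.memory_allocated(0))  # bytes held by live tensors
del x
print(torch.cuda.memory_allocated(0))  # back to 0 ...
print(torch.cuda.memory_reserved(0))   # ... but the cache still holds the block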

sparkydogX commented 5 years ago

When the GPU has a very large amount of memory, initializing the placeholder tensor in one piece can take a long time. Consider splitting it into several smaller blocks to reduce the chance of something going wrong; a sketch of this idea follows.
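
A sketch of that chunked variant (occupy_mem_chunked is a hypothetical helper, not part of the gist below): instead of one giant tensor, claim the target amount in fixed-size pieces, so each individual allocation stays small and a failure partway through wastes less work:

import torch

def occupy_mem_chunked(block_mem_mib, chunk_mib=512):
    # 256 * 1024 floats * 4 bytes = 1 MiB, so (256, 1024, n) occupies n MiB
    chunks = []
    remaining = block_mem_mib
    while remaining > 0:
        size = min(chunk_mib, remaining)
        chunks.append(torch.cuda.FloatTensor(256, 1024, size))
        remaining -= size
    del chunks  # freed blocks remain reserved in PyTorch's cache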

sgKevin1 commented 5 years ago

Where's the code?

sparkydogX commented 5 years ago

@sgKevin1 Where's the code?

If the code doesn't display properly, you can go directly to https://gist.github.com/sparkydogX/845b658e3e6cef58a7bf706a9f43d7bf or just use the following code:

import os
import torch
from tqdm import tqdm
import time

# declare which GPU device to use
cuda_device = '0'

def check_mem(cuda_device):
    # ask nvidia-smi for total and used memory (in MiB) on every GPU
    devices_info = os.popen('"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split("\n")
    total, used = devices_info[int(cuda_device)].split(',')
    return total, used

def occumpy_mem(cuda_device):
    total, used = check_mem(cuda_device)
    total = int(total)
    used = int(used)
    max_mem = int(total * 0.9)  # aim to hold 90% of the card
    block_mem = max_mem - used  # MiB still to claim
    # 256 * 1024 floats * 4 bytes = 1 MiB, so this tensor occupies block_mem MiB
    x = torch.cuda.FloatTensor(256, 1024, block_mem)
    del x  # the freed block stays reserved in PyTorch's caching allocator

if __name__ == '__main__':
    os.environ["CUDA_VISIBLE_DEVICES"] = cuda_device
    occumpy_mem(cuda_device)
    for _ in tqdm(range(60)):
        time.sleep(1)
    print('Done')
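
A hedged usage sketch (assuming the gist above is saved as occupy_mem.py, a hypothetical filename): call occumpy_mem at the start of your own training script; since the probe tensor is deleted but its block stays in PyTorch's cache, your later allocations draw from the reserved pool instead of competing with other users:

import os
from occupy_mem import occumpy_mem  # hypothetical module name

cuda_device = '0'
os.environ["CUDA_VISIBLE_DEVICES"] = cuda_device
occumpy_mem(cuda_device)  # reserve ~90% of the card up front

# ... build the model and train as usual; allocations reuse the cached block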
sgKevin1 commented 5 years ago

Thanks!


Hzzone commented 4 years ago

Here is my modified multi-GPU version:

import os
import torch
from tqdm import tqdm
import time

def check_mem(cuda_device):
    # ask nvidia-smi for total and used memory (in MiB) on every GPU
    devices_info = os.popen(
        '"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split(
        "\n")
    total, used = devices_info[cuda_device].split(',')
    return total, used

def occumpy_mem(cuda_device):
    total, used = check_mem(cuda_device)
    total = int(total)
    used = int(used)
    max_mem = int(total * 0.9)  # aim to hold 90% of the card
    block_mem = max_mem - used  # MiB still to claim
    # x = torch.cuda.FloatTensor(256, 1024, block_mem)
    # allocate on the given device explicitly instead of the current default one
    x = torch.FloatTensor(256, 1024, block_mem).cuda(cuda_device)
    del x  # the freed block stays reserved in PyTorch's caching allocator

if __name__ == '__main__':
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--device_ids', help='device_ids', type=int, nargs="+",
                        default=list(range(torch.cuda.device_count())))
    parser.add_argument('--time', help='occupy time (s)', type=int, default=1000000)
    args = parser.parse_args()
    for cuda_device in args.device_ids:
        occumpy_mem(cuda_device)
    for _ in tqdm(range(args.time)):
        time.sleep(1)
    print('Done')
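
If this script were saved as, say, occupy_multi.py (a hypothetical filename), then python occupy_multi.py --device_ids 0 1 --time 3600 would hold GPUs 0 and 1 for an hour; with no arguments it defaults to all visible GPUs for 1,000,000 seconds (roughly 11.5 days).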