liguge closed this issue 4 days ago
How can I do a test run on Windows? I don't have a Linux system at the moment. The documented command is `python -m torch.distributed.launch --nproc_per_node=2 opengait/main.py --cfgs ./configs/baseline/baseline.yaml --phase train`. How should this command be changed to run with PyTorch on Windows, and how should I modify the code? This is what I have so far:

```python
import os
import torch.distributed as dist

# Set environment variables
os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '29565'
torch.distributed.init_process_group('gloo', init_method='env://', rank=0, world_size=2)
dist.destroy_process_group()
if torch.distributed.get_world_size() != torch.cuda.device_count():
    raise ValueError("Expect number of available GPUs({}) equals to the world size({}).".format(
        torch.cuda.device_count(), torch.distributed.get_world_size()))
cfgs = config_loader(opt.cfgs)
if opt.iter != 0:
    cfgs['evaluator_cfg']['restore_hint'] = int(opt.iter)
    cfgs['trainer_cfg']['restore_hint'] = int(opt.iter)
training = (opt.phase == 'train')
initialization(cfgs, training)
run_model(cfgs, training)
```
Hi, I removed all of the distributed-related code and it no longer raises an error (rank=0, world_size=1). Hope this helps.
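For readers who would rather keep the distributed code than delete it, the same rank=0, world_size=1 setup can be sketched as a one-process group on the `gloo` backend, which PyTorch supports on Windows (NCCL is not available there). The helper names `single_process_env` and `init_single_process` are hypothetical and not part of OpenGait. The port is the one from the question. This is a rough sketch, not a tested OpenGait patch.

```python
import os


def single_process_env(port: int = 29565) -> dict:
    """Variables that the env:// init method reads, set for a single process."""
    return {
        "MASTER_ADDR": "localhost",
        "MASTER_PORT": str(port),
        "RANK": "0",        # the only process is rank 0
        "WORLD_SIZE": "1",  # one process in total
    }


def init_single_process(port: int = 29565) -> None:
    """Create a one-process group on the gloo backend."""
    # Imported lazily so the helper above can be used without PyTorch installed.
    import torch.distributed as dist

    os.environ.update(single_process_env(port))
    if not dist.is_initialized():
        # rank and world_size are taken from RANK / WORLD_SIZE via env://
        dist.init_process_group(backend="gloo", init_method="env://")
```

With this approach `init_single_process()` would be called once, before the world-size check in `main.py`, and the group would be left alive until training finishes, because `dist.get_world_size()` fails once `dist.destroy_process_group()` has run. Since that check compares the world size with `torch.cuda.device_count()`, exactly one GPU would presumably need to be visible, for example by setting `CUDA_VISIBLE_DEVICES=0`.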
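> Hi, I removed all of the distributed-related code and it no longer raises an error (rank=0, world_size=1). Hope this helps.

Thanks for your reply. In the dataset code, what does the variable `self.cache` mean?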