idekazuki / diary


Apex environment setup #98

Open idekazuki opened 4 years ago

idekazuki commented 4 years ago

PyTorch was built against CUDA 10.1, but `CUDA_HOME` still points at CUDA 8.0, so `nvcc` reports 8.0.44:

```
>>> torch.version.cuda
'10.1'
>>> CUDA_HOME
'/usr/local/cuda-8.0'
>>> import subprocess
>>> subprocess.check_output([CUDA_HOME + '/bin/nvcc', '-V'], universal_newlines=True)
'nvcc: NVIDIA (R) Cuda compiler driver\nCopyright (c) 2005-2016 NVIDIA Corporation\nBuilt on Sun_Sep__4_22:14:01_CDT_2016\nCuda compilation tools, release 8.0, V8.0.44\n'
```

idekazuki commented 4 years ago

https://www.ibm.com/support/knowledgecenter/SS5SF7_1.6.1/navigation/wmlce_getstarted_apex.html

Following the site above, I set up the environment on miniconda. Beforehand, I added `export CUDA_HOME="/usr/local/cuda-9.0"` to `.bashrc` to raise the CUDA version to 9 or higher, as Apex requires.
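Before building, it is worth confirming that the nvcc under `CUDA_HOME` now matches the CUDA version PyTorch was built against; the mismatch seen in the first comment is the usual cause of Apex build failures. A minimal sketch, assuming only the standard library and torch:

```python
import os
import subprocess

import torch

# Read the same CUDA_HOME that a build of Apex would use.
cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
nvcc_output = subprocess.check_output(
    [os.path.join(cuda_home, "bin", "nvcc"), "-V"], universal_newlines=True
)
print("torch built with CUDA:", torch.version.cuda)
print(nvcc_output.splitlines()[-1])  # e.g. "Cuda compilation tools, release 9.0, ..."
```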

idekazuki commented 4 years ago

The package is not `conda install apex`; `conda install nvidia-apex` is what set it up.
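A quick import check confirms the package landed in the conda environment (a minimal sketch; `apex.amp` is the mixed-precision module used in the runs below):

```python
# Verify the nvidia-apex conda package is importable.
import apex
from apex import amp  # the mixed-precision API

print(apex.__file__)  # path inside the conda env
print(amp)
```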

As a test, I trained the DCGAN example with `--opt_level O1`:

```
[24/25][780/782] Loss_D: 0.0161 Loss_G: 5.2606 D(x): 6.3242 D(G(z)): -4.8828 / -5.2500
[24/25][781/782] Loss_D: 0.0085 Loss_G: 7.6221 D(x): 5.2227 D(G(z)): -7.6641 / -7.6211
170500096it [17:06, 166149.96it/s]
```

--opt_level O0

idekazuki commented 4 years ago

`python dcgan.py --batch_size 256 --ngpu 4`

```
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
```
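The `amp_C` module named in the warning is the compiled extension that only exists when Apex is built with `--cpp_ext --cuda_ext`; without it, amp uses a slower pure-Python fallback. A minimal check sketch, assuming nothing beyond the module name reported in the log:

```python
# If this import fails, amp uses the pure-Python fallback,
# exactly as the warning above reports.
try:
    import amp_C  # noqa: F401 -- compiled extension from a --cuda_ext build
    print("fused amp kernels available")
except ImportError:
    print("amp is using the Python fallback")
```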

idekazuki commented 4 years ago

Run time with O1: 06:06.48

idekazuki commented 4 years ago

Run time with O0: 06:32.65

idekazuki commented 4 years ago
Log from the start of the run:
```
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Traceback (most recent call last):
  File "apex_run.py", line 150, in <module>
    net, optimizer = amp.initialize(net, optimizer, opt_level='O1')
  File "/home/yanai-lab/ide-k/ide-k/pyenv/apex/lib/python3.6/site-packages/apex/amp/frontend.py", line 358, in initialize
    return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
  File "/home/yanai-lab/ide-k/ide-k/pyenv/apex/lib/python3.6/site-packages/apex/amp/_initialize.py", line 167, in _initialize
    check_models(models)
  File "/home/yanai-lab/ide-k/ide-k/pyenv/apex/lib/python3.6/site-packages/apex/amp/_initialize.py", line 74, in check_models
    "Parallel wrappers should only be applied to the model(s) AFTER \n"
RuntimeError: Incoming model is an instance of torch.nn.parallel.DataParallel. Parallel wrappers should only be applied to the model(s) AFTER 
the model(s) have been returned from amp.initialize.
```
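
The fix implied by the error is to swap the order of the two calls: run `amp.initialize` on the bare model first, then apply the `DataParallel` wrapper. A minimal sketch, where the `Linear` model and SGD optimizer are toy stand-ins for the nets in `apex_run.py`:

```python
import torch
from apex import amp

# Toy stand-ins for the real model and optimizer in apex_run.py.
net = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

# 1) Initialize amp on the unwrapped model...
net, optimizer = amp.initialize(net, optimizer, opt_level="O1")
# 2) ...then apply the parallel wrapper, as the error message requires.
net = torch.nn.DataParallel(net)
```
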
idekazuki commented 4 years ago

Timing comparison:

| opt_level | time |
| --- | --- |
| O2 | 412.0059335231781 |
| O1 | 479.77091670036316 |
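
For reference, a sketch of the wall-clock measurement that would print results in this `Ox:time:...` format (the `train()` function is a hypothetical stand-in for the script's training loop):

```python
import time

def train():
    """Hypothetical stand-in for the DCGAN training loop."""

start = time.time()
train()
print(f"O2:time:{time.time() - start}")
```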