Closed fendiwira closed 5 years ago
Thanks for reporting the issue, @fendiwira. We'll take a look.
@parallelo / @sunway513 it seems quite a few recent issues raised are based on the gfx803 ISA.
Yep, was just looking at that too. At least two other recent gfx803 mem fault issues, right?
https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/issues/282 https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/issues/300
@parallelo and mine #297 (the neglected one) :)
The issue has been identified as a regression in the ROCm 2.0 user bits, affecting only Polaris; I will post further updates here.
For future users who hit similar Memory access fault errors, just wanted to mention the typical triage process for this type of error.
This error typically occurs with an out of bounds memory access on the GPU. The first step is to serialize all GPU kernels & copies, then dump out the kernel names that are launching.
export HCC_SERIALIZE_KERNEL=0x3
export HCC_SERIALIZE_COPY=0x3
export HIP_TRACE_API=0x2
[then re-run your application]
Often (but not always) the last printed kernel will be the one to further investigate -- it might point to a numerical library or something else that can potentially be triaged with a smaller test case.
More tips are listed here: https://rocm-documentation.readthedocs.io/en/latest/Other_Solutions/Other-Solutions.html
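The triage steps above can be sketched as a small Python helper. It is a minimal sketch, not an official tool: it assumes the trace lines look like the `<<hip-api ... hipLaunchKernel '<name>' ...` lines posted later in this thread, and the names `TRIAGE_ENV`, `last_kernel_before_fault` and `run_with_trace` are made up for illustration.

```python
import os
import re
import subprocess

# Environment variables quoted in the triage steps above.
TRIAGE_ENV = {
    "HCC_SERIALIZE_KERNEL": "0x3",
    "HCC_SERIALIZE_COPY": "0x3",
    "HIP_TRACE_API": "0x2",
}

# Assumed trace format: ... hipLaunchKernel '<kernel name>' gridDim:{...} ...
LAUNCH_RE = re.compile(r"hipLaunchKernel '([^']+)'")
FAULT_MARKER = "Memory access fault"


def last_kernel_before_fault(log_text):
    """Return the last kernel name launched before the fault line.

    Returns None if the log has no fault line or no launch before it.
    """
    last = None
    for line in log_text.splitlines():
        if FAULT_MARKER in line:
            return last
        match = LAUNCH_RE.search(line)
        if match:
            last = match.group(1)
    return None


def run_with_trace(cmd):
    """Run cmd (a list of strings) with the triage variables set.

    Hypothetical helper: returns stdout and stderr merged into one string,
    since the trace and the fault message may land on different streams.
    """
    env = dict(os.environ, **TRIAGE_ENV)
    result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout
```

For example, `last_kernel_before_fault(run_with_trace(["python3", "train.py"]))` would report the kernel to look at first, where `train.py` stands in for your own application. As noted above, the last printed kernel is often, but not always, the culprit.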
Thanks for the prompt response
Here I attach the last kernel print out:
<<hip-api pid:2395 tid:4.13829 2395 4.13829 hipLaunchKernel '_ZN10tensorflow14GatherOpKernelIfxLb1EEEvPKT_PKT0_PS1_xxxx' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302701896114
<<hip-api pid:2395 tid:5.11 2395 5.11 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702164059
<<hip-api pid:2395 tid:5.14 2395 5.14 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{13312,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702223749
<<hip-api pid:2395 tid:5.17 2395 5.17 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702277692
<<hip-api pid:2395 tid:5.20 2395 5.20 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{13312,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702317537
<<hip-api pid:2395 tid:4.13893 2395 4.13893 hipLaunchKernel '_ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIxLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_18TensorConversionOpIxKNS4_INS5_IKiLi1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0_' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302703556961
<<hip-api pid:2395 tid:4.14003 2395 4.14003 hipLaunchKernel '_ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2_' gridDim:{2867200,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302759947163
<<hip-api pid:2395 tid:4.14005 2395 4.14005 hipLaunchKernel '_ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2_' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302761067898
<<hip-api pid:2395 tid:2.31434 2395 2.31434 hipLaunchKernel 'miog_alphaab' gridDim:{43008,1,1} groupDim:{16,1,1} sharedMem:+0 stream:0.1 @302772118882
<<hip-api pid:2395 tid:2.31443 2395 2.31443 hipLaunchKernel 'miog_alphaab' gridDim:{43008,1,1} groupDim:{16,1,1} sharedMem:+0 stream:0.1 @302775748480
<<hip-api pid:2395 tid:2.31464 2395 2.31464 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302779837406
<<hip-api pid:2395 tid:2.31479 2395 2.31479 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302783683721
Memory access fault by GPU node-1 (Agent handle: 0x1bff4b0) on address 0x6ed57a000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
Thanks, that's helpful. Next, would you be able to additionally run with the following:
export MIOPEN_ENABLE_LOGGING_CMD=1
Then, please send us the last section of the log.
OK, Here the result
Epoch 1/30
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 -H 1030 -W 1030 -k 64 -y 7 -x 7 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 -H 1030 -W 1030 -k 64 -y 7 -x 7 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenPoolingForward: ./bin/MIOpenDriver pool -n 1 -c 64 -H 512 -W 512 -y 3 -x 3 -p 0 -q 0 -u 2 -v 2 -m max -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 128 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 128 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenPoolingForward: ./bin/MIOpenDriver pool -n 1 -c 256 -H 32 -W 32 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -m max -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 16 -W 16 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 16 -W 16 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
2019-01-30 07:52:27.566019: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
2019-01-30 07:52:27.652974: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
Memory access fault by GPU node-1 (Agent handle: 0x3346900) on address 0x6dd90f000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
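To pull the last MIOpen call out of a log like the one above, a short Python sketch may help. It only assumes the `<api name>: ./bin/MIOpenDriver <op> <flag> <value> ...` layout visible in this log; `last_miopen_command` is a hypothetical name, and the flags are returned as raw strings without interpreting their meaning.

```python
import shlex


def last_miopen_command(log_text):
    """Return (api_name, operation, flags) for the last MIOpenDriver line.

    flags maps each option (for example '-c') to its value as a string.
    Returns None if the log contains no MIOpenDriver line.
    """
    last = None
    for line in log_text.splitlines():
        if "MIOpenDriver" not in line or ": " not in line:
            continue  # skip TensorFlow log lines and the fault message
        api_name, command = line.split(": ", 1)
        tokens = shlex.split(command)
        # tokens[0] is the driver path, tokens[1] the operation (conv, pool).
        options = tokens[2:]
        flags = dict(zip(options[0::2], options[1::2]))
        last = (api_name, tokens[1], flags)
    return last
```

In the log above this would return the `miopenFindConvolutionBackwardWeightsAlgorithm` entry, the last call logged before the fault, which gives a candidate configuration for a smaller test case.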
gfx803
Hi @fendiwira, can you try the following steps and see if they fix your issue:
cd ~ && mkdir rocm1.9.2-opencl && cd rocm1.9.2-opencl &&
wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb &&
wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb &&
sudo dpkg -i rocm-opencl*.deb && rm -rf ~/.cache
Hi @sunway513, thanks. I'll try it and get back to you.
The most suspicious thing I've found is this:
BB0_10:
v_add_u32_e32 v19, vcc, 4, v15
v_add_u32_e32 v19, vcc, 16, v19
buffer_load_dword v19, v19, s[0:3], s11 offen
v_add_u32_e32 v15, vcc, 4, v15
The total allocated private size is 20 bytes, and this is accessing 20 bytes off the scratch wave offset. It's possible the base pointer here is negative, but as far as I can tell that isn't possible here
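The offset arithmetic in the snippet above can be checked with a few lines of Python. This is a simplified sketch: it treats the private allocation as a flat 20-byte range per lane and ignores how scratch memory is actually laid out on the hardware. The names are illustrative only.

```python
PRIVATE_SIZE = 20  # bytes of private (scratch) memory, as stated above
DWORD = 4          # buffer_load_dword reads 4 bytes


def load_range(v15):
    """Byte range [start, end) read by the load for a given base v15."""
    offset = v15 + 4    # v_add_u32 v19, 4, v15
    offset += 16        # v_add_u32 v19, 16, v19
    return offset, offset + DWORD


def in_bounds(v15):
    """True if the load stays inside the 20-byte private allocation."""
    start, end = load_range(v15)
    return 0 <= start and end <= PRIVATE_SIZE
```

With a non-negative base, `in_bounds` is always false, which is why the load looks suspicious unless the base pointer is negative.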
Never mind, this only appears in the mangled version I made while trying to find the fault point.
I am getting this error on my RX580 too. I have pared down my code to isolate the problem:
import numpy as np
import tensorflow as tf
tf.enable_eager_execution()
print(tf.executing_eagerly())
model = tf.keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')
img = tf.random_uniform((1, 128, 128, 1), dtype=tf.float32)
img = tf.image.resize_images(img, [128, 256], align_corners=True, preserve_aspect_ratio=False)
print(img.shape)
with tf.GradientTape() as tape:
    print(1)
    img_hat = model(img)
    print(2)
    loss = tf.reduce_mean(img_hat)
    print(3)
grads = tape.gradient(loss, model.variables)
print(4)
This fails with Memory access fault by GPU node-1 (Agent handle: 0x21eefd0) on address 0xadc658000. Reason: Page not present or supervisor privilege.
However, interestingly, when
img = tf.image.resize_images(img, [128, 256], align_corners=True, preserve_aspect_ratio=False)
is replaced with
img = tf.image.resize_images(img, [128, 128], align_corners=True, preserve_aspect_ratio=False)
it succeeds without problems.
EDIT: after downgrading to 1.2.0-2018111340 it works perfectly.
Hi @sunway513, it works, thank you.
@fendiwira thanks for the feedback! Will update when there's an official fix available.
It also works here. I had a similar problem while training an object detection model using Faster R-CNN Inception v2, but after that downgrade it worked again.
Same problem here on my RX480 when training a VGG16 network. Downgrading to an older release (1.2.0-2018111340) prevented the issue from showing up
I put @eukaryote31's test on gist for easier reproduction:
https://gist.github.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb
I am reproducing using ROCm 2.1 and Tensorflow 1.12:
$ docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/tensorflow:rocm2.1-tf1.12-python3
$ wget https://gist.githubusercontent.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb/raw/34e0426e10e665df0f66c298bb07f879bb2abe79/test.py
The test completes without error on CPU (Threadripper 1950X):
# env HIP_VISIBLE_DEVICES= python3 test.py
[...]
True
2019-05-14 11:40:41.038653: E tensorflow/stream_executor/rocm/rocm_driver.cc:965] could not retrieve ROCM device count: HIP_ERROR_NoDevice
(1, 128, 256, 1)
1
2
3
4
The test fails with the aforementioned Memory access fault on GPU (gfx803, Fiji, Fury X):
# python3 test.py
[...]
True
2019-05-14 11:34:23.980338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 0 with properties:
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 1 with properties:
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:42:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 2 with properties:
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.05
pciBusID 0000:43:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Adding visible gpu devices: 0, 1, 2
2019-05-14 11:34:23.980750: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:34:23.980765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1057] 0 1 2
2019-05-14 11:34:23.980775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 0: N N N
2019-05-14 11:34:23.980784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 1: N N N
2019-05-14 11:34:23.980792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 2: N N N
2019-05-14 11:34:23.980860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3540 MB memory) -> physical GPU (device: 0, name: Device 7300, pci bus id: 0000:09:00.0)
2019-05-14 11:34:23.997726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3540 MB memory) -> physical GPU (device: 1, name: Device 7300, pci bus id: 0000:42:00.0)
2019-05-14 11:34:24.014538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3540 MB memory) -> physical GPU (device: 2, name: Device 7300, pci bus id: 0000:43:00.0)
(1, 128, 256, 1)
1
2
3
2019-05-14 11:34:28.159093: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-05-14 11:34:28.755163: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
Memory access fault by GPU node-2 (Agent handle: 0x2c5aee0) on address 0xbe1c00000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
The downgrade suggested by @sunway513 works for me too:
# cd ~ && mkdir rocm1.9.2-opencl && cd rocm1.9.2-opencl && wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb && wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb && dpkg -i rocm-opencl*.deb && rm -rf ~/.cache && cd -
# python3 test.py
[...]
True
2019-05-14 11:47:45.049593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 0 with properties:
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049735: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 1 with properties:
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:42:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 2 with properties:
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.05
pciBusID 0000:43:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Adding visible gpu devices: 0, 1, 2
2019-05-14 11:47:45.049944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:47:45.049956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1057] 0 1 2
2019-05-14 11:47:45.049966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 0: N N N
2019-05-14 11:47:45.049976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 1: N N N
2019-05-14 11:47:45.049987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 2: N N N
2019-05-14 11:47:45.050053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3540 MB memory) -> physical GPU (device: 0, name: Device 7300, pci bus id: 0000:09:00.0)
2019-05-14 11:47:45.066602: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3540 MB memory) -> physical GPU (device: 1, name: Device 7300, pci bus id: 0000:42:00.0)
2019-05-14 11:47:45.084008: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3540 MB memory) -> physical GPU (device: 2, name: Device 7300, pci bus id: 0000:43:00.0)
(1, 128, 256, 1)
1
2
3
2019-05-14 11:47:49.123642: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-05-14 11:47:49.890313: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
4
The issue persists and the downgrade still fixes it with today's rocm2.3-tf1.13-imagenet-training.
This issue persists with rocm2.4-tf2.0-alpha0-config-v2 and the downgrade still fixes it.
I have this same issue using a R9 Fury card, following the installation guide https://rocm.github.io/tensorflow.html
The downgrade indeed fixed the issue.
A "true" fix would be preferable. Let me know if you need anything (config details, tests...).
Hi all, we have included a set of OpenCL toolchain fixes for GFX803 targets in ROCm2.5. In my local GFX803 setup with the ROCm2.5 docker image, the VM fault is no longer reproducible using the reduced test from @Bengt. Please try the following docker image: rocm/tensorflow:rocm2.5-tf1.13-python3
Hello @sunway513, I tried the new image on R9 Fury (non X) and am still getting this issue when running the following command:
python3 benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_gpus=1 --batch_size=4 --model=vgg16
BTW, I had to copy /opt/rocm/miopen/share/miopen/db/gfx803_64.cd.pdb.txt to /opt/rocm/miopen/share/miopen/db/gfx803_56.cd.pdb.txt in order to avoid annoying MIOpen(HIP): Warning [FindRecordUnsafe] File is unreadable:/opt/rocm/miopen/share/miopen/db/gfx803_56.cd.pdb.txt messages.
Hi @gaetanbahl , VGG16 can run correctly on my local GFX803 setup using ROCm2.5 docker image.
Could you provide the logs for the following commands:
uname -a
apt --installed list | grep rock-dkms
Besides, it would be helpful if you can ensure the HIP unit tests can pass:
https://github.com/ROCm-Developer-Tools/HIP/tree/master/tests
For the concern on gfx803 MIOpen perfDB, MIOpen by default provides the following performance database:
gfx803_36.cd.pdb.txt gfx803_64.cd.pdb.txt gfx900_56.cd.pdb.txt gfx900_64.cd.pdb.txt gfx906_60.cd.pdb.txt gfx906_64.cd.pdb.txt
It seems your R9 Fury board spec is not on the list.
@daniellowell , could you comment on this issue?
I am using the docker image you mentioned.
root@epsilon:/dockerx# uname -a
Linux epsilon 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ apt --installed list | grep rock-dkms
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
rock-dkms/now 2.4-25 all [installed,upgradable to: 2.5-27]
Oh, I guess I should upgrade rock-dkms, sorry... I will upgrade and try again.
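The mismatch above (a ROCm 2.5 docker image on top of a 2.4 rock-dkms host driver) can be spotted automatically from the apt output. This is a rough sketch only: `parse_apt_line` is a hypothetical helper, and the line format is assumed to match the output pasted above.

```python
import re

# Matches lines such as:
#   rock-dkms/now 2.4-25 all [installed,upgradable to: 2.5-27]
APT_LINE = re.compile(
    r"^(?P<name>[^/\s]+)/\S+\s+(?P<version>\S+)\s+\S+\s+\[(?P<status>[^\]]*)\]"
)


def parse_apt_line(line):
    """Return package name, installed version, and pending upgrade (or None)."""
    match = APT_LINE.match(line.strip())
    if match is None:
        return None  # warnings and blank lines are ignored
    status = match.group("status")
    marker = "upgradable to: "
    upgrade = status.split(marker, 1)[1].strip() if marker in status else None
    return {
        "name": match.group("name"),
        "installed": match.group("version"),
        "upgradable_to": upgrade,
    }
```

If `upgradable_to` is not `None` for rock-dkms, the host kernel driver is older than what the repository offers, which is worth ruling out before debugging further.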
@sunway513 Indeed, I don't get the memory error anymore, only the .txt thing.
Thanks for your help!
Can you confirm that simply copying the gfx803_64.cd.pdb.txt file to gfx803_56.cd.pdb.txt will not give me problems?
@gaetanbahl , thanks for the update :-) Copying the MIOpen performance database won't cause any functionality issues.
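For anyone scripting the perfDB workaround discussed above, here is a small sketch that copies the file only when the target name is missing. `ensure_perfdb` is a hypothetical helper, not part of MIOpen; the directory and file names would be the ones quoted earlier in the thread.

```python
import shutil
from pathlib import Path


def ensure_perfdb(db_dir, source_name, target_name):
    """Copy source_name to target_name inside db_dir if the target is missing.

    Returns True when a copy was made, False when the target already existed.
    """
    db_dir = Path(db_dir)
    source = db_dir / source_name
    target = db_dir / target_name
    if target.exists():
        return False  # never overwrite an existing database
    if not source.is_file():
        raise FileNotFoundError(str(source))
    shutil.copyfile(str(source), str(target))
    return True
```

For the case in this thread the call would be `ensure_perfdb("/opt/rocm/miopen/share/miopen/db", "gfx803_64.cd.pdb.txt", "gfx803_56.cd.pdb.txt")`, run with sufficient permissions.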
Can confirm that the crash doesn't occur anymore on my RX 480. Thank you for your hard work
Thank you @LithiumSR for confirming it!
I can confirm the test working under rocm2.5-tf1.13-python3 with R9 Fury X and Nano. Thanks for fixing!
I am not sure whether to open a new issue, because I am having the same issue but with gfx900 (Vega 64). Sometimes it runs, but over 70% of the time this error occurs. In my case, I installed ROCm on Ubuntu 18.04 and compiled MIVisionX from source.
@urugn can you try the docker container: https://hub.docker.com/repository/docker/rocm/tensorflow
Same problem with miner on gfx900 (Vega FE) https://github.com/xmrig/xmrig/issues/1340
Same problem on Vega M GH. Setting HCC_SERIALIZE_KERNEL=0x3 HCC_SERIALIZE_COPY=0x3 HIP_TRACE_API=0x2 MIOPEN_ENABLE_LOGGING_CMD=1 produces no further output.
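When the trace variables do produce output, the last launched kernel can be pulled out of a long log with a few lines of Python. This is a sketch under the assumption that trace lines look like the `hipLaunchKernel '...'` lines posted earlier in this thread; `triage_environment` and `last_launched_kernel` are hypothetical helper names.

```python
import os
import re

# Serialization and tracing settings suggested earlier in this thread.
TRIAGE_ENV = {
    "HCC_SERIALIZE_KERNEL": "0x3",
    "HCC_SERIALIZE_COPY": "0x3",
    "HIP_TRACE_API": "0x2",
}

LAUNCH = re.compile(r"hipLaunchKernel\s+'([^']+)'")


def triage_environment(base=None):
    """Return a copy of the environment with the triage variables set."""
    env = dict(os.environ if base is None else base)
    env.update(TRIAGE_ENV)
    return env


def last_launched_kernel(trace_text):
    """Return the (mangled) name of the last kernel launch in the trace, or None."""
    last = None
    for line in trace_text.splitlines():
        match = LAUNCH.search(line)
        if match:
            last = match.group(1)
    return last
```

The returned environment can be passed as `env=` to `subprocess.run`, and the captured output handed to `last_launched_kernel`. As noted above, the last printed kernel is often, but not always, the one to investigate.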
I ported the test to TensorFlow 2:
wget https://gist.githubusercontent.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb/raw/c1ba1169aebdc980a144ac1672c6402235a470aa/test_tf2.py
It still works with image rocm/tensorflow:rocm3.0-tf2.1-rc1-python3 on 4 x Vega 64 8 GB Liquid Edition.
Same problem on a Radeon VII running custom HIP-ported code distributed via Ray. The code runs flawlessly without Ray. On NVIDIA there are no problems with the non-ported code and Ray.
This problem still exists when I use the latest rocm/tensorflow docker image. I have been trying since yesterday.
Another Radeon VII with the same issue (on AI Benchmark):
MIOpen Error: /root/driver/MLOpen/src/gemm_v2.cpp:523: rocBlas error encountered
Memory access fault by GPU node-1 (Agent handle: 0x5600f99cb850) on address 0x19000. Reason: Unknown.
ROCm: 3.5.0
TF Version: 2.2.0
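The reports in this thread differ in node, address, and reason, so it can help to extract those fields when comparing logs. A minimal sketch, assuming the message keeps the shape seen in the reports above; `parse_fault` is a hypothetical helper.

```python
import re

FAULT = re.compile(
    r"Memory access fault by GPU node-(?P<node>\d+) "
    r"\(Agent handle: (?P<agent>0x[0-9a-fA-F]+)\) "
    r"on address (?P<address>0x[0-9a-fA-F]+)\. "
    r"Reason: (?P<reason>[^.\n]+)"
)


def parse_fault(log_text):
    """Return the fields of the first fault message in log_text, or None."""
    match = FAULT.search(log_text)
    if match is None:
        return None
    return {
        "node": int(match.group("node")),
        "agent": match.group("agent"),
        "address": int(match.group("address"), 16),
        "reason": match.group("reason").strip(),
    }
```

Different reasons ("Page not present or supervisor privilege" versus "Unknown") may point to different underlying problems, which is one argument for filing them as separate issues.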
Can somebody rehost the Dropbox files from the fix that @sunway513 posted? They are no longer available, so I cannot run the commands. Thanks!
I also tried to install AMDGPU-PRO, but OpenCL wasn't available. I was able to install ROCm and OpenCL is now detected, but I also get this error. My guess is that even if the Dropbox links above worked, the files might be outdated for the current version, @spades1404.
Yeah, I eventually figured that out. It turns out 3.8 is broken (at least for me), and after many hours of trying to configure a docker container with the "apparently" working 2.5 downgrade, I ran into more compatibility issues with Python, since it uses Python 3.5. If apt-get hosted the lower versions, I could have just downgraded the version on my local machine. Anyway, I've decided to just use Colab now!
The OpenCL packages I posted last year can be found here:
http://repo.radeon.com/rocm/apt/1.9.3/pool/main/r/rocm-opencl/rocm-opencl_1.2.0-2018111340_amd64.deb
http://repo.radeon.com/rocm/apt/1.9.3/pool/main/r/rocm-opencl-dev/rocm-opencl-dev_1.2.0-2018111340_amd64.deb
However, the newly reported issue should be different, and most likely would not benefit from the old OpenCL packages.
@Extarys @spades1404 Can you help create a new issue and provide the following information:
cc @jerryyin @deven-amd
Hello guys,
I am having an issue running ROCm TensorFlow, with details as follows:
System information
Describe the current behavior
Epoch 1/30
2019-01-29 22:25:46.392668: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-01-29 22:25:46.446704: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
Memory access fault by GPU node-1 (Agent handle: 0x2e0dbf0) on address 0x6dccc0000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
Describe the expected behavior
Running normally until epoch 30/30
Code to reproduce the issue
Keras Mask RCNN
python3 platno.py train --dataset=/home/path/to/dataset --weights=coco
This always fails with the core dump message shown above.
Keras SSD
python3 ssd300_training.py runs normally after lowering the batch size from 32 to 8.
python3 ssd7_training.py still dumps core even after lowering the batch size to 1.
Other info / logs
I have tried setting some environment variables for debugging but still get the error: HSA_ENABLE_SDMA=0 HSA_ENABLE_INTERRUPT=0 HSA_SVM_GUARD_PAGES=0 HSA_DISABLE_CACHE=1
Please advise on how to resolve this problem.
Thanks and Regards