ROCm / tensorflow-upstream

TensorFlow ROCm port
https://tensorflow.org
Apache License 2.0

Memory access fault by GPU node-1 (Agent handle: 0x2e0dbf0) on address 0x6dccc0000. Reason: Page not present or supervisor privilege. #302

Closed fendiwira closed 5 years ago

fendiwira commented 5 years ago

Hello guys..

I am having an issue running ROCm TensorFlow. Details follow:

System information

Describe the current behavior

Epoch 1/30
2019-01-29 22:25:46.392668: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-01-29 22:25:46.446704: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
Memory access fault by GPU node-1 (Agent handle: 0x2e0dbf0) on address 0x6dccc0000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)

Describe the expected behavior

Training runs normally through epoch 30/30.

Code to reproduce the issue

Keras Mask RCNN:
python3 platno.py train --dataset=/home/path/to/dataset --weights=coco
This always fails with the core dump shown above.

Keras SSD:
python3 ssd300_training.py
This runs normally after lowering the batch size from 32 to 8.

python3 ssd7_training.py
This core dumps even with the batch size lowered to 1.

Other info / logs

I have tried setting some debug environment variables, but still get the error:
HSA_ENABLE_SDMA=0
HSA_ENABLE_INTERRUPT=0
HSA_SVM_GUARD_PAGES=0
HSA_DISABLE_CACHE=1

Please advise on how to resolve this problem.

Thanks and Regards

parallelo commented 5 years ago

Thanks for reporting the issue, @fendiwira. We'll take a look.

whchung commented 5 years ago

@parallelo / @sunway513 it seems quite a few recent issues raised are based on gfx803 ISA.

parallelo commented 5 years ago

Yep, was just looking at that too. At least two other recent gfx803 memory fault issues, right?

https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/issues/282 https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/issues/300

witeko commented 5 years ago

@parallelo and mine #297 (the neglected one) :)

sunway513 commented 5 years ago

The issue has been identified as a regression in the ROCm 2.0 user bits, affecting only Polaris. We will post further updates here.

parallelo commented 5 years ago

For future users who hit similar Memory access fault errors, just wanted to mention the typical triage process for this type of error.

This error typically indicates an out-of-bounds memory access on the GPU. The first step is to serialize all GPU kernels and copies, then dump the names of the kernels being launched.

export HCC_SERIALIZE_KERNEL=0x3
export HCC_SERIALIZE_COPY=0x3
export HIP_TRACE_API=0x2
[then re-run your application]

Often (but not always) the last printed kernel will be the one to further investigate -- it might point to a numerical library or something else that can potentially be triaged with a smaller test case.
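
The triage steps above can be scripted. The following is a minimal sketch, not an official tool: the function names are hypothetical, the variable values are copied from this comment, and it assumes the trace lines look like the `hipLaunchKernel '<name>'` lines that HIP_TRACE_API prints. Because it is not certain which stream the trace goes to, stdout and stderr are merged.

```python
import os
import re
import subprocess

# Debug variables quoted in this thread; values copied verbatim.
TRIAGE_VARS = {
    "HCC_SERIALIZE_KERNEL": "0x3",
    "HCC_SERIALIZE_COPY": "0x3",
    "HIP_TRACE_API": "0x2",
}

# Assumed trace shape: ... hipLaunchKernel '<kernel name>' gridDim:{...}
LAUNCH_RE = re.compile(r"hipLaunchKernel '([^']+)'")


def triage_env(base=None):
    """Return a copy of the environment with the triage variables set."""
    env = dict(os.environ if base is None else base)
    env.update(TRIAGE_VARS)
    return env


def last_launched_kernel(trace_text):
    """Return the name of the last kernel launch in the trace, or None."""
    names = LAUNCH_RE.findall(trace_text)
    return names[-1] if names else None


def run_and_report(cmd):
    """Run cmd with serialization and tracing on; return (exit code, last kernel)."""
    proc = subprocess.run(
        cmd,
        env=triage_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge streams; the trace stream is an assumption
        text=True,
    )
    return proc.returncode, last_launched_kernel(proc.stdout)
```

For example, `run_and_report(["python3", "ssd7_training.py"])` would return the exit code and the last kernel name seen before the fault. The trace can be large, so for long runs it may be better to redirect it to a file and pass the file contents to `last_launched_kernel`.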

More tips are listed here: https://rocm-documentation.readthedocs.io/en/latest/Other_Solutions/Other-Solutions.html

fendiwira commented 5 years ago

Thanks for the prompt response

Here is the last kernel printout:

<<hip-api pid:2395 tid:4.13829 2395 4.13829 hipLaunchKernel '_ZN10tensorflow14GatherOpKernelIfxLb1EEEvPKT_PKT0_PS1_xxxx' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302701896114
<<hip-api pid:2395 tid:5.11 2395 5.11 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702164059
<<hip-api pid:2395 tid:5.14 2395 5.14 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{13312,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702223749
<<hip-api pid:2395 tid:5.17 2395 5.17 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702277692
<<hip-api pid:2395 tid:5.20 2395 5.20 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{13312,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702317537
<<hip-api pid:2395 tid:4.13893 2395 4.13893 hipLaunchKernel '_ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIxLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_18TensorConversionOpIxKNS4_INS5_IKiLi1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0_' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302703556961
<<hip-api pid:2395 tid:4.14003 2395 4.14003 hipLaunchKernel '_ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2_' gridDim:{2867200,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302759947163
<<hip-api pid:2395 tid:4.14005 2395 4.14005 hipLaunchKernel '_ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2_' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302761067898
<<hip-api pid:2395 tid:2.31434 2395 2.31434 hipLaunchKernel 'miog_alphaab' gridDim:{43008,1,1} groupDim:{16,1,1} sharedMem:+0 stream:0.1 @302772118882
<<hip-api pid:2395 tid:2.31443 2395 2.31443 hipLaunchKernel 'miog_alphaab' gridDim:{43008,1,1} groupDim:{16,1,1} sharedMem:+0 stream:0.1 @302775748480
<<hip-api pid:2395 tid:2.31464 2395 2.31464 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302779837406
<<hip-api pid:2395 tid:2.31479 2395 2.31479 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302783683721
Memory access fault by GPU node-1 (Agent handle: 0x1bff4b0) on address 0x6ed57a000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)

parallelo commented 5 years ago

Thanks, that's helpful. Next, would you be able to additionally run with the following:

export MIOPEN_ENABLE_LOGGING_CMD=1

Then, please send us the last section of the log.
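
With MIOPEN_ENABLE_LOGGING_CMD=1 the log contains one `./bin/MIOpenDriver ...` line per MIOpen call, so the last such line before the fault is the natural candidate for a standalone reproduction. The sketch below pulls that line out of a captured log and splits its flags. The function names are hypothetical, and the reading of the flags (`-H`/`-W` input size, `-y`/`-x` filter size, `-p`/`-q` padding, `-u`/`-v` stride, `-l`/`-j` dilation) is an assumption based on the usual MIOpenDriver conventions, not something stated in this thread.

```python
import re

# Matches lines such as:
#   miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 ...
DRIVER_RE = re.compile(r"^(miopen\w+): (\./bin/MIOpenDriver .+)$", re.MULTILINE)


def last_driver_command(log_text):
    """Return (api_name, command) for the last logged MIOpen call, or None."""
    matches = DRIVER_RE.findall(log_text)
    return matches[-1] if matches else None


def parse_flags(command):
    """Map each '-flag value' pair in the command to a dict entry (values stay strings)."""
    tokens = command.split()
    flags = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag.startswith("-") and not value.startswith("-"):
            flags[flag.lstrip("-")] = value
    return flags


def conv_output_size(size, kernel, pad, stride, dilation=1):
    """Standard convolution output-size formula for one spatial dimension."""
    effective = dilation * (kernel - 1) + 1
    return (size + 2 * pad - effective) // stride + 1
```

As a sanity check on the flag reading, a 7x7 convolution with stride 2 and no padding on a 1030-pixel input gives `conv_output_size(1030, 7, 0, 2) == 512`. The command returned by `last_driver_command` can then be run on its own from the MIOpen build directory to see whether the fault reproduces outside TensorFlow.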

fendiwira commented 5 years ago

OK, here is the result:

Epoch 1/30
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 -H 1030 -W 1030 -k 64 -y 7 -x 7 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 -H 1030 -W 1030 -k 64 -y 7 -x 7 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenPoolingForward: ./bin/MIOpenDriver pool -n 1 -c 64 -H 512 -W 512 -y 3 -x 3 -p 0 -q 0 -u 2 -v 2 -m max -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 128 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 128 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenPoolingForward: ./bin/MIOpenDriver pool -n 1 -c 256 -H 32 -W 32 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -m max -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 16 -W 16 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 16 -W 16 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
2019-01-30 07:52:27.566019: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
2019-01-30 07:52:27.652974: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
Memory access fault by GPU node-1 (Agent handle: 0x3346900) on address 0x6dd90f000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)

whchung commented 5 years ago

sunway513 commented 5 years ago

Hi @fendiwira, can you try the following steps and see if they fix your issue:

cd ~ && mkdir rocm1.9.2-opencl && cd rocm1.9.2-opencl &&
wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb && 
wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb && 
sudo dpkg -i rocm-opencl*.deb && rm -rf ~/.cache

fendiwira commented 5 years ago

Hi @sunway513, thanks. I'll try it and get back to you.

arsenm commented 5 years ago

The most suspicious thing I've found is this:

BB0_10:
       v_add_u32_e32 v19, vcc, 4, v15
       v_add_u32_e32 v19, vcc, 16, v19
       buffer_load_dword v19, v19, s[0:3], s11 offen
       v_add_u32_e32 v15, vcc, 4, v15

The total allocated private size is 20 bytes, and this is accessing 20 bytes off the scratch wave offset. The base pointer would have to be negative for this to be valid, but as far as I can tell that can't happen here.

arsenm commented 5 years ago

Never mind, this only appears in the mangled version I made while trying to find the fault point.

yet-another-account commented 5 years ago

I am getting this error on my RX580 too. I have pared down my code to isolate the problem:

import numpy as np

import tensorflow as tf

tf.enable_eager_execution()
print(tf.executing_eagerly())

model = tf.keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')

img = tf.random_uniform((1, 128, 128, 1), dtype=tf.float32)
img = tf.image.resize_images(img, [128, 256], align_corners=True, preserve_aspect_ratio=False)

print(img.shape)
with tf.GradientTape() as tape:
    print(1)
    img_hat = model(img)
    print(2)
    loss = tf.reduce_mean(img_hat)
    print(3)
grads = tape.gradient(loss, model.variables)
print(4)

This fails with Memory access fault by GPU node-1 (Agent handle: 0x21eefd0) on address 0xadc658000. Reason: Page not present or supervisor privilege.

However, interestingly, when

img = tf.image.resize_images(img, [128, 256], align_corners=True, preserve_aspect_ratio=False)

is replaced with

img = tf.image.resize_images(img, [128, 128], align_corners=True, preserve_aspect_ratio=False)

it succeeds without problems.

EDIT: after downgrading to 1.2.0-2018111340 it works perfectly.
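
Because the fault aborts the whole Python process, one way to map which resize widths trigger it is to run each case in a child process and look at how that child exited. The following is only a minimal sketch under stated assumptions: `repro.py` is a hypothetical wrapper around the snippet above that reads the resize width from its first command-line argument, and the helper names are made up for illustration.

```python
import signal
import subprocess
import sys


def run_case(cmd, timeout=300):
    """Run cmd in a child process and classify how it ended."""
    try:
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode == 0:
        return "ok"
    # On POSIX a negative return code means the child was killed by a signal;
    # "Aborted (core dumped)" corresponds to SIGABRT.
    if proc.returncode == -signal.SIGABRT:
        return "aborted"
    return "failed({})".format(proc.returncode)


def sweep(widths, script="repro.py"):
    # repro.py is hypothetical: the snippet above, taking the width as argv[1].
    return {w: run_case([sys.executable, script, str(w)]) for w in widths}
```

Calling `sweep([128, 192, 256])` would then return one status per width, so the parent process survives even when a case aborts.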

fendiwira commented 5 years ago

Hi @sunway513, it works. Thank you.

sunway513 commented 5 years ago

@fendiwira thanks for the feedback! Will update when there's an official fix available.

johnneijzen commented 5 years ago

It also works here. I had a similar problem while training an object detection model using Faster R-CNN Inception v2, and after that downgrade it worked again.

leosarra commented 5 years ago

Same problem here on my RX480 when training a VGG16 network. Downgrading to an older release (1.2.0-2018111340) prevented the issue from showing up

Bengt commented 5 years ago

I put @eukaryote31's test on gist for easier reproduction:

https://gist.github.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb

I am reproducing using ROCm 2.1 and Tensorflow 1.12:

$ docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/tensorflow:rocm2.1-tf1.12-python3
$ wget https://gist.githubusercontent.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb/raw/34e0426e10e665df0f66c298bb07f879bb2abe79/test.py

The test completes without error on CPU (Threadripper 1950X):

# env HIP_VISIBLE_DEVICES= python3 test.py
[...]
True
2019-05-14 11:40:41.038653: E tensorflow/stream_executor/rocm/rocm_driver.cc:965] could not retrieve ROCM device count: HIP_ERROR_NoDevice
(1, 128, 256, 1)
1
2
3
4

The test fails with the aforementioned Memory access fault on GPU (gfx803, Fiji, Fury X):

# python3 test.py           
[...]
True
2019-05-14 11:34:23.980338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 0 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 1 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:42:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 2 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.05
pciBusID 0000:43:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Adding visible gpu devices: 0, 1, 2
2019-05-14 11:34:23.980750: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:34:23.980765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1057]      0 1 2 
2019-05-14 11:34:23.980775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 0:   N N N 
2019-05-14 11:34:23.980784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 1:   N N N 
2019-05-14 11:34:23.980792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 2:   N N N 
2019-05-14 11:34:23.980860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3540 MB memory) -> physical GPU (device: 0, name: Device 7300, pci bus id: 0000:09:00.0)
2019-05-14 11:34:23.997726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3540 MB memory) -> physical GPU (device: 1, name: Device 7300, pci bus id: 0000:42:00.0)
2019-05-14 11:34:24.014538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3540 MB memory) -> physical GPU (device: 2, name: Device 7300, pci bus id: 0000:43:00.0)
(1, 128, 256, 1)
1
2
3
2019-05-14 11:34:28.159093: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-05-14 11:34:28.755163: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
Memory access fault by GPU node-2 (Agent handle: 0x2c5aee0) on address 0xbe1c00000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
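
When comparing fault reports across cards, it can help to pull the fields out of the fault line programmatically. This is a rough sketch that assumes the message always has the shape seen in the logs in this thread; the regular expression and the function name are illustrative, not part of ROCm.

```python
import re

# Shape taken from the fault lines quoted in this thread.
FAULT_RE = re.compile(
    r"Memory access fault by GPU node-(?P<node>\d+) "
    r"\(Agent handle: (?P<handle>0x[0-9a-fA-F]+)\) "
    r"on address (?P<address>0x[0-9a-fA-F]+)\. "
    r"Reason: (?P<reason>.+?)\.?$"
)


def parse_fault(line):
    """Return the fields of a fault line as a dict, or None if it does not match."""
    match = FAULT_RE.search(line.strip())
    if match is None:
        return None
    return {
        "node": int(match.group("node")),
        "handle": int(match.group("handle"), 16),
        "address": int(match.group("address"), 16),
        "reason": match.group("reason"),
    }
```

Lines that do not match, such as the auto-tune messages, simply yield `None`, so the function can be applied to every line of a log.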

The downgrade suggested by @sunway513 works for me too:

# cd ~ && mkdir rocm1.9.2-opencl && cd rocm1.9.2-opencl && wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb &&  wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb && dpkg -i rocm-opencl*.deb && rm -rf ~/.cache && cd -
# python3 test.py 
[...]
True
2019-05-14 11:47:45.049593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 0 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049735: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 1 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:42:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 2 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.05
pciBusID 0000:43:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Adding visible gpu devices: 0, 1, 2
2019-05-14 11:47:45.049944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:47:45.049956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1057]      0 1 2 
2019-05-14 11:47:45.049966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 0:   N N N 
2019-05-14 11:47:45.049976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 1:   N N N 
2019-05-14 11:47:45.049987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 2:   N N N 
2019-05-14 11:47:45.050053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3540 MB memory) -> physical GPU (device: 0, name: Device 7300, pci bus id: 0000:09:00.0)
2019-05-14 11:47:45.066602: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3540 MB memory) -> physical GPU (device: 1, name: Device 7300, pci bus id: 0000:42:00.0)
2019-05-14 11:47:45.084008: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3540 MB memory) -> physical GPU (device: 2, name: Device 7300, pci bus id: 0000:43:00.0)
(1, 128, 256, 1)
1
2
3
2019-05-14 11:47:49.123642: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-05-14 11:47:49.890313: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
4
Bengt commented 5 years ago

The issue persists and the downgrade still fixes it with today's rocm2.3-tf1.13-imagenet-training.

Bengt commented 5 years ago

This issue persists with rocm2.4-tf2.0-alpha0-config-v2 and the downgrade still fixes it.

gaetanbahl commented 5 years ago

I have this same issue using an R9 Fury card, following the installation guide https://rocm.github.io/tensorflow.html

The downgrade indeed fixed the issue.

A "true" fix would be preferable. Let me know if you need anything (config details, tests...).

sunway513 commented 5 years ago

Hi all, we have included a set of OpenCL toolchain fixes for GFX803 targets in ROCm2.5. On my local GFX803 setup with the ROCm2.5 docker image, the VM fault is no longer reproducible using the reduced test from @Bengt. Please try the following docker image: rocm/tensorflow:rocm2.5-tf1.13-python3

gaetanbahl commented 5 years ago

Hello @sunway513, I tried the new image on R9 Fury (non X) and am still getting this issue when running the following command:

python3 benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_gpus=1 --batch_size=4 --model=vgg16

BTW, I had to copy /opt/rocm/miopen/share/miopen/db/gfx803_64.cd.pdb.txt to /opt/rocm/miopen/share/miopen/db/gfx803_56.cd.pdb.txt in order to avoid annoying MIOpen(HIP): Warning [FindRecordUnsafe] File is unreadable:/opt/rocm/miopen/share/miopen/db/gfx803_56.cd.pdb.txt messages.

sunway513 commented 5 years ago

Hi @gaetanbahl , VGG16 can run correctly on my local GFX803 setup using ROCm2.5 docker image. Could you provide the logs for the following commands:

uname -a
apt --installed list | grep rock-dkms

Besides, it would be helpful if you can ensure the HIP unit tests can pass: https://github.com/ROCm-Developer-Tools/HIP/tree/master/tests

For the concern on gfx803 MIOpen perfDB, MIOpen by default provides the following performance databases:

gfx803_36.cd.pdb.txt
gfx803_64.cd.pdb.txt
gfx900_56.cd.pdb.txt
gfx900_64.cd.pdb.txt
gfx906_60.cd.pdb.txt
gfx906_64.cd.pdb.txt

It seems your R9 Fury board spec is not on the list. @daniellowell , could you comment on this issue?
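
To see quickly whether a given board has a shipped performance database, the file names can be checked against the `<arch>_<number>.cd.pdb.txt` pattern. This is a small sketch, not an MIOpen API: the function name is made up, and reading the numeric suffix as the compute-unit count is an inference from the names above rather than something stated in this thread.

```python
import re

# Pattern inferred from the file names listed above: <arch>_<number>.cd.pdb.txt
PERFDB_RE = re.compile(r"^(gfx[0-9a-f]+)_(\d+)\.cd\.pdb\.txt$")


def cu_counts_for(names, arch):
    """Return the sorted numeric suffixes that have a perf DB file for arch."""
    counts = []
    for name in names:
        match = PERFDB_RE.match(name)
        if match and match.group(1) == arch:
            counts.append(int(match.group(2)))
    return sorted(counts)
```

In practice `names` could come from `os.listdir("/opt/rocm/miopen/share/miopen/db")`, the directory mentioned earlier in the thread. For the default list above, `cu_counts_for(names, "gfx803")` gives 36 and 64, which matches the warning about the missing gfx803_56 file.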

gaetanbahl commented 5 years ago

I am using the docker image you mentioned.

root@epsilon:/dockerx# uname -a
Linux epsilon 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ apt --installed list | grep rock-dkms
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
rock-dkms/now 2.4-25 all [installed,upgradable to: 2.5-27]

Oh, I guess I should upgrade rock-dkms, sorry... I will upgrade and try again.
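
The mismatch is easy to miss by eye, so here is a minimal sketch that extracts the installed version and the pending upgrade from one line of that output. It assumes the line layout shown above, and as apt itself warns, that layout is not a stable interface. The function name is hypothetical.

```python
import re

# Layout of one `apt --installed list` line as seen in the output above:
#   <name>/<suite> <version> <arch> [<status>,...]
APT_LINE_RE = re.compile(
    r"^(?P<name>[^/\s]+)/\S+\s+(?P<version>\S+)\s+(?P<arch>\S+)\s+"
    r"\[(?P<status>[^\]]*)\]"
)
UPGRADE_MARKER = "upgradable to: "


def parse_apt_line(line):
    """Return name, installed version and pending upgrade, or None if no match."""
    match = APT_LINE_RE.match(line.strip())
    if match is None:
        return None
    upgradable_to = None
    for part in match.group("status").split(","):
        part = part.strip()
        if part.startswith(UPGRADE_MARKER):
            upgradable_to = part[len(UPGRADE_MARKER):]
    return {
        "name": match.group("name"),
        "installed": match.group("version"),
        "upgradable_to": upgradable_to,
    }
```

A non-empty `upgradable_to` for rock-dkms is a hint that the host kernel driver is older than the user-space stack in the docker image.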

gaetanbahl commented 5 years ago

@sunway513 Indeed, I don't get the memory error anymore, only the .txt thing.

Thanks for your help!

Can you confirm that simply copying the gfx803_64.cd.pdb.txt file to gfx803_56.cd.pdb.txt will not cause me any problems?

sunway513 commented 5 years ago

@gaetanbahl , thanks for the update :-) Copying the MIOpen performance database won't cause any functional issues.

leosarra commented 5 years ago

Can confirm that the crash doesn't occur anymore on my RX 480. Thank you for your hard work

sunway513 commented 5 years ago

Thank you @LithiumSR for confirming it!

Bengt commented 5 years ago

I can confirm the test working under rocm2.5-tf1.13-python3 with R9 Fury X and Nano. Thanks for fixing!

urugn commented 5 years ago

I'm not sure whether to open a new issue, because I'm having the same issue but with gfx900 (Vega 64). Sometimes it runs, but over 70% of the time this error occurs. In my case, I installed ROCm on Ubuntu 18.04 and compiled MIVisionX from source.

sunway513 commented 5 years ago

@urugn can you try the docker container: https://hub.docker.com/repository/docker/rocm/tensorflow

minzak commented 4 years ago

Same problem with miner on gfx900 (Vega FE) https://github.com/xmrig/xmrig/issues/1340

ranisalt commented 4 years ago

Same problem on Vega M GH. Setting HCC_SERIALIZE_KERNEL=0x3 HCC_SERIALIZE_COPY=0x3 HIP_TRACE_API=0x2 MIOPEN_ENABLE_LOGGING_CMD=1 produces no further output.

Bengt commented 4 years ago

I ported the test to TensorFlow 2:

wget https://gist.githubusercontent.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb/raw/c1ba1169aebdc980a144ac1672c6402235a470aa/test_tf2.py

It still works with image rocm/tensorflow:rocm3.0-tf2.1-rc1-python3 on 4 x Vega 64 8 GB Liquid Edition.

Dan-RAI commented 4 years ago

Same problem on Radeon VII running custom HIP-ported code distributed via Ray. The code runs flawlessly without Ray. On NVIDIA there are no problems with the non-ported code and Ray.

FormulasT commented 4 years ago

This problem still exists when I use the latest rocm/tensorflow docker image. I have been trying since yesterday.

Soddentrough commented 4 years ago

Another Radeon VII with the same issue (on AI Benchmark):

MIOpen Error: /root/driver/MLOpen/src/gemm_v2.cpp:523: rocBlas error encountered
Memory access fault by GPU node-1 (Agent handle: 0x5600f99cb850) on address 0x19000. Reason: Unknown.

ROCm: 3.5.0
TF Version: 2.2.0

spades1404 commented 4 years ago

Can somebody rehost the Dropbox files from the fix that @sunway513 posted? They are no longer available, so I cannot run the commands. Thanks!

Extarys commented 4 years ago

I also tried to install AMDGPU-PRO but opencl wasn't available. I was able to install ROCm and OpenCL is now detected but I also have this error. My guess is even if the dropbox links above worked, the files might be outdated for the current version @spades1404

spades1404 commented 4 years ago

Yeah, I eventually figured that out. Turns out 3.8 is broken (at least for me), and after many hours trying to configure a docker container with the "apparently" working 2.5 downgrade, I ran into more compatibility issues with Python, since it uses Python 3.5. If the apt repository hosted lower versions, I could've just downgraded the version on my local machine. Anyway, I've decided to just use Colab now!

sunway513 commented 4 years ago

The OpenCL packages I posted last year can be found here:

http://repo.radeon.com/rocm/apt/1.9.3/pool/main/r/rocm-opencl/rocm-opencl_1.2.0-2018111340_amd64.deb
http://repo.radeon.com/rocm/apt/1.9.3/pool/main/r/rocm-opencl-dev/rocm-opencl-dev_1.2.0-2018111340_amd64.deb

However, the newly reported issue should be different, and most likely would not benefit from the old OpenCL packages.

@Extarys @spades1404 Can you help create a new issue and provide the following information:

cc @jerryyin @deven-amd