Closed: kiritigowda closed this issue 3 years ago
@hansely changes to model compiler have to pass these tests - https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/tree/master/tests/neural_network_tests#mivisionx-neural-network-tests
@hansely any updates on this?
@kiritigowda The fuse-ops flow is working on my machine. Can you try installing numpy on python3?
pip3 install numpy
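A quick way to confirm that numpy is visible to the interpreter that actually runs the tests is sketched below. This is not part of the MIVisionX scripts; the helper name `has_module` is made up for illustration. Run it with the same `python3` binary used for `runNeuralNetworkTests.py`.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if `name` can be imported by the running interpreter.

    Hypothetical helper for illustration; not part of MIVisionX.
    """
    return importlib.util.find_spec(name) is not None


# Show which interpreter is in use, then whether numpy is importable from it.
print("interpreter:", sys.executable)
print("numpy available:", has_module("numpy"))
```

If this prints `False` under `python3` but numpy imports fine under `python`, the package was probably installed for a different interpreter.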
@hansely any updates on this?
@kiritigowda Here is the output of the script on my system
python runNeuralNetworkTests.py --profiler_mode 8
Model Name | Batch Size | Time/Batch (ms) | Time/Image (ms) |
---|---|---|---|
nnef-mnist | 1 | 0.185 | 0.185 |
nnef-mnist | 2 | 0.183 | 0.091 |
nnef-mnist | 4 | 0.186 | 0.046 |
nnef-mnist | 8 | 0.185 | 0.023 |
nnef-mnist | 16 | 0.184 | 0.011 |
nnef-mnist | 32 | 0.184 | 0.006 |
nnef-mnist | 64 | 0.182 | 0.003 |
python3 runNeuralNetworkTests.py --profiler_mode 8
Model Name | Batch Size | Time/Batch (ms) | Time/Image (ms) |
---|---|---|---|
nnef-mnist | 1 | 0.182 | 0.182 |
nnef-mnist | 2 | 0.182 | 0.091 |
nnef-mnist | 4 | 0.182 | 0.045 |
nnef-mnist | 8 | 0.186 | 0.023 |
nnef-mnist | 16 | 0.186 | 0.012 |
nnef-mnist | 32 | 0.188 | 0.006 |
nnef-mnist | 64 | 0.187 | 0.003 |
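For readers comparing the two tables, the Time/Image column appears to be Time/Batch divided by the batch size. That relation is inferred from the numbers above, not taken from the script source, and the function name below is illustrative. A minimal sketch using the python3 run:

```python
def time_per_image(time_per_batch_ms, batch_size):
    """Per-image latency in ms, assuming Time/Image = Time/Batch / Batch Size."""
    return time_per_batch_ms / batch_size


# (batch size, Time/Batch in ms) pairs copied from the python3 table above.
rows = [(1, 0.182), (2, 0.182), (4, 0.182), (8, 0.186),
        (16, 0.186), (32, 0.188), (64, 0.187)]

for batch_size, batch_ms in rows:
    print(f"nnef-mnist | {batch_size} | {batch_ms:.3f} | "
          f"{time_per_image(batch_ms, batch_size):.3f}")
```

The computed values agree with the reported column to within rounding. Small differences in the last digit are expected, because the table was presumably produced from unrounded timings.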
@kiritigowda Did you try it on your system, or was it in a Docker container?
@hansely this issue is no longer observed on TOT.
CI log for reference: http://math-ci.rocm.amd.com/blue/organizations/jenkins/compute-rocm-dkms-no-npi-hipclang%2FShort-GPUOpen%2FMIVisionX/detail/master/55/pipeline
reading NNEF model from /var/jenkins_home/workspace/g_Short-GPUOpen_MIVisionX_master/LMZNbarS1/mivisionx/tests/neural_network_tests/models/nnef-mnist...
OK: creating IR description in ./graph.nnir ...
OK: creating IR binaries in ./binary ...
Done
reading IR model from . ...
OK: reading IR description from ./graph.nnir ...
OK: reading IR binaries from ./binary ...
writing IR model into . ...
OK: creating IR description in ./graph.nnir ...
OK: creating IR binaries in ./binary ...
reading IR model from . ...
OK: reading IR description from ./graph.nnir ...
OK: reading IR binaries from ./binary ...
#OUTPUT-TENSOR: output 3 10 1 1
creating C code in . ...
creating ./CMakeLists.txt ...
creating ./cmake/FindOpenCL.cmake ...
creating ./annmodule.h ...
creating ./annmodule.cpp ...
creating ./weights.bin ...
creating ./anntest.cpp ...
creating ./annpython.h ...
creating ./annpython.cpp ...
creating ./anntest.py ...
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OPENCL: /opt/rocm/opencl/lib/libOpenCL.so
-- Configuring done
-- Generating done
-- Build files have been written to: /var/jenkins_home/workspace/g_Short-GPUOpen_MIVisionX_master/LMZNbarS1/mivisionx/tests/neural_network_tests/models/develop/nnefFuse/nnir_build_64
Scanning dependencies of target annmodule
[ 16%] Building CXX object CMakeFiles/annmodule.dir/annmodule.cpp.o
[ 33%] Linking CXX shared library libannmodule.so
[ 33%] Built target annmodule
Scanning dependencies of target annpython
[ 50%] Building CXX object CMakeFiles/annpython.dir/annpython.cpp.o
[ 66%] Linking CXX shared library libannpython.so
[ 66%] Built target annpython
Scanning dependencies of target anntest
[ 83%] Building CXX object CMakeFiles/anntest.dir/anntest.cpp.o
[100%] Linking CXX executable anntest
[100%] Built target anntest
nnef-mnist - Batch size 64
OK: loaded 38 kernels from libvx_nn.so
OK: OpenVX using GPU device#0 (gfx908:sramecc+:xnack-) [OpenCL 2.0 ] [SvmCaps 0 0]
MIOpen(OpenCL): Warning [ParseAndLoadDb] File is unreadable: /opt/rocm/miopen/share/miopen/db/gfx90878.OpenCL.fdb.txt
OK: graph initialization with annAddToGraph() took 713.874 msec
OK: vxProcessGraph() took 101.052 msec (1st iteration)
OK: vxProcessGraph() took 0.175 msec (average over 100 iterations)
OK: OpenCL buffer usage: 127584, 10/10
OK: successful
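To compare a CI log like the one above against the profiler tables, the average `vxProcessGraph()` time can be pulled out of the log and divided by the batch size. The sketch below is an assumption about how one might do this by hand. It is not how `runNeuralNetworkTests.py` is known to work, and the names `PATTERN` and `parse_average` are illustrative.

```python
import re

# Line copied verbatim from the CI log above (batch size 64 run).
LOG_LINE = "OK: vxProcessGraph() took 0.175 msec (average over 100 iterations)"

# Matches only the "average over N iterations" form, not the 1st-iteration line.
PATTERN = re.compile(
    r"vxProcessGraph\(\) took ([\d.]+) msec \(average over (\d+) iterations\)"
)


def parse_average(line):
    """Return (average_ms, iterations) for a matching line, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


avg_ms, iterations = parse_average(LOG_LINE)
batch_size = 64  # from the "nnef-mnist - Batch size 64" line in the log
print(f"{avg_ms:.3f} ms/batch over {iterations} iterations, "
      f"{avg_ms / batch_size:.4f} ms/image")
```

For this log the per-image figure comes out a little under 0.003 ms, in line with the batch size 64 rows of the tables earlier in the thread.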
Using the -fuse operation produces the following warning