mlcommons / inference_results_v0.7

This repository contains the results and code for the MLPerf™ Inference v0.7 benchmark.
https://mlcommons.org/en/inference-datacenter-07/
Apache License 2.0

Reproducing NVIDIA's Xavier submission with JetPack 4.5 #15

Closed · psyhtest closed this 3 years ago

psyhtest commented 3 years ago

JetPack 4.5 is the latest production release to which one can upgrade without reflashing the board with the 20.09 Jetson CUDA-X AI Developer Preview that NVIDIA used for their v0.7 submission. Although it was released several (4-5?) months after the v0.7 submission, it still lags behind in TensorRT version: v7.1.3 vs v7.2.0. This may explain the errors I'm seeing when trying to build the test harness:

anton@xavier:/datasets/inference_results_v0.7/closed/NVIDIA$ make build                                                                                                                                    
Cloning Official MLPerf Inference (For Loadgen Files)                                                                          
Cloning into '/datasets/inference_results_v0.7/closed/NVIDIA/build/inference'...
<...>
-- Configuring done           
-- Generating done                                          
-- Build files have been written to: /datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin
make[2]: Entering directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
make[3]: Entering directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
make[4]: Entering directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
Scanning dependencies of target instancenorm3dplugin
make[4]: Leaving directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
make[4]: Entering directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
[ 25%] Building CUDA object CMakeFiles/instancenorm3dplugin.dir/src/instanceNormalization3DPlugin.cu.o                  
[ 50%] Building CUDA object CMakeFiles/instancenorm3dplugin.dir/src/instance_norm_fwd_impl.cu.o
/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(249): error: enum "nvinfer1::TensorFormat" has no member "kDHWC8"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(250): error: enum "nvinfer1::TensorFormat" has no member "kCDHW32"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(358): error: enum "nvinfer1::TensorFormat" has no member "kDHWC8"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(359): error: enum "nvinfer1::TensorFormat" has no member "kCDHW32"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(455): error: enum "nvinfer1::TensorFormat" has no member "kDHWC8"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(460): error: enum "nvinfer1::TensorFormat" has no member "kCDHW32"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(465): error: enum "nvinfer1::TensorFormat" has no member "kCDHW32"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(466): error: enum "nvinfer1::TensorFormat" has no member "kCDHW32"

/datasets/inference_results_v0.7/closed/NVIDIA/code/plugin/instanceNormalization3DPlugin/src/instanceNormalization3DPlugin.cu(468): error: enum "nvinfer1::TensorFormat" has no member "kCDHW32"

9 errors detected in the compilation of "/tmp/tmpxft_0000318e_00000000-8_instanceNormalization3DPlugin.compute_75.cpp1.ii".
CMakeFiles/instancenorm3dplugin.dir/build.make:82: recipe for target 'CMakeFiles/instancenorm3dplugin.dir/src/instanceNormalization3DPlugin.cu.o' failed
make[4]: *** [CMakeFiles/instancenorm3dplugin.dir/src/instanceNormalization3DPlugin.cu.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[4]: Leaving directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
CMakeFiles/Makefile2:95: recipe for target 'CMakeFiles/instancenorm3dplugin.dir/all' failed
make[3]: *** [CMakeFiles/instancenorm3dplugin.dir/all] Error 2
make[3]: Leaving directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
Makefile:103: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/datasets/inference_results_v0.7/closed/NVIDIA/build/plugins/instanceNormalization3DPlugin'
Makefile:305: recipe for target 'build_plugins' failed
make[1]: *** [build_plugins] Error 2
make[1]: Leaving directory '/datasets/inference_results_v0.7/closed/NVIDIA'
Makefile:247: recipe for target 'build' failed
make: *** [build] Error 2

@nvpohanh Can you perhaps suggest a workaround? instanceNormalization3DPlugin sounds like something related to 3D U-Net, which I don't need at the moment. Perhaps I can bypass building this plugin?
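
For anyone else hitting this, a quick way to confirm which TensorRT headers the build is actually picking up is to print the version macros from NvInferVersion.h. A minimal sketch (assuming the JetPack headers are on the default include path, typically /usr/include/aarch64-linux-gnu on Jetson):

// trt_version.cpp - print the TensorRT version that the compiler sees.
// Build with: g++ trt_version.cpp -o trt_version
// (add -I<header path> if the headers are not on the default include path)
#include <NvInferVersion.h>  // defines NV_TENSORRT_MAJOR/MINOR/PATCH
#include <cstdio>

int main() {
    std::printf("TensorRT headers: %d.%d.%d\n",
                NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR, NV_TENSORRT_PATCH);
    // JetPack 4.5 reports 7.1.3 here, which is consistent with
    // kDHWC8/kCDHW32 being missing from nvinfer1::TensorFormat above.
    return 0;
}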

psyhtest commented 3 years ago

I managed to build without errors by disabling instanceNormalization3DPlugin and pixelShuffle3DPlugin:

anton@xavier:/datasets/inference_results_v0.7/closed/NVIDIA$ git diff Makefile
diff --git a/closed/NVIDIA/Makefile b/closed/NVIDIA/Makefile
index 4679ec29..3961fb96 100644
--- a/closed/NVIDIA/Makefile
+++ b/closed/NVIDIA/Makefile
@@ -320,6 +320,7 @@ endif
        cd build/plugins/RNNTOptPlugin \
                && cmake -DCMAKE_BUILD_TYPE=$(BUILD_TYPE) $(PROJECT_ROOT)/code/plugin/RNNTOptPlugin \
                && make -j
+ifeq ($(ARCH), x86_64)
        mkdir -p build/plugins/instanceNormalization3DPlugin
        cd build/plugins/instanceNormalization3DPlugin \
                && cmake -DCMAKE_BUILD_TYPE=$(BUILD_TYPE) $(PROJECT_ROOT)/code/plugin/instanceNormalization3DPlugin \
@@ -328,6 +329,7 @@ endif
        cd build/plugins/pixelShuffle3DPlugin \
                && cmake -DCMAKE_BUILD_TYPE=$(BUILD_TYPE) $(PROJECT_ROOT)/code/plugin/pixelShuffle3DPlugin \
                && make -j
+endif

 # Build LoadGen.
 .PHONY: build_loadgen

and removing two (new?) tensor format cases:

anton@xavier:/datasets/inference_results_v0.7/closed/NVIDIA$ git diff code/harness/lwis/include/lwis_buffers.h
diff --git a/closed/NVIDIA/code/harness/lwis/include/lwis_buffers.h b/closed/NVIDIA/code/harness/lwis/include/lwis_buffers.h
index 5a79260c..59f8c120 100644
--- a/closed/NVIDIA/code/harness/lwis/include/lwis_buffers.h
+++ b/closed/NVIDIA/code/harness/lwis/include/lwis_buffers.h
@@ -80,10 +80,8 @@ inline int64_t volume(const nvinfer1::Dims& d, const nvinfer1::TensorFormat& for
         case nvinfer1::TensorFormat::kCHW2: spv = 2; channelDim = d_new.nbDims - 3; break;
         case nvinfer1::TensorFormat::kCHW4: spv = 4; channelDim = d_new.nbDims - 3; break;
         case nvinfer1::TensorFormat::kHWC8: spv = 8; channelDim = d_new.nbDims - 3; break;
-        case nvinfer1::TensorFormat::kDHWC8: spv = 8; channelDim = d_new.nbDims - 4; break;
         case nvinfer1::TensorFormat::kCHW16: spv = 16; channelDim = d_new.nbDims - 3; break;
         case nvinfer1::TensorFormat::kCHW32: spv = 32; channelDim = d_new.nbDims - 3; break;
-        case nvinfer1::TensorFormat::kCDHW32: spv = 32; channelDim = d_new.nbDims - 4; break;
         case nvinfer1::TensorFormat::kLINEAR:
         default: spv = 1; channelDim = -1; break;
     }
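
An alternative to deleting the two cases outright would be to guard them with the TensorRT version macros, so the same lwis_buffers.h would still compile against the 7.2+ headers NVIDIA used. An untested sketch of that part of the switch in volume() (the NV_TENSORRT_* macros come from NvInferVersion.h, which NvInfer.h should already pull in):

        case nvinfer1::TensorFormat::kHWC8: spv = 8; channelDim = d_new.nbDims - 3; break;
#if NV_TENSORRT_MAJOR > 7 || (NV_TENSORRT_MAJOR == 7 && NV_TENSORRT_MINOR >= 2)
        // 3D formats present in the TensorRT 7.2 headers but missing from the
        // 7.1.3 headers shipped with JetPack 4.5, as the errors above show.
        case nvinfer1::TensorFormat::kDHWC8: spv = 8; channelDim = d_new.nbDims - 4; break;
        case nvinfer1::TensorFormat::kCDHW32: spv = 32; channelDim = d_new.nbDims - 4; break;
#endif
        case nvinfer1::TensorFormat::kCHW16: spv = 16; channelDim = d_new.nbDims - 3; break;
        case nvinfer1::TensorFormat::kCHW32: spv = 32; channelDim = d_new.nbDims - 3; break;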

nvpohanh commented 3 years ago

@psyhtest Our submission code doesn't support TRT 7.1 at all. TRT 7.2.0 is the minimum requirement, but I know that TRT 7.2.0 probably doesn't work with JP4.5. Could you try TRT 7.2.2 instead? https://docs.nvidia.com/deeplearning/tensorrt/release-notes/tensorrt-7.html#rel_7-2-2
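
Since TRT 7.2.0 is a hard minimum, the requirement could in principle also be stated as a single compile-time check, so the failure is one clear message rather than the cascade of enum errors above. A hypothetical guard (not part of the submission code), using only the NV_TENSORRT_* macros from NvInferVersion.h:

// Hypothetical guard for sources that require TensorRT >= 7.2.0;
// shown for illustration only.
#include <NvInferVersion.h>

static_assert(NV_TENSORRT_MAJOR > 7 ||
                  (NV_TENSORRT_MAJOR == 7 && NV_TENSORRT_MINOR >= 2),
              "This harness requires TensorRT >= 7.2.0");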

psyhtest commented 3 years ago

@nvpohanh I don't think one can upgrade TensorRT on Jetson boards that easily? It seems I either need to install the Developer Preview release or wait until JetPack 4.6 or even later.

Actually, given that it used a Developer Preview release, should the Xavier v0.7 submission not have been categorised as "Preview"? According to the MLPerf Inference rules:

If you are measuring the performance of a publicly available and widely-used system or framework, you must use publicly available and widely-used versions of the system or framework.

While Xavier is publicly available and widely-used, a Developer Preview may be publicly available but is not widely-used.

nvpohanh commented 3 years ago

@DilipSequeira Could you provide a more detailed explanation?

We used 20.09 Jetson CUDA-X AI for our Xavier v0.7 submissions. As its description suggests, it is for whoever is interested in trying the latest CUDA-X AI components, and it is available for everyone to deploy on their systems.

As for v1.0, if you are interested in using a newer software stack for the v1.0 Xavier submission than the one we used in v0.7, could you send an email to me and Dilip so that we can discuss it there?

psyhtest commented 3 years ago

@nvpohanh Sure, will do, thanks.

DilipSequeira commented 3 years ago

Hi Anton,

The specific definition of Available for the purpose of submission categories is set out in section 7.3.1 of the general policies document.

For binaries, the binary must be made available as a release, or as a "beta" release with the requirement that optimizations will be included in a future "official" release.

There is some inconsistency between this section and the statement you point out in the inference rules.

psyhtest commented 3 years ago

The vision benchmarks seem to work with TensorRT v7.1.3, so closing.