No performance improvement with NNFusion on ResNet50

ENV:

branch: osdi20_artifact
model: Resnet50
gpu: A100-SXM4-40GB
cuda: 11.0
cuDNN 8.0.2
Tensorflow: nvidia_tensorflow-1.15.4

Steps to reproduce the behavior:

Generate the op_configs file of resnet50: resnet50_const_conv_kernels.txt
Run AutoTVM according to the op_configs file of resnet50 to generate the corresponding log and json files of resnet50
Run insert_db.sh to generate a new kernel_cache.db
Run resnet50:
1. Command: nnfusion resnet50_v1.const_folded.pb -f tensorflow -b nnfusion -m graph -fkernel_fusion_level=3 -fblockfusion_level=1 -fconst_folding_backend=CUDA -fwarmup_step=5 -frun_step=1000 -fkernels_as_files=true -fkernels_as_files=true -fkernels_as_files=true -fkernels=60 -SXM4-40GB" -fbiasadd_fix=true -fpattern_substitution=true
2. We get infer time: Summary: [min, max, mean] = [3.184192, 8.624928, 3.312613] ms
In contrast, we use the native kernel.db to compile ResNet50:
1. Command: nnfusion resnet50_v1.const_folded.pb -f tensorflow -b nnfusion -m graph -fkernel_fusion_level=3 -fblockfusion_level=1 -fconst_folding_backend=CUDA -fwarmup_step=5 -frun_step=1000 -fkernels_as_files=true -fkernels_files_number=60 -fproduct_name="Tesla V100-PCIE-16GB" -fbiasadd_fix=true -fpattern_substitution=true.
2. Update CMakeList.txt: add -gencode arch=compute_80,code=sm_80 to support running on Ampere architecture.
3. We get Infer time: Summary: [min, max, mean] = [2.643968, 6.354400, 2.747237] ms

Expected behavior

When running with native kernel.db, we found in the log that several conv2d kernels were skipped in the blockfusion pass, while using the newly generated kernel.db we did not find these abnormal logs. We believe that nnfusion does not fully perform blockfusion scheduling in the native environment, and the new kernel.db provides a more complete environment for blockfusion. Therefore, performance should be improved rather than reduced. Could anyone help me explain this result? Thanks!

microsoft / nnfusion

No performance improvement with NNFusion on ResNet50 #327