Open fumao13579 opened 1 year ago
Thanks for the report.
You cannot get results from the other TFLite models because execution hangs on the device side. Could you share an example model that fails to produce a result?
@sunshinemyson https://drive.google.com/drive/folders/1IYTqiebzVf_M0oUAjzG7BJ__iiC1P-Qm?usp=share_link
When these three TFLite models are executed through TVM RPC, they produce empty output. Similarly, when I copy the model_lib.so generated by TVM's lib.export_library(lib_path) to the Khadas VIM3 Pro and run it with the TVM runtime, I still get empty output. However, when I compile these three TFLite models locally with TVM on the Khadas VIM3 Pro, I get normal output.
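For reference, this is a minimal sketch of the flow I am describing (cross-compiling on the host, exporting model_lib.so, and running over RPC). The model path, RPC address, and cross-compiler name are placeholders, and the vsi_npu BYOC partitioning step from my actual script is omitted:

```
# Minimal sketch, not my exact script: cross-compile on the host and run over
# TVM RPC (vs. loading the exported model_lib.so directly on the board).
# Paths, the RPC address, and the cross-compiler are placeholders; the
# vsi_npu partitioning step is left out.
import numpy as np
import tflite
import tvm
from tvm import relay, rpc
from tvm.contrib import graph_executor

MODEL_PATH = "inception_v1_quant.tflite"     # placeholder model file
LIB_PATH = "model_lib.so"                    # artifact produced by export_library
RPC_HOST, RPC_PORT = "192.168.1.100", 9090   # placeholder VIM3 RPC server address

# Import the quantized TFLite model into Relay.
with open(MODEL_PATH, "rb") as f:
    tflite_model = tflite.Model.GetRootAsModel(f.read(), 0)
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "uint8"},
)

# Cross-compile for the aarch64 board and export model_lib.so.
target = "llvm -mtriple=aarch64-linux-gnu"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
lib.export_library(LIB_PATH, cc="aarch64-linux-gnu-gcc")

# RPC flow: upload the library, load it on the device, run one inference.
remote = rpc.connect(RPC_HOST, RPC_PORT)
remote.upload(LIB_PATH)
rlib = remote.load_module(LIB_PATH)
dev = remote.cpu(0)
module = graph_executor.GraphModule(rlib["default"](dev))
dummy = np.zeros((1, 224, 224, 3), dtype="uint8")
module.set_input("input", tvm.nd.array(dummy, dev))
module.run()
print(module.get_output(0).numpy())   # comes back empty over RPC
```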
This is the output of local compilation on the Khadas VIM3 Pro, using the quantized TFLite model InceptionNetV1.
``` khadas@Khadas:~/$ python3 test_vsi_tflite_model_all.py #[version = "0.0.5"] def @main(%input: Tensor[(1, 224, 224, 3), uint8], %v_param_1: Tensor[(7, 7, 3, 64), uint8], %v_param_2: Tensor[(64), int32], %v_param_3: Tensor[(1, 1, 64, 64), uint8], %v_param_4: Tensor[(64), int32], %v_param_5: Tensor[(3, 3, 64, 192), uint8], %v_param_6: Tensor[(192), int32], %v_param_7: Tensor[(1, 1, 192, 64), uint8], %v_param_8: Tensor[(64), int32], %v_param_9: Tensor[(1, 1, 192, 96), uint8], %v_param_10: Tensor[(96), int32], %v_param_11: Tensor[(3, 3, 96, 128), uint8], %v_param_12: Tensor[(128), int32], %v_param_13: Tensor[(1, 1, 192, 16), uint8], %v_param_14: Tensor[(16), int32], %v_param_15: Tensor[(3, 3, 16, 32), uint8], %v_param_16: Tensor[(32), int32], %v_param_17: Tensor[(1, 1, 192, 32), uint8], %v_param_18: Tensor[(32), int32], %v_param_19: Tensor[(1, 1, 256, 128), uint8], %v_param_20: Tensor[(128), int32], %v_param_21: Tensor[(1, 1, 256, 128), uint8], %v_param_22: Tensor[(128), int32], %v_param_23: Tensor[(3, 3, 128, 192), uint8], %v_param_24: Tensor[(192), int32], %v_param_25: Tensor[(1, 1, 256, 32), uint8], %v_param_26: Tensor[(32), int32], %v_param_27: Tensor[(3, 3, 32, 96), uint8], %v_param_28: Tensor[(96), int32], %v_param_29: Tensor[(1, 1, 256, 64), uint8], %v_param_30: Tensor[(64), int32], %v_param_31: Tensor[(1, 1, 480, 192), uint8], %v_param_32: Tensor[(192), int32], %v_param_33: Tensor[(1, 1, 480, 96), uint8], %v_param_34: Tensor[(96), int32], %v_param_35: Tensor[(3, 3, 96, 208), uint8], %v_param_36: Tensor[(208), int32], %v_param_37: Tensor[(1, 1, 480, 16), uint8], %v_param_38: Tensor[(16), int32], %v_param_39: Tensor[(3, 3, 16, 48), uint8], %v_param_40: Tensor[(48), int32], %v_param_41: Tensor[(1, 1, 480, 64), uint8], %v_param_42: Tensor[(64), int32], %v_param_43: Tensor[(1, 1, 512, 160), uint8], %v_param_44: Tensor[(160), int32], %v_param_45: Tensor[(1, 1, 512, 112), uint8], %v_param_46: Tensor[(112), int32], %v_param_47: Tensor[(3, 3, 112, 224), uint8], %v_param_48: Tensor[(224), int32], %v_param_49: Tensor[(1, 1, 512, 24), uint8], %v_param_50: Tensor[(24), int32], %v_param_51: Tensor[(3, 3, 24, 64), uint8], %v_param_52: Tensor[(64), int32], %v_param_53: Tensor[(1, 1, 512, 64), uint8], %v_param_54: Tensor[(64), int32], %v_param_55: Tensor[(1, 1, 512, 128), uint8], %v_param_56: Tensor[(128), int32], %v_param_57: Tensor[(1, 1, 512, 128), uint8], %v_param_58: Tensor[(128), int32], %v_param_59: Tensor[(3, 3, 128, 256), uint8], %v_param_60: Tensor[(256), int32], %v_param_61: Tensor[(1, 1, 512, 24), uint8], %v_param_62: Tensor[(24), int32], %v_param_63: Tensor[(3, 3, 24, 64), uint8], %v_param_64: Tensor[(64), int32], %v_param_65: Tensor[(1, 1, 512, 64), uint8], %v_param_66: Tensor[(64), int32], %v_param_67: Tensor[(1, 1, 512, 112), uint8], %v_param_68: Tensor[(112), int32], %v_param_69: Tensor[(1, 1, 512, 144), uint8], %v_param_70: Tensor[(144), int32], %v_param_71: Tensor[(3, 3, 144, 288), uint8], %v_param_72: Tensor[(288), int32], %v_param_73: Tensor[(1, 1, 512, 32), uint8], %v_param_74: Tensor[(32), int32], %v_param_75: Tensor[(3, 3, 32, 64), uint8], %v_param_76: Tensor[(64), int32], %v_param_77: Tensor[(1, 1, 512, 64), uint8], %v_param_78: Tensor[(64), int32], %v_param_79: Tensor[(1, 1, 528, 256), uint8], %v_param_80: Tensor[(256), int32], %v_param_81: Tensor[(1, 1, 528, 160), uint8], %v_param_82: Tensor[(160), int32], %v_param_83: Tensor[(3, 3, 160, 320), uint8], %v_param_84: Tensor[(320), int32], %v_param_85: Tensor[(1, 1, 528, 32), uint8], %v_param_86: Tensor[(32), int32], 
%v_param_87: Tensor[(3, 3, 32, 128), uint8], %v_param_88: Tensor[(128), int32], %v_param_89: Tensor[(1, 1, 528, 128), uint8], %v_param_90: Tensor[(128), int32], %v_param_91: Tensor[(1, 1, 832, 256), uint8], %v_param_92: Tensor[(256), int32], %v_param_93: Tensor[(1, 1, 832, 160), uint8], %v_param_94: Tensor[(160), int32], %v_param_95: Tensor[(3, 3, 160, 320), uint8], %v_param_96: Tensor[(320), int32], %v_param_97: Tensor[(1, 1, 832, 32), uint8], %v_param_98: Tensor[(32), int32], %v_param_99: Tensor[(3, 3, 32, 128), uint8], %v_param_100: Tensor[(128), int32], %v_param_101: Tensor[(1, 1, 832, 128), uint8], %v_param_102: Tensor[(128), int32], %v_param_103: Tensor[(1, 1, 832, 384), uint8], %v_param_104: Tensor[(384), int32], %v_param_105: Tensor[(1, 1, 832, 192), uint8], %v_param_106: Tensor[(192), int32], %v_param_107: Tensor[(3, 3, 192, 384), uint8], %v_param_108: Tensor[(384), int32], %v_param_109: Tensor[(1, 1, 832, 48), uint8], %v_param_110: Tensor[(48), int32], %v_param_111: Tensor[(3, 3, 48, 128), uint8], %v_param_112: Tensor[(128), int32], %v_param_113: Tensor[(1, 1, 832, 128), uint8], %v_param_114: Tensor[(128), int32], %v_param_115: Tensor[(1, 1, 1024, 1001), uint8], %v_param_116: Tensor[(1001), int32]) { %0 = qnn.conv2d(%input, %v_param_1, 128, 141, 0.0078125f, 0.0243229f, strides=[2, 2], padding=[2, 2, 3, 3], channels=64, kernel_size=[7, 7], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %1 = nn.bias_add(%0, %v_param_2, axis=3); %2 = qnn.requantize(%1, 0.000190023f, 0, 0.107703f, 0, axis=3, out_dtype="uint8"); %3 = nn.max_pool2d(%2, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC"); %4 = qnn.conv2d(%3, %v_param_3, 0, 134, 0.107703f, 0.0171319f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %5 = nn.bias_add(%4, %v_param_4, axis=3); %6 = qnn.requantize(%5, 0.00184516f, 0, 0.053206f, 0, axis=3, out_dtype="uint8"); %7 = qnn.conv2d(%6, %v_param_5, 0, 137, 0.053206f, 0.00701139f, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %8 = nn.bias_add(%7, %v_param_6, axis=3); %9 = qnn.requantize(%8, 0.000373048f, 0, 0.044983f, 0, axis=3, out_dtype="uint8"); %10 = nn.max_pool2d(%9, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC"); %11 = qnn.conv2d(%10, %v_param_7, 0, 106, 0.044983f, 0.00639617f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %12 = nn.bias_add(%11, %v_param_8, axis=3); %13 = qnn.conv2d(%10, %v_param_9, 0, 174, 0.044983f, 0.0074075f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %14 = nn.bias_add(%13, %v_param_10, axis=3); %15 = qnn.requantize(%14, 0.000333212f, 0, 0.0381216f, 0, axis=3, out_dtype="uint8"); %16 = qnn.conv2d(%15, %v_param_11, 0, 97, 0.0381216f, 0.00448481f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %17 = nn.bias_add(%16, %v_param_12, axis=3); %18 = qnn.conv2d(%10, %v_param_13, 0, 90, 0.044983f, 0.00434916f, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %19 = nn.bias_add(%18, %v_param_14, axis=3); %20 = qnn.requantize(%19, 0.000195639f, 0, 0.0304856f, 0, axis=3, out_dtype="uint8"); %21 = qnn.conv2d(%20, %v_param_15, 0, 77, 0.0304856f, 0.0113698f, padding=[1, 1, 1, 1], channels=32, 
kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %22 = nn.bias_add(%21, %v_param_16, axis=3); %23 = nn.max_pool2d(%10, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %24 = qnn.conv2d(%23, %v_param_17, 0, 149, 0.044983f, 0.00737061f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %25 = nn.bias_add(%24, %v_param_18, axis=3); %26 = qnn.requantize(%12, 0.000287719f, 0, 0.0475482f, 0, axis=3, out_dtype="uint8"); %27 = qnn.requantize(%17, 0.000170968f, 0, 0.034202f, 0, axis=3, out_dtype="uint8"); %28 = qnn.requantize(%22, 0.000346614f, 0, 0.0420845f, 0, axis=3, out_dtype="uint8"); %29 = qnn.requantize(%25, 0.000331553f, 0, 0.02516f, 0, axis=3, out_dtype="uint8"); %30 = (%26, %27, %28, %29); %31 = (0.0475482f, 0.034202f, 0.0420845f, 0.02516f); %32 = (0, 0, 0, 0); %33 = qnn.concatenate(%30, %31, %32, 0.0475482f, 0, axis=3); %34 = qnn.conv2d(%33, %v_param_19, 0, 135, 0.0475482f, 0.0064377f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %35 = nn.bias_add(%34, %v_param_20, axis=3); %36 = qnn.conv2d(%33, %v_param_21, 0, 133, 0.0475482f, 0.00539997f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %37 = nn.bias_add(%36, %v_param_22, axis=3); %38 = qnn.requantize(%37, 0.000256759f, 0, 0.0317389f, 0, axis=3, out_dtype="uint8"); %39 = qnn.conv2d(%38, %v_param_23, 0, 94, 0.0317389f, 0.00359896f, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %40 = nn.bias_add(%39, %v_param_24, axis=3); %41 = qnn.conv2d(%33, %v_param_25, 0, 129, 0.0475482f, 0.00531897f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %42 = nn.bias_add(%41, %v_param_26, axis=3); %43 = qnn.requantize(%42, 0.000252907f, 0, 0.034475f, 0, axis=3, out_dtype="uint8"); %44 = qnn.conv2d(%43, %v_param_27, 0, 121, 0.034475f, 0.00415084f, padding=[1, 1, 1, 1], channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %45 = nn.bias_add(%44, %v_param_28, axis=3); %46 = nn.max_pool2d(%33, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %47 = qnn.conv2d(%46, %v_param_29, 0, 129, 0.0475482f, 0.00529972f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %48 = nn.bias_add(%47, %v_param_30, axis=3); %49 = qnn.requantize(%35, 0.000306101f, 0, 0.034585f, 0, axis=3, out_dtype="uint8"); %50 = qnn.requantize(%40, 0.000114227f, 0, 0.0316799f, 0, axis=3, out_dtype="uint8"); %51 = qnn.requantize(%45, 0.0001431f, 0, 0.0277635f, 0, axis=3, out_dtype="uint8"); %52 = qnn.requantize(%48, 0.000251992f, 0, 0.0281896f, 0, axis=3, out_dtype="uint8"); %53 = (%49, %50, %51, %52); %54 = (0.034585f, 0.0316799f, 0.0277635f, 0.0281896f); %55 = (0, 0, 0, 0); %56 = qnn.concatenate(%53, %54, %55, 0.034585f, 0, axis=3); %57 = nn.max_pool2d(%56, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC"); %58 = qnn.conv2d(%57, %v_param_31, 0, 104, 0.034585f, 0.00488506f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %59 = nn.bias_add(%58, %v_param_32, axis=3); %60 = qnn.conv2d(%57, %v_param_33, 0, 69, 0.034585f, 0.00521668f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", 
kernel_layout="HWIO", out_dtype="int32"); %61 = nn.bias_add(%60, %v_param_34, axis=3); %62 = qnn.requantize(%61, 0.000180419f, 0, 0.0407384f, 0, axis=3, out_dtype="uint8"); %63 = qnn.conv2d(%62, %v_param_35, 0, 80, 0.0407384f, 0.00412294f, padding=[1, 1, 1, 1], channels=208, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %64 = nn.bias_add(%63, %v_param_36, axis=3); %65 = qnn.conv2d(%57, %v_param_37, 0, 159, 0.034585f, 0.00324746f, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %66 = nn.bias_add(%65, %v_param_38, axis=3); %67 = qnn.requantize(%66, 0.000112313f, 0, 0.029503f, 0, axis=3, out_dtype="uint8"); %68 = qnn.conv2d(%67, %v_param_39, 0, 88, 0.029503f, 0.00959363f, padding=[1, 1, 1, 1], channels=48, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %69 = nn.bias_add(%68, %v_param_40, axis=3); %70 = nn.max_pool2d(%57, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %71 = qnn.conv2d(%70, %v_param_41, 0, 123, 0.034585f, 0.0063726f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %72 = nn.bias_add(%71, %v_param_42, axis=3); %73 = qnn.requantize(%59, 0.00016895f, 0, 0.0350619f, 0, axis=3, out_dtype="uint8"); %74 = qnn.requantize(%64, 0.000167962f, 0, 0.038577f, 0, axis=3, out_dtype="uint8"); %75 = qnn.requantize(%69, 0.000283041f, 0, 0.0261499f, 0, axis=3, out_dtype="uint8"); %76 = qnn.requantize(%72, 0.000220396f, 0, 0.0227659f, 0, axis=3, out_dtype="uint8"); %77 = (%73, %74, %75, %76); %78 = (0.0350619f, 0.038577f, 0.0261499f, 0.0227659f); %79 = (0, 0, 0, 0); %80 = qnn.concatenate(%77, %78, %79, 0.038577f, 0, axis=3); %81 = qnn.conv2d(%80, %v_param_43, 0, 131, 0.038577f, 0.00565282f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %82 = nn.bias_add(%81, %v_param_44, axis=3); %83 = qnn.conv2d(%80, %v_param_45, 0, 111, 0.038577f, 0.00606403f, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %84 = nn.bias_add(%83, %v_param_46, axis=3); %85 = qnn.requantize(%84, 0.000233932f, 0, 0.0390984f, 0, axis=3, out_dtype="uint8"); %86 = qnn.conv2d(%85, %v_param_47, 0, 77, 0.0390984f, 0.00476621f, padding=[1, 1, 1, 1], channels=224, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %87 = nn.bias_add(%86, %v_param_48, axis=3); %88 = qnn.conv2d(%80, %v_param_49, 0, 127, 0.038577f, 0.00466451f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %89 = nn.bias_add(%88, %v_param_50, axis=3); %90 = qnn.requantize(%89, 0.000179943f, 0, 0.0326719f, 0, axis=3, out_dtype="uint8"); %91 = qnn.conv2d(%90, %v_param_51, 0, 105, 0.0326719f, 0.00475245f, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %92 = nn.bias_add(%91, %v_param_52, axis=3); %93 = nn.max_pool2d(%80, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %94 = qnn.conv2d(%93, %v_param_53, 0, 128, 0.038577f, 0.00292699f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %95 = nn.bias_add(%94, %v_param_54, axis=3); %96 = qnn.requantize(%82, 0.000218069f, 0, 0.0384053f, 0, axis=3, out_dtype="uint8"); %97 = qnn.requantize(%87, 0.000186351f, 0, 0.0415277f, 0, axis=3, 
out_dtype="uint8"); %98 = qnn.requantize(%92, 0.000155272f, 0, 0.0353133f, 0, axis=3, out_dtype="uint8"); %99 = qnn.requantize(%95, 0.000112914f, 0, 0.0217496f, 0, axis=3, out_dtype="uint8"); %100 = (%96, %97, %98, %99); %101 = (0.0384053f, 0.0415277f, 0.0353133f, 0.0217496f); %102 = (0, 0, 0, 0); %103 = qnn.concatenate(%100, %101, %102, 0.0415277f, 0, axis=3); %104 = qnn.conv2d(%103, %v_param_55, 0, 143, 0.0415277f, 0.00513341f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %105 = nn.bias_add(%104, %v_param_56, axis=3); %106 = qnn.conv2d(%103, %v_param_57, 0, 125, 0.0415277f, 0.0056437f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %107 = nn.bias_add(%106, %v_param_58, axis=3); %108 = qnn.requantize(%107, 0.00023437f, 0, 0.0444829f, 0, axis=3, out_dtype="uint8"); %109 = qnn.conv2d(%108, %v_param_59, 0, 104, 0.0444829f, 0.00298305f, padding=[1, 1, 1, 1], channels=256, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %110 = nn.bias_add(%109, %v_param_60, axis=3); %111 = qnn.conv2d(%103, %v_param_61, 0, 96, 0.0415277f, 0.00617409f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %112 = nn.bias_add(%111, %v_param_62, axis=3); %113 = qnn.requantize(%112, 0.000256396f, 0, 0.0382293f, 0, axis=3, out_dtype="uint8"); %114 = qnn.conv2d(%113, %v_param_63, 0, 90, 0.0382293f, 0.00926049f, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %115 = nn.bias_add(%114, %v_param_64, axis=3); %116 = nn.max_pool2d(%103, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %117 = qnn.conv2d(%116, %v_param_65, 0, 133, 0.0415277f, 0.00348826f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %118 = nn.bias_add(%117, %v_param_66, axis=3); %119 = qnn.requantize(%105, 0.000213179f, 0, 0.0363159f, 0, axis=3, out_dtype="uint8"); %120 = qnn.requantize(%110, 0.000132695f, 0, 0.040194f, 0, axis=3, out_dtype="uint8"); %121 = qnn.requantize(%115, 0.000354022f, 0, 0.0679776f, 0, axis=3, out_dtype="uint8"); %122 = qnn.requantize(%118, 0.00014486f, 0, 0.0225817f, 0, axis=3, out_dtype="uint8"); %123 = (%119, %120, %121, %122); %124 = (0.0363159f, 0.040194f, 0.0679776f, 0.0225817f); %125 = (0, 0, 0, 0); %126 = qnn.concatenate(%123, %124, %125, 0.0679776f, 0, axis=3); %127 = qnn.conv2d(%126, %v_param_67, 0, 131, 0.0679776f, 0.00541721f, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %128 = nn.bias_add(%127, %v_param_68, axis=3); %129 = qnn.conv2d(%126, %v_param_69, 0, 102, 0.0679776f, 0.00529131f, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %130 = nn.bias_add(%129, %v_param_70, axis=3); %131 = qnn.requantize(%130, 0.000359691f, 0, 0.0464631f, 0, axis=3, out_dtype="uint8"); %132 = qnn.conv2d(%131, %v_param_71, 0, 121, 0.0464631f, 0.00281512f, padding=[1, 1, 1, 1], channels=288, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %133 = nn.bias_add(%132, %v_param_72, axis=3); %134 = qnn.conv2d(%126, %v_param_73, 0, 129, 0.0679776f, 0.00454161f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %135 = 
nn.bias_add(%134, %v_param_74, axis=3); %136 = qnn.requantize(%135, 0.000308728f, 0, 0.0439514f, 0, axis=3, out_dtype="uint8"); %137 = qnn.conv2d(%136, %v_param_75, 0, 92, 0.0439514f, 0.00496321f, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %138 = nn.bias_add(%137, %v_param_76, axis=3); %139 = nn.max_pool2d(%126, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %140 = qnn.conv2d(%139, %v_param_77, 0, 124, 0.0679776f, 0.00317437f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %141 = nn.bias_add(%140, %v_param_78, axis=3); %142 = qnn.requantize(%128, 0.000368249f, 0, 0.0520244f, 0, axis=3, out_dtype="uint8"); %143 = qnn.requantize(%133, 0.000130799f, 0, 0.0511231f, 0, axis=3, out_dtype="uint8"); %144 = qnn.requantize(%138, 0.00021814f, 0, 0.0310861f, 0, axis=3, out_dtype="uint8"); %145 = qnn.requantize(%141, 0.000215786f, 0, 0.024479f, 0, axis=3, out_dtype="uint8"); %146 = (%142, %143, %144, %145); %147 = (0.0520244f, 0.0511231f, 0.0310861f, 0.024479f); %148 = (0, 0, 0, 0); %149 = qnn.concatenate(%146, %147, %148, 0.0520244f, 0, axis=3); %150 = qnn.conv2d(%149, %v_param_79, 0, 118, 0.0520244f, 0.00557758f, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %151 = nn.bias_add(%150, %v_param_80, axis=3); %152 = qnn.conv2d(%149, %v_param_81, 0, 105, 0.0520244f, 0.00543337f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %153 = nn.bias_add(%152, %v_param_82, axis=3); %154 = qnn.requantize(%153, 0.000282668f, 0, 0.0368424f, 0, axis=3, out_dtype="uint8"); %155 = qnn.conv2d(%154, %v_param_83, 0, 85, 0.0368424f, 0.00295774f, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %156 = nn.bias_add(%155, %v_param_84, axis=3); %157 = qnn.conv2d(%149, %v_param_85, 0, 126, 0.0520244f, 0.00506661f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %158 = nn.bias_add(%157, %v_param_86, axis=3); %159 = qnn.requantize(%158, 0.000263587f, 0, 0.0576595f, 0, axis=3, out_dtype="uint8"); %160 = qnn.conv2d(%159, %v_param_87, 0, 81, 0.0576595f, 0.00359061f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %161 = nn.bias_add(%160, %v_param_88, axis=3); %162 = nn.max_pool2d(%149, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %163 = qnn.conv2d(%162, %v_param_89, 0, 94, 0.0520244f, 0.00317797f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %164 = nn.bias_add(%163, %v_param_90, axis=3); %165 = qnn.requantize(%151, 0.00029017f, 0, 0.0461338f, 0, axis=3, out_dtype="uint8"); %166 = qnn.requantize(%156, 0.00010897f, 0, 0.0384801f, 0, axis=3, out_dtype="uint8"); %167 = qnn.requantize(%161, 0.000207033f, 0, 0.0713473f, 0, axis=3, out_dtype="uint8"); %168 = qnn.requantize(%164, 0.000165332f, 0, 0.0265916f, 0, axis=3, out_dtype="uint8"); %169 = (%165, %166, %167, %168); %170 = (0.0461338f, 0.0384801f, 0.0713473f, 0.0265916f); %171 = (0, 0, 0, 0); %172 = qnn.concatenate(%169, %170, %171, 0.0713473f, 0, axis=3); %173 = nn.max_pool2d(%172, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0], layout="NHWC"); %174 = qnn.conv2d(%173, %v_param_91, 0, 182, 0.0713473f, 
0.0104061f, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %175 = nn.bias_add(%174, %v_param_92, axis=3); %176 = qnn.conv2d(%173, %v_param_93, 0, 115, 0.0713473f, 0.00596868f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %177 = nn.bias_add(%176, %v_param_94, axis=3); %178 = qnn.requantize(%177, 0.000425849f, 0, 0.0490709f, 0, axis=3, out_dtype="uint8"); %179 = qnn.conv2d(%178, %v_param_95, 0, 129, 0.0490709f, 0.00293286f, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %180 = nn.bias_add(%179, %v_param_96, axis=3); %181 = qnn.conv2d(%173, %v_param_97, 0, 122, 0.0713473f, 0.00383815f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %182 = nn.bias_add(%181, %v_param_98, axis=3); %183 = qnn.requantize(%182, 0.000273842f, 0, 0.0411565f, 0, axis=3, out_dtype="uint8"); %184 = qnn.conv2d(%183, %v_param_99, 0, 102, 0.0411565f, 0.002763f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %185 = nn.bias_add(%184, %v_param_100, axis=3); %186 = nn.max_pool2d(%173, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC"); %187 = qnn.conv2d(%186, %v_param_101, 0, 123, 0.0713473f, 0.00247852f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %188 = nn.bias_add(%187, %v_param_102, axis=3); %189 = qnn.requantize(%175, 0.000742446f, 0, 0.036752f, 0, axis=3, out_dtype="uint8"); %190 = qnn.requantize(%180, 0.000143918f, 0, 0.0450282f, 0, axis=3, out_dtype="uint8"); %191 = qnn.requantize(%185, 0.000113715f, 0, 0.0371453f, 0, axis=3, out_dtype="uint8"); %192 = qnn.requantize(%188, 0.000176836f, 0, 0.0213327f, 0, axis=3, out_dtype="uint8"); %193 = (%189, %190, %191, %192); %194 = (0.036752f, 0.0450282f, 0.0371453f, 0.0213327f); %195 = (0, 0, 0, 0); %196 = qnn.concatenate(%193, %194, %195, 0.0450282f, 0, axis=3); %197 = qnn.conv2d(%196, %v_param_103, 0, 104, 0.0450282f, 0.0143784f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %198 = nn.bias_add(%197, %v_param_104, axis=3); %199 = qnn.conv2d(%196, %v_param_105, 0, 81, 0.0450282f, 0.00580293f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %200 = nn.bias_add(%199, %v_param_106, axis=3); %201 = qnn.requantize(%200, 0.000261295f, 0, 0.0443907f, 0, axis=3, out_dtype="uint8"); %202 = qnn.conv2d(%201, %v_param_107, 0, 83, 0.0443907f, 0.00505402f, padding=[1, 1, 1, 1], channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %203 = nn.bias_add(%202, %v_param_108, axis=3); %204 = qnn.conv2d(%196, %v_param_109, 0, 87, 0.0450282f, 0.00578726f, padding=[0, 0, 0, 0], channels=48, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %205 = nn.bias_add(%204, %v_param_110, axis=3); %206 = qnn.requantize(%205, 0.00026059f, 0, 0.0431175f, 0, axis=3, out_dtype="uint8"); %207 = qnn.conv2d(%206, %v_param_111, 0, 74, 0.0431175f, 0.00680263f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %208 = nn.bias_add(%207, %v_param_112, axis=3); %209 = nn.max_pool2d(%196, pool_size=[3, 3], padding=[1, 1, 1, 
1], layout="NHWC"); %210 = qnn.conv2d(%209, %v_param_113, 0, 62, 0.0450282f, 0.0055094f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %211 = nn.bias_add(%210, %v_param_114, axis=3); %212 = qnn.requantize(%198, 0.000647432f, 0, 0.0470831f, 0, axis=3, out_dtype="uint8"); %213 = qnn.requantize(%203, 0.000224351f, 0, 0.0483342f, 0, axis=3, out_dtype="uint8"); %214 = qnn.requantize(%208, 0.000293312f, 0, 0.0535589f, 0, axis=3, out_dtype="uint8"); %215 = qnn.requantize(%211, 0.000248078f, 0, 0.0320987f, 0, axis=3, out_dtype="uint8"); %216 = (%212, %213, %214, %215); %217 = (0.0470831f, 0.0483342f, 0.0535589f, 0.0320987f); %218 = (0, 0, 0, 0); %219 = qnn.concatenate(%216, %217, %218, 0.0535589f, 0, axis=3); %220 = cast(%219, dtype="int32"); %221 = nn.avg_pool2d(%220, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC"); %222 = cast(%221, dtype="uint8"); %223 = qnn.conv2d(%222, %v_param_115, 0, 106, 0.0535589f, 0.00235748f, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %224 = nn.bias_add(%223, %v_param_116, axis=3); %225 = qnn.requantize(%224, 0.000126264f, 0, 0.0962827f, 60, axis=3, out_dtype="uint8"); %226 = reshape(%225, newshape=[-1, 1001]); %227 = qnn.dequantize(%226, 0.0962827f, 60); %228 = nn.softmax(%227, axis=1); qnn.quantize(%228, 0.00390625f, 0, out_dtype="uint8") } vsi_npu.py --> qnn.dequantize vsi_npu.py --> nn.softmax vsi_npu.py --> qnn.quantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.avg_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> nn.max_pool2d vsi_npu.py --> 
qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> reshape def @main(%input: Tensor[(1, 224, 224, 3), uint8]) -> Tensor[(1, 1001), uint8] { @tvmgen_default_vsi_npu_0(%input) /* ty=Tensor[(1, 1001), uint8] */ } def @tvmgen_default_vsi_npu_0(%vsi_npu_0_i0: Tensor[(1, 224, 224, 3), uint8], Inline=1, Compiler="vsi_npu", global_symbol="tvmgen_default_vsi_npu_0", Primitive=1) -> Tensor[(1, 1001), uint8] { %30 = fn (%FunctionVar_57_0: Tensor[(1, 224, 224, 3), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 64), uint8] { %28 = qnn.conv2d(%FunctionVar_57_0, meta[relay.Constant][24] /* ty=Tensor[(7, 7, 3, 64), uint8] */, 128 /* ty=int32 */, 141 /* ty=int32 */, 0.0078125f /* ty=float32 */, 0.0243229f /* ty=float32 */, strides=[2, 2], padding=[2, 2, 3, 3], channels=64, kernel_size=[7, 7], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 64), int32] */; %29 = nn.bias_add(%28, meta[relay.Constant][25] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 64), int32] */; qnn.requantize(%29, 0.000190023f /* ty=float32 */, 0 /* ty=int32 */, 0.107703f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 64), uint8] */ }; %31 = %30(%vsi_npu_0_i0) /* ty=Tensor[(1, 112, 112, 64), uint8] */; %32 = nn.max_pool2d(%31, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC") /* ty=Tensor[(1, 56, 56, 64), uint8] */; %33 = fn (%FunctionVar_56_0: Tensor[(1, 56, 56, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 64), uint8] { %26 = qnn.conv2d(%FunctionVar_56_0, meta[relay.Constant][22] /* ty=Tensor[(1, 1, 64, 64), uint8] */, 0 /* ty=int32 */, 134 /* ty=int32 */, 0.107703f /* ty=float32 */, 0.0171319f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 64), int32] */; %27 = nn.bias_add(%26, meta[relay.Constant][23] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 64), int32] */; qnn.requantize(%27, 0.00184516f /* ty=float32 */, 0 /* ty=int32 */, 0.053206f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 64), uint8] */ }; %34 = %33(%32) /* ty=Tensor[(1, 56, 56, 64), uint8] */; %35 = fn (%FunctionVar_55_0: Tensor[(1, 56, 56, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 192), uint8] { %24 = qnn.conv2d(%FunctionVar_55_0, meta[relay.Constant][20] /* ty=Tensor[(3, 3, 64, 192), uint8] */, 0 /* ty=int32 */, 137 /* ty=int32 */, 0.053206f /* ty=float32 */, 0.00701139f /* ty=float32 */, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 192), int32] */; %25 = nn.bias_add(%24, meta[relay.Constant][21] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 192), int32] */; qnn.requantize(%25, 0.000373048f /* ty=float32 */, 0 /* ty=int32 */, 0.044983f /* ty=float32 */, 
0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 192), uint8] */ }; %36 = %35(%34) /* ty=Tensor[(1, 56, 56, 192), uint8] */; %37 = nn.max_pool2d(%36, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC") /* ty=Tensor[(1, 28, 28, 192), uint8] */; %38 = fn (%FunctionVar_54_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 64), uint8] { %22 = qnn.conv2d(%FunctionVar_54_0, meta[relay.Constant][18] /* ty=Tensor[(1, 1, 192, 64), uint8] */, 0 /* ty=int32 */, 106 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.00639617f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 64), int32] */; %23 = nn.bias_add(%22, meta[relay.Constant][19] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 64), int32] */; qnn.requantize(%23, 0.000287719f /* ty=float32 */, 0 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 64), uint8] */ }; %43 = fn (%FunctionVar_53_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 96), uint8] { %41 = qnn.conv2d(%FunctionVar_53_0, meta[relay.Constant][28] /* ty=Tensor[(1, 1, 192, 96), uint8] */, 0 /* ty=int32 */, 174 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.0074075f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 96), int32] */; %42 = nn.bias_add(%41, meta[relay.Constant][29] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 96), int32] */; qnn.requantize(%42, 0.000333212f /* ty=float32 */, 0 /* ty=int32 */, 0.0381216f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 96), uint8] */ }; %44 = %43(%37) /* ty=Tensor[(1, 28, 28, 96), uint8] */; %45 = fn (%FunctionVar_52_0: Tensor[(1, 28, 28, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 128), uint8] { %39 = qnn.conv2d(%FunctionVar_52_0, meta[relay.Constant][26] /* ty=Tensor[(3, 3, 96, 128), uint8] */, 0 /* ty=int32 */, 97 /* ty=int32 */, 0.0381216f /* ty=float32 */, 0.00448481f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 128), int32] */; %40 = nn.bias_add(%39, meta[relay.Constant][27] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 128), int32] */; qnn.requantize(%40, 0.000170968f /* ty=float32 */, 0 /* ty=int32 */, 0.034202f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 128), uint8] */ }; %50 = fn (%FunctionVar_51_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 16), uint8] { %48 = qnn.conv2d(%FunctionVar_51_0, meta[relay.Constant][32] /* ty=Tensor[(1, 1, 192, 16), uint8] */, 0 /* ty=int32 */, 90 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.00434916f /* ty=float32 */, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 16), int32] */; %49 = nn.bias_add(%48, meta[relay.Constant][33] /* ty=Tensor[(16), int32] 
*/, axis=3) /* ty=Tensor[(1, 28, 28, 16), int32] */; qnn.requantize(%49, 0.000195639f /* ty=float32 */, 0 /* ty=int32 */, 0.0304856f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 16), uint8] */ }; %51 = %50(%37) /* ty=Tensor[(1, 28, 28, 16), uint8] */; %52 = fn (%FunctionVar_50_0: Tensor[(1, 28, 28, 16), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] { %46 = qnn.conv2d(%FunctionVar_50_0, meta[relay.Constant][30] /* ty=Tensor[(3, 3, 16, 32), uint8] */, 0 /* ty=int32 */, 77 /* ty=int32 */, 0.0304856f /* ty=float32 */, 0.0113698f /* ty=float32 */, padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */; %47 = nn.bias_add(%46, meta[relay.Constant][31] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */; qnn.requantize(%47, 0.000346614f /* ty=float32 */, 0 /* ty=int32 */, 0.0420845f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */ }; %55 = nn.max_pool2d(%37, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 28, 28, 192), uint8] */; %56 = fn (%FunctionVar_49_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] { %53 = qnn.conv2d(%FunctionVar_49_0, meta[relay.Constant][34] /* ty=Tensor[(1, 1, 192, 32), uint8] */, 0 /* ty=int32 */, 149 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.00737061f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */; %54 = nn.bias_add(%53, meta[relay.Constant][35] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */; qnn.requantize(%54, 0.000331553f /* ty=float32 */, 0 /* ty=int32 */, 0.02516f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */ }; %57 = %38(%37) /* ty=Tensor[(1, 28, 28, 64), uint8] */; %58 = %45(%44) /* ty=Tensor[(1, 28, 28, 128), uint8] */; %59 = %52(%51) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %60 = %56(%55) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %61 = (%57, %58, %59, %60); %62 = (0.0475482f /* ty=float32 */, 0.034202f /* ty=float32 */, 0.0420845f /* ty=float32 */, 0.02516f /* ty=float32 */); %63 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %64 = qnn.concatenate(%61, %62, %63, 0.0475482f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 28, 28, 256), uint8] */; %65 = fn (%FunctionVar_48_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 128), uint8] { %20 = qnn.conv2d(%FunctionVar_48_0, meta[relay.Constant][16] /* ty=Tensor[(1, 1, 256, 128), uint8] */, 0 /* ty=int32 */, 135 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.0064377f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 128), int32] */; %21 = nn.bias_add(%20, meta[relay.Constant][17] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 128), int32] */; qnn.requantize(%21, 0.000306101f /* ty=float32 */, 0 /* ty=int32 */, 0.034585f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* 
ty=Tensor[(1, 28, 28, 128), uint8] */ }; %70 = fn (%FunctionVar_47_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 128), uint8] { %68 = qnn.conv2d(%FunctionVar_47_0, meta[relay.Constant][38] /* ty=Tensor[(1, 1, 256, 128), uint8] */, 0 /* ty=int32 */, 133 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.00539997f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 128), int32] */; %69 = nn.bias_add(%68, meta[relay.Constant][39] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 128), int32] */; qnn.requantize(%69, 0.000256759f /* ty=float32 */, 0 /* ty=int32 */, 0.0317389f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 128), uint8] */ }; %71 = %70(%64) /* ty=Tensor[(1, 28, 28, 128), uint8] */; %72 = fn (%FunctionVar_46_0: Tensor[(1, 28, 28, 128), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] { %66 = qnn.conv2d(%FunctionVar_46_0, meta[relay.Constant][36] /* ty=Tensor[(3, 3, 128, 192), uint8] */, 0 /* ty=int32 */, 94 /* ty=int32 */, 0.0317389f /* ty=float32 */, 0.00359896f /* ty=float32 */, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */; %67 = nn.bias_add(%66, meta[relay.Constant][37] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */; qnn.requantize(%67, 0.000114227f /* ty=float32 */, 0 /* ty=int32 */, 0.0316799f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */ }; %77 = fn (%FunctionVar_45_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] { %75 = qnn.conv2d(%FunctionVar_45_0, meta[relay.Constant][42] /* ty=Tensor[(1, 1, 256, 32), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.00531897f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */; %76 = nn.bias_add(%75, meta[relay.Constant][43] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */; qnn.requantize(%76, 0.000252907f /* ty=float32 */, 0 /* ty=int32 */, 0.034475f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */ }; %78 = %77(%64) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %79 = fn (%FunctionVar_44_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 96), uint8] { %73 = qnn.conv2d(%FunctionVar_44_0, meta[relay.Constant][40] /* ty=Tensor[(3, 3, 32, 96), uint8] */, 0 /* ty=int32 */, 121 /* ty=int32 */, 0.034475f /* ty=float32 */, 0.00415084f /* ty=float32 */, padding=[1, 1, 1, 1], channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 96), int32] */; %74 = nn.bias_add(%73, meta[relay.Constant][41] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 96), int32] */; qnn.requantize(%74, 0.0001431f /* ty=float32 */, 0 /* ty=int32 */, 0.0277635f /* ty=float32 */, 0 /* ty=int32 */, axis=3, 
out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 96), uint8] */ }; %82 = nn.max_pool2d(%64, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 28, 28, 256), uint8] */; %83 = fn (%FunctionVar_43_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 64), uint8] { %80 = qnn.conv2d(%FunctionVar_43_0, meta[relay.Constant][44] /* ty=Tensor[(1, 1, 256, 64), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.00529972f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 64), int32] */; %81 = nn.bias_add(%80, meta[relay.Constant][45] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 64), int32] */; qnn.requantize(%81, 0.000251992f /* ty=float32 */, 0 /* ty=int32 */, 0.0281896f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 64), uint8] */ }; %84 = %65(%64) /* ty=Tensor[(1, 28, 28, 128), uint8] */; %85 = %72(%71) /* ty=Tensor[(1, 28, 28, 192), uint8] */; %86 = %79(%78) /* ty=Tensor[(1, 28, 28, 96), uint8] */; %87 = %83(%82) /* ty=Tensor[(1, 28, 28, 64), uint8] */; %88 = (%84, %85, %86, %87); %89 = (0.034585f /* ty=float32 */, 0.0316799f /* ty=float32 */, 0.0277635f /* ty=float32 */, 0.0281896f /* ty=float32 */); %90 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %91 = qnn.concatenate(%88, %89, %90, 0.034585f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 28, 28, 480), uint8] */; %92 = nn.max_pool2d(%91, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 480), uint8] */; %93 = fn (%FunctionVar_42_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 192), uint8] { %18 = qnn.conv2d(%FunctionVar_42_0, meta[relay.Constant][14] /* ty=Tensor[(1, 1, 480, 192), uint8] */, 0 /* ty=int32 */, 104 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.00488506f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 192), int32] */; %19 = nn.bias_add(%18, meta[relay.Constant][15] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 192), int32] */; qnn.requantize(%19, 0.00016895f /* ty=float32 */, 0 /* ty=int32 */, 0.0350619f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 192), uint8] */ }; %98 = fn (%FunctionVar_41_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] { %96 = qnn.conv2d(%FunctionVar_41_0, meta[relay.Constant][48] /* ty=Tensor[(1, 1, 480, 96), uint8] */, 0 /* ty=int32 */, 69 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.00521668f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */; %97 = nn.bias_add(%96, meta[relay.Constant][49] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */; qnn.requantize(%97, 0.000180419f /* ty=float32 */, 0 /* ty=int32 */, 0.0407384f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */ }; %99 = %98(%92) /* ty=Tensor[(1, 14, 14, 96), 
uint8] */; %100 = fn (%FunctionVar_40_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 208), uint8] { %94 = qnn.conv2d(%FunctionVar_40_0, meta[relay.Constant][46] /* ty=Tensor[(3, 3, 96, 208), uint8] */, 0 /* ty=int32 */, 80 /* ty=int32 */, 0.0407384f /* ty=float32 */, 0.00412294f /* ty=float32 */, padding=[1, 1, 1, 1], channels=208, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 208), int32] */; %95 = nn.bias_add(%94, meta[relay.Constant][47] /* ty=Tensor[(208), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 208), int32] */; qnn.requantize(%95, 0.000167962f /* ty=float32 */, 0 /* ty=int32 */, 0.038577f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 208), uint8] */ }; %105 = fn (%FunctionVar_39_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 16), uint8] { %103 = qnn.conv2d(%FunctionVar_39_0, meta[relay.Constant][52] /* ty=Tensor[(1, 1, 480, 16), uint8] */, 0 /* ty=int32 */, 159 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.00324746f /* ty=float32 */, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 16), int32] */; %104 = nn.bias_add(%103, meta[relay.Constant][53] /* ty=Tensor[(16), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 16), int32] */; qnn.requantize(%104, 0.000112313f /* ty=float32 */, 0 /* ty=int32 */, 0.029503f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 16), uint8] */ }; %106 = %105(%92) /* ty=Tensor[(1, 14, 14, 16), uint8] */; %107 = fn (%FunctionVar_38_0: Tensor[(1, 14, 14, 16), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 48), uint8] { %101 = qnn.conv2d(%FunctionVar_38_0, meta[relay.Constant][50] /* ty=Tensor[(3, 3, 16, 48), uint8] */, 0 /* ty=int32 */, 88 /* ty=int32 */, 0.029503f /* ty=float32 */, 0.00959363f /* ty=float32 */, padding=[1, 1, 1, 1], channels=48, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 48), int32] */; %102 = nn.bias_add(%101, meta[relay.Constant][51] /* ty=Tensor[(48), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 48), int32] */; qnn.requantize(%102, 0.000283041f /* ty=float32 */, 0 /* ty=int32 */, 0.0261499f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 48), uint8] */ }; %110 = nn.max_pool2d(%92, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 480), uint8] */; %111 = fn (%FunctionVar_37_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %108 = qnn.conv2d(%FunctionVar_37_0, meta[relay.Constant][54] /* ty=Tensor[(1, 1, 480, 64), uint8] */, 0 /* ty=int32 */, 123 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.0063726f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %109 = nn.bias_add(%108, meta[relay.Constant][55] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%109, 0.000220396f /* ty=float32 */, 0 /* ty=int32 */, 0.0227659f 
/* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %112 = %93(%92) /* ty=Tensor[(1, 14, 14, 192), uint8] */; %113 = %100(%99) /* ty=Tensor[(1, 14, 14, 208), uint8] */; %114 = %107(%106) /* ty=Tensor[(1, 14, 14, 48), uint8] */; %115 = %111(%110) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %116 = (%112, %113, %114, %115); %117 = (0.0350619f /* ty=float32 */, 0.038577f /* ty=float32 */, 0.0261499f /* ty=float32 */, 0.0227659f /* ty=float32 */); %118 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %119 = qnn.concatenate(%116, %117, %118, 0.038577f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 512), uint8] */; %120 = fn (%FunctionVar_36_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 160), uint8] { %16 = qnn.conv2d(%FunctionVar_36_0, meta[relay.Constant][12] /* ty=Tensor[(1, 1, 512, 160), uint8] */, 0 /* ty=int32 */, 131 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00565282f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 160), int32] */; %17 = nn.bias_add(%16, meta[relay.Constant][13] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 160), int32] */; qnn.requantize(%17, 0.000218069f /* ty=float32 */, 0 /* ty=int32 */, 0.0384053f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 160), uint8] */ }; %125 = fn (%FunctionVar_35_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 112), uint8] { %123 = qnn.conv2d(%FunctionVar_35_0, meta[relay.Constant][58] /* ty=Tensor[(1, 1, 512, 112), uint8] */, 0 /* ty=int32 */, 111 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00606403f /* ty=float32 */, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 112), int32] */; %124 = nn.bias_add(%123, meta[relay.Constant][59] /* ty=Tensor[(112), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 112), int32] */; qnn.requantize(%124, 0.000233932f /* ty=float32 */, 0 /* ty=int32 */, 0.0390984f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 112), uint8] */ }; %126 = %125(%119) /* ty=Tensor[(1, 14, 14, 112), uint8] */; %127 = fn (%FunctionVar_34_0: Tensor[(1, 14, 14, 112), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 224), uint8] { %121 = qnn.conv2d(%FunctionVar_34_0, meta[relay.Constant][56] /* ty=Tensor[(3, 3, 112, 224), uint8] */, 0 /* ty=int32 */, 77 /* ty=int32 */, 0.0390984f /* ty=float32 */, 0.00476621f /* ty=float32 */, padding=[1, 1, 1, 1], channels=224, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 224), int32] */; %122 = nn.bias_add(%121, meta[relay.Constant][57] /* ty=Tensor[(224), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 224), int32] */; qnn.requantize(%122, 0.000186351f /* ty=float32 */, 0 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 224), uint8] */ }; %132 = fn (%FunctionVar_33_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", 
Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 24), uint8] { %130 = qnn.conv2d(%FunctionVar_33_0, meta[relay.Constant][62] /* ty=Tensor[(1, 1, 512, 24), uint8] */, 0 /* ty=int32 */, 127 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00466451f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 24), int32] */; %131 = nn.bias_add(%130, meta[relay.Constant][63] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 24), int32] */; qnn.requantize(%131, 0.000179943f /* ty=float32 */, 0 /* ty=int32 */, 0.0326719f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 24), uint8] */ }; %133 = %132(%119) /* ty=Tensor[(1, 14, 14, 24), uint8] */; %134 = fn (%FunctionVar_32_0: Tensor[(1, 14, 14, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %128 = qnn.conv2d(%FunctionVar_32_0, meta[relay.Constant][60] /* ty=Tensor[(3, 3, 24, 64), uint8] */, 0 /* ty=int32 */, 105 /* ty=int32 */, 0.0326719f /* ty=float32 */, 0.00475245f /* ty=float32 */, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %129 = nn.bias_add(%128, meta[relay.Constant][61] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%129, 0.000155272f /* ty=float32 */, 0 /* ty=int32 */, 0.0353133f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %137 = nn.max_pool2d(%119, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 512), uint8] */; %138 = fn (%FunctionVar_31_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %135 = qnn.conv2d(%FunctionVar_31_0, meta[relay.Constant][64] /* ty=Tensor[(1, 1, 512, 64), uint8] */, 0 /* ty=int32 */, 128 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00292699f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %136 = nn.bias_add(%135, meta[relay.Constant][65] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%136, 0.000112914f /* ty=float32 */, 0 /* ty=int32 */, 0.0217496f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %139 = %120(%119) /* ty=Tensor[(1, 14, 14, 160), uint8] */; %140 = %127(%126) /* ty=Tensor[(1, 14, 14, 224), uint8] */; %141 = %134(%133) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %142 = %138(%137) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %143 = (%139, %140, %141, %142); %144 = (0.0384053f /* ty=float32 */, 0.0415277f /* ty=float32 */, 0.0353133f /* ty=float32 */, 0.0217496f /* ty=float32 */); %145 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %146 = qnn.concatenate(%143, %144, %145, 0.0415277f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 512), uint8] */; %147 = fn (%FunctionVar_30_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] { %14 = qnn.conv2d(%FunctionVar_30_0, meta[relay.Constant][10] /* ty=Tensor[(1, 1, 
512, 128), uint8] */, 0 /* ty=int32 */, 143 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.00513341f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */; %15 = nn.bias_add(%14, meta[relay.Constant][11] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */; qnn.requantize(%15, 0.000213179f /* ty=float32 */, 0 /* ty=int32 */, 0.0363159f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */ }; %152 = fn (%FunctionVar_29_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] { %150 = qnn.conv2d(%FunctionVar_29_0, meta[relay.Constant][68] /* ty=Tensor[(1, 1, 512, 128), uint8] */, 0 /* ty=int32 */, 125 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.0056437f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */; %151 = nn.bias_add(%150, meta[relay.Constant][69] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */; qnn.requantize(%151, 0.00023437f /* ty=float32 */, 0 /* ty=int32 */, 0.0444829f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */ }; %153 = %152(%146) /* ty=Tensor[(1, 14, 14, 128), uint8] */; %154 = fn (%FunctionVar_28_0: Tensor[(1, 14, 14, 128), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 256), uint8] { %148 = qnn.conv2d(%FunctionVar_28_0, meta[relay.Constant][66] /* ty=Tensor[(3, 3, 128, 256), uint8] */, 0 /* ty=int32 */, 104 /* ty=int32 */, 0.0444829f /* ty=float32 */, 0.00298305f /* ty=float32 */, padding=[1, 1, 1, 1], channels=256, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 256), int32] */; %149 = nn.bias_add(%148, meta[relay.Constant][67] /* ty=Tensor[(256), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 256), int32] */; qnn.requantize(%149, 0.000132695f /* ty=float32 */, 0 /* ty=int32 */, 0.040194f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 256), uint8] */ }; %159 = fn (%FunctionVar_27_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 24), uint8] { %157 = qnn.conv2d(%FunctionVar_27_0, meta[relay.Constant][72] /* ty=Tensor[(1, 1, 512, 24), uint8] */, 0 /* ty=int32 */, 96 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.00617409f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 24), int32] */; %158 = nn.bias_add(%157, meta[relay.Constant][73] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 24), int32] */; qnn.requantize(%158, 0.000256396f /* ty=float32 */, 0 /* ty=int32 */, 0.0382293f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 24), uint8] */ }; %160 = %159(%146) /* ty=Tensor[(1, 14, 14, 24), uint8] */; %161 = fn (%FunctionVar_26_0: Tensor[(1, 14, 14, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %155 = 
qnn.conv2d(%FunctionVar_26_0, meta[relay.Constant][70] /* ty=Tensor[(3, 3, 24, 64), uint8] */, 0 /* ty=int32 */, 90 /* ty=int32 */, 0.0382293f /* ty=float32 */, 0.00926049f /* ty=float32 */, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %156 = nn.bias_add(%155, meta[relay.Constant][71] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%156, 0.000354022f /* ty=float32 */, 0 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %164 = nn.max_pool2d(%146, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 512), uint8] */; %165 = fn (%FunctionVar_25_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %162 = qnn.conv2d(%FunctionVar_25_0, meta[relay.Constant][74] /* ty=Tensor[(1, 1, 512, 64), uint8] */, 0 /* ty=int32 */, 133 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.00348826f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %163 = nn.bias_add(%162, meta[relay.Constant][75] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%163, 0.00014486f /* ty=float32 */, 0 /* ty=int32 */, 0.0225817f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %166 = %147(%146) /* ty=Tensor[(1, 14, 14, 128), uint8] */; %167 = %154(%153) /* ty=Tensor[(1, 14, 14, 256), uint8] */; %168 = %161(%160) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %169 = %165(%164) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %170 = (%166, %167, %168, %169); %171 = (0.0363159f /* ty=float32 */, 0.040194f /* ty=float32 */, 0.0679776f /* ty=float32 */, 0.0225817f /* ty=float32 */); %172 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %173 = qnn.concatenate(%170, %171, %172, 0.0679776f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 512), uint8] */; %174 = fn (%FunctionVar_24_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 112), uint8] { %12 = qnn.conv2d(%FunctionVar_24_0, meta[relay.Constant][8] /* ty=Tensor[(1, 1, 512, 112), uint8] */, 0 /* ty=int32 */, 131 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00541721f /* ty=float32 */, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 112), int32] */; %13 = nn.bias_add(%12, meta[relay.Constant][9] /* ty=Tensor[(112), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 112), int32] */; qnn.requantize(%13, 0.000368249f /* ty=float32 */, 0 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 112), uint8] */ }; %179 = fn (%FunctionVar_23_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 144), uint8] { %177 = qnn.conv2d(%FunctionVar_23_0, meta[relay.Constant][78] /* ty=Tensor[(1, 1, 512, 144), uint8] */, 0 /* ty=int32 */, 102 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00529131f /* ty=float32 */, padding=[0, 0, 
0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 144), int32] */; %178 = nn.bias_add(%177, meta[relay.Constant][79] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 144), int32] */; qnn.requantize(%178, 0.000359691f /* ty=float32 */, 0 /* ty=int32 */, 0.0464631f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 144), uint8] */ }; %180 = %179(%173) /* ty=Tensor[(1, 14, 14, 144), uint8] */; %181 = fn (%FunctionVar_22_0: Tensor[(1, 14, 14, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 288), uint8] { %175 = qnn.conv2d(%FunctionVar_22_0, meta[relay.Constant][76] /* ty=Tensor[(3, 3, 144, 288), uint8] */, 0 /* ty=int32 */, 121 /* ty=int32 */, 0.0464631f /* ty=float32 */, 0.00281512f /* ty=float32 */, padding=[1, 1, 1, 1], channels=288, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 288), int32] */; %176 = nn.bias_add(%175, meta[relay.Constant][77] /* ty=Tensor[(288), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 288), int32] */; qnn.requantize(%176, 0.000130799f /* ty=float32 */, 0 /* ty=int32 */, 0.0511231f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 288), uint8] */ }; %186 = fn (%FunctionVar_21_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 32), uint8] { %184 = qnn.conv2d(%FunctionVar_21_0, meta[relay.Constant][82] /* ty=Tensor[(1, 1, 512, 32), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00454161f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 32), int32] */; %185 = nn.bias_add(%184, meta[relay.Constant][83] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 32), int32] */; qnn.requantize(%185, 0.000308728f /* ty=float32 */, 0 /* ty=int32 */, 0.0439514f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 32), uint8] */ }; %187 = %186(%173) /* ty=Tensor[(1, 14, 14, 32), uint8] */; %188 = fn (%FunctionVar_20_0: Tensor[(1, 14, 14, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %182 = qnn.conv2d(%FunctionVar_20_0, meta[relay.Constant][80] /* ty=Tensor[(3, 3, 32, 64), uint8] */, 0 /* ty=int32 */, 92 /* ty=int32 */, 0.0439514f /* ty=float32 */, 0.00496321f /* ty=float32 */, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %183 = nn.bias_add(%182, meta[relay.Constant][81] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%183, 0.00021814f /* ty=float32 */, 0 /* ty=int32 */, 0.0310861f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %191 = nn.max_pool2d(%173, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 512), uint8] */; %192 = fn (%FunctionVar_19_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %189 = qnn.conv2d(%FunctionVar_19_0, 
meta[relay.Constant][84] /* ty=Tensor[(1, 1, 512, 64), uint8] */, 0 /* ty=int32 */, 124 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00317437f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %190 = nn.bias_add(%189, meta[relay.Constant][85] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%190, 0.000215786f /* ty=float32 */, 0 /* ty=int32 */, 0.024479f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %193 = %174(%173) /* ty=Tensor[(1, 14, 14, 112), uint8] */; %194 = %181(%180) /* ty=Tensor[(1, 14, 14, 288), uint8] */; %195 = %188(%187) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %196 = %192(%191) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %197 = (%193, %194, %195, %196); %198 = (0.0520244f /* ty=float32 */, 0.0511231f /* ty=float32 */, 0.0310861f /* ty=float32 */, 0.024479f /* ty=float32 */); %199 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %200 = qnn.concatenate(%197, %198, %199, 0.0520244f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 528), uint8] */; %201 = fn (%FunctionVar_18_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 256), uint8] { %10 = qnn.conv2d(%FunctionVar_18_0, meta[relay.Constant][6] /* ty=Tensor[(1, 1, 528, 256), uint8] */, 0 /* ty=int32 */, 118 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00557758f /* ty=float32 */, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 256), int32] */; %11 = nn.bias_add(%10, meta[relay.Constant][7] /* ty=Tensor[(256), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 256), int32] */; qnn.requantize(%11, 0.00029017f /* ty=float32 */, 0 /* ty=int32 */, 0.0461338f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 256), uint8] */ }; %206 = fn (%FunctionVar_17_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 160), uint8] { %204 = qnn.conv2d(%FunctionVar_17_0, meta[relay.Constant][88] /* ty=Tensor[(1, 1, 528, 160), uint8] */, 0 /* ty=int32 */, 105 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00543337f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 160), int32] */; %205 = nn.bias_add(%204, meta[relay.Constant][89] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 160), int32] */; qnn.requantize(%205, 0.000282668f /* ty=float32 */, 0 /* ty=int32 */, 0.0368424f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 160), uint8] */ }; %207 = %206(%200) /* ty=Tensor[(1, 14, 14, 160), uint8] */; %208 = fn (%FunctionVar_16_0: Tensor[(1, 14, 14, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 320), uint8] { %202 = qnn.conv2d(%FunctionVar_16_0, meta[relay.Constant][86] /* ty=Tensor[(3, 3, 160, 320), uint8] */, 0 /* ty=int32 */, 85 /* ty=int32 */, 0.0368424f /* ty=float32 */, 0.00295774f /* ty=float32 */, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", 
out_dtype="int32") /* ty=Tensor[(1, 14, 14, 320), int32] */; %203 = nn.bias_add(%202, meta[relay.Constant][87] /* ty=Tensor[(320), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 320), int32] */; qnn.requantize(%203, 0.00010897f /* ty=float32 */, 0 /* ty=int32 */, 0.0384801f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 320), uint8] */ }; %213 = fn (%FunctionVar_15_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 32), uint8] { %211 = qnn.conv2d(%FunctionVar_15_0, meta[relay.Constant][92] /* ty=Tensor[(1, 1, 528, 32), uint8] */, 0 /* ty=int32 */, 126 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00506661f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 32), int32] */; %212 = nn.bias_add(%211, meta[relay.Constant][93] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 32), int32] */; qnn.requantize(%212, 0.000263587f /* ty=float32 */, 0 /* ty=int32 */, 0.0576595f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 32), uint8] */ }; %214 = %213(%200) /* ty=Tensor[(1, 14, 14, 32), uint8] */; %215 = fn (%FunctionVar_14_0: Tensor[(1, 14, 14, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] { %209 = qnn.conv2d(%FunctionVar_14_0, meta[relay.Constant][90] /* ty=Tensor[(3, 3, 32, 128), uint8] */, 0 /* ty=int32 */, 81 /* ty=int32 */, 0.0576595f /* ty=float32 */, 0.00359061f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */; %210 = nn.bias_add(%209, meta[relay.Constant][91] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */; qnn.requantize(%210, 0.000207033f /* ty=float32 */, 0 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */ }; %218 = nn.max_pool2d(%200, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 528), uint8] */; %219 = fn (%FunctionVar_13_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] { %216 = qnn.conv2d(%FunctionVar_13_0, meta[relay.Constant][94] /* ty=Tensor[(1, 1, 528, 128), uint8] */, 0 /* ty=int32 */, 94 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00317797f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */; %217 = nn.bias_add(%216, meta[relay.Constant][95] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */; qnn.requantize(%217, 0.000165332f /* ty=float32 */, 0 /* ty=int32 */, 0.0265916f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */ }; %220 = %201(%200) /* ty=Tensor[(1, 14, 14, 256), uint8] */; %221 = %208(%207) /* ty=Tensor[(1, 14, 14, 320), uint8] */; %222 = %215(%214) /* ty=Tensor[(1, 14, 14, 128), uint8] */; %223 = %219(%218) /* ty=Tensor[(1, 14, 14, 128), uint8] */; %224 = (%220, %221, %222, %223); %225 = (0.0461338f /* ty=float32 */, 0.0384801f /* ty=float32 */, 0.0713473f /* ty=float32 */, 
0.0265916f /* ty=float32 */); %226 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %227 = qnn.concatenate(%224, %225, %226, 0.0713473f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 832), uint8] */; %228 = nn.max_pool2d(%227, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0], layout="NHWC") /* ty=Tensor[(1, 7, 7, 832), uint8] */; %229 = fn (%FunctionVar_12_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 256), uint8] { %8 = qnn.conv2d(%FunctionVar_12_0, meta[relay.Constant][4] /* ty=Tensor[(1, 1, 832, 256), uint8] */, 0 /* ty=int32 */, 182 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.0104061f /* ty=float32 */, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 256), int32] */; %9 = nn.bias_add(%8, meta[relay.Constant][5] /* ty=Tensor[(256), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 256), int32] */; qnn.requantize(%9, 0.000742446f /* ty=float32 */, 0 /* ty=int32 */, 0.036752f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 256), uint8] */ }; %234 = fn (%FunctionVar_11_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] { %232 = qnn.conv2d(%FunctionVar_11_0, meta[relay.Constant][98] /* ty=Tensor[(1, 1, 832, 160), uint8] */, 0 /* ty=int32 */, 115 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.00596868f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */; %233 = nn.bias_add(%232, meta[relay.Constant][99] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */; qnn.requantize(%233, 0.000425849f /* ty=float32 */, 0 /* ty=int32 */, 0.0490709f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */ }; %235 = %234(%228) /* ty=Tensor[(1, 7, 7, 160), uint8] */; %236 = fn (%FunctionVar_10_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 320), uint8] { %230 = qnn.conv2d(%FunctionVar_10_0, meta[relay.Constant][96] /* ty=Tensor[(3, 3, 160, 320), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0490709f /* ty=float32 */, 0.00293286f /* ty=float32 */, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 320), int32] */; %231 = nn.bias_add(%230, meta[relay.Constant][97] /* ty=Tensor[(320), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 320), int32] */; qnn.requantize(%231, 0.000143918f /* ty=float32 */, 0 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 320), uint8] */ }; %241 = fn (%FunctionVar_9_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 32), uint8] { %239 = qnn.conv2d(%FunctionVar_9_0, meta[relay.Constant][102] /* ty=Tensor[(1, 1, 832, 32), uint8] */, 0 /* ty=int32 */, 122 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.00383815f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 
7, 7, 32), int32] */; %240 = nn.bias_add(%239, meta[relay.Constant][103] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 32), int32] */; qnn.requantize(%240, 0.000273842f /* ty=float32 */, 0 /* ty=int32 */, 0.0411565f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 32), uint8] */ }; %242 = %241(%228) /* ty=Tensor[(1, 7, 7, 32), uint8] */; %243 = fn (%FunctionVar_8_0: Tensor[(1, 7, 7, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] { %237 = qnn.conv2d(%FunctionVar_8_0, meta[relay.Constant][100] /* ty=Tensor[(3, 3, 32, 128), uint8] */, 0 /* ty=int32 */, 102 /* ty=int32 */, 0.0411565f /* ty=float32 */, 0.002763f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */; %238 = nn.bias_add(%237, meta[relay.Constant][101] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */; qnn.requantize(%238, 0.000113715f /* ty=float32 */, 0 /* ty=int32 */, 0.0371453f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */ }; %246 = nn.max_pool2d(%228, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 7, 7, 832), uint8] */; %247 = fn (%FunctionVar_7_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] { %244 = qnn.conv2d(%FunctionVar_7_0, meta[relay.Constant][104] /* ty=Tensor[(1, 1, 832, 128), uint8] */, 0 /* ty=int32 */, 123 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.00247852f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */; %245 = nn.bias_add(%244, meta[relay.Constant][105] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */; qnn.requantize(%245, 0.000176836f /* ty=float32 */, 0 /* ty=int32 */, 0.0213327f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */ }; %248 = %229(%228) /* ty=Tensor[(1, 7, 7, 256), uint8] */; %249 = %236(%235) /* ty=Tensor[(1, 7, 7, 320), uint8] */; %250 = %243(%242) /* ty=Tensor[(1, 7, 7, 128), uint8] */; %251 = %247(%246) /* ty=Tensor[(1, 7, 7, 128), uint8] */; %252 = (%248, %249, %250, %251); %253 = (0.036752f /* ty=float32 */, 0.0450282f /* ty=float32 */, 0.0371453f /* ty=float32 */, 0.0213327f /* ty=float32 */); %254 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %255 = qnn.concatenate(%252, %253, %254, 0.0450282f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 7, 7, 832), uint8] */; %256 = fn (%FunctionVar_6_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 384), uint8] { %6 = qnn.conv2d(%FunctionVar_6_0, meta[relay.Constant][2] /* ty=Tensor[(1, 1, 832, 384), uint8] */, 0 /* ty=int32 */, 104 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.0143784f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 384), int32] */; %7 = nn.bias_add(%6, meta[relay.Constant][3] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 384), int32] */; qnn.requantize(%7, 0.000647432f /* 
ty=float32 */, 0 /* ty=int32 */, 0.0470831f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 384), uint8] */ }; %261 = fn (%FunctionVar_5_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 192), uint8] { %259 = qnn.conv2d(%FunctionVar_5_0, meta[relay.Constant][108] /* ty=Tensor[(1, 1, 832, 192), uint8] */, 0 /* ty=int32 */, 81 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.00580293f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 192), int32] */; %260 = nn.bias_add(%259, meta[relay.Constant][109] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 192), int32] */; qnn.requantize(%260, 0.000261295f /* ty=float32 */, 0 /* ty=int32 */, 0.0443907f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 192), uint8] */ }; %262 = %261(%255) /* ty=Tensor[(1, 7, 7, 192), uint8] */; %263 = fn (%FunctionVar_4_0: Tensor[(1, 7, 7, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 384), uint8] { %257 = qnn.conv2d(%FunctionVar_4_0, meta[relay.Constant][106] /* ty=Tensor[(3, 3, 192, 384), uint8] */, 0 /* ty=int32 */, 83 /* ty=int32 */, 0.0443907f /* ty=float32 */, 0.00505402f /* ty=float32 */, padding=[1, 1, 1, 1], channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 384), int32] */; %258 = nn.bias_add(%257, meta[relay.Constant][107] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 384), int32] */; qnn.requantize(%258, 0.000224351f /* ty=float32 */, 0 /* ty=int32 */, 0.0483342f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 384), uint8] */ }; %268 = fn (%FunctionVar_3_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 48), uint8] { %266 = qnn.conv2d(%FunctionVar_3_0, meta[relay.Constant][112] /* ty=Tensor[(1, 1, 832, 48), uint8] */, 0 /* ty=int32 */, 87 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.00578726f /* ty=float32 */, padding=[0, 0, 0, 0], channels=48, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 48), int32] */; %267 = nn.bias_add(%266, meta[relay.Constant][113] /* ty=Tensor[(48), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 48), int32] */; qnn.requantize(%267, 0.00026059f /* ty=float32 */, 0 /* ty=int32 */, 0.0431175f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 48), uint8] */ }; %269 = %268(%255) /* ty=Tensor[(1, 7, 7, 48), uint8] */; %270 = fn (%FunctionVar_2_0: Tensor[(1, 7, 7, 48), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] { %264 = qnn.conv2d(%FunctionVar_2_0, meta[relay.Constant][110] /* ty=Tensor[(3, 3, 48, 128), uint8] */, 0 /* ty=int32 */, 74 /* ty=int32 */, 0.0431175f /* ty=float32 */, 0.00680263f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */; %265 = nn.bias_add(%264, meta[relay.Constant][111] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */; qnn.requantize(%265, 
0.000293312f /* ty=float32 */, 0 /* ty=int32 */, 0.0535589f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */ }; %273 = nn.max_pool2d(%255, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 7, 7, 832), uint8] */; %274 = fn (%FunctionVar_1_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] { %271 = qnn.conv2d(%FunctionVar_1_0, meta[relay.Constant][114] /* ty=Tensor[(1, 1, 832, 128), uint8] */, 0 /* ty=int32 */, 62 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.0055094f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */; %272 = nn.bias_add(%271, meta[relay.Constant][115] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */; qnn.requantize(%272, 0.000248078f /* ty=float32 */, 0 /* ty=int32 */, 0.0320987f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */ }; %275 = %256(%255) /* ty=Tensor[(1, 7, 7, 384), uint8] */; %276 = %263(%262) /* ty=Tensor[(1, 7, 7, 384), uint8] */; %277 = %270(%269) /* ty=Tensor[(1, 7, 7, 128), uint8] */; %278 = %274(%273) /* ty=Tensor[(1, 7, 7, 128), uint8] */; %279 = (%275, %276, %277, %278); %280 = (0.0470831f /* ty=float32 */, 0.0483342f /* ty=float32 */, 0.0535589f /* ty=float32 */, 0.0320987f /* ty=float32 */); %281 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */); %282 = qnn.concatenate(%279, %280, %281, 0.0535589f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 7, 7, 1024), uint8] */; %283 = fn (%FunctionVar_0_02: Tensor[(1, 7, 7, 1024), uint8], PartitionedFromPattern="cast_nn.avg_pool2d_cast_", Composite="vsi_npu.qnn_avgpool2d") -> Tensor[(1, 1, 1, 1024), uint8] { %4 = cast(%FunctionVar_0_02, dtype="int32") /* ty=Tensor[(1, 7, 7, 1024), int32] */; %5 = nn.avg_pool2d(%4, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC") /* ty=Tensor[(1, 1, 1, 1024), int32] */; cast(%5, dtype="uint8") /* ty=Tensor[(1, 1, 1, 1024), uint8] */ }; %284 = %283(%282) /* ty=Tensor[(1, 1, 1, 1024), uint8] */; %285 = fn (%FunctionVar_0_01: Tensor[(1, 1, 1, 1024), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 1, 1, 1001), uint8] { %2 = qnn.conv2d(%FunctionVar_0_01, meta[relay.Constant][0] /* ty=Tensor[(1, 1, 1024, 1001), uint8] */, 0 /* ty=int32 */, 106 /* ty=int32 */, 0.0535589f /* ty=float32 */, 0.00235748f /* ty=float32 */, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 1, 1, 1001), int32] */; %3 = nn.bias_add(%2, meta[relay.Constant][1] /* ty=Tensor[(1001), int32] */, axis=3) /* ty=Tensor[(1, 1, 1, 1001), int32] */; qnn.requantize(%3, 0.000126264f /* ty=float32 */, 0 /* ty=int32 */, 0.0962827f /* ty=float32 */, 60 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 1, 1, 1001), uint8] */ }; %286 = %285(%284) /* ty=Tensor[(1, 1, 1, 1001), uint8] */; %287 = reshape(%286, newshape=[-1, 1001]) /* ty=Tensor[(1, 1001), uint8] */; %288 = fn (%FunctionVar_0_0: Tensor[(1, 1001), uint8], PartitionedFromPattern="qnn.dequantize_nn.softmax_qnn.quantize_", Composite="vsi_npu.qnn_softmax") -> Tensor[(1, 1001), uint8] { %0 = qnn.dequantize(%FunctionVar_0_0, 0.0962827f /* ty=float32 */, 60 /* ty=int32 */) /* 
ty=Tensor[(1, 1001), float32] */; %1 = nn.softmax(%0, axis=1) /* ty=Tensor[(1, 1001), float32] */; qnn.quantize(%1, 0.00390625f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="uint8") /* ty=Tensor[(1, 1001), uint8] */ }; %288(%287) /* ty=Tensor[(1, 1001), uint8] */ } This is important----> name_node.value() == tvmgen_default_vsi_npu_0 GraphMakerImpl::Create TensorMakerImpl::InferCall: vsi_npu.qnn_softmax TensorMakerImpl::InferCall: reshape TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: 
vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 W [HandleLayoutInfer:268]Op 162: default layout inference pass. VsiNpuModule::GetFunction: get_symbol VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early INCtest_vsi_tflite_model_all.py:120: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the new recommended usage. graph, lib, params = relay.build(mod, target, params=params) VsiNpuModule::SaveToBinary SaveToBinary: nbg size = 6884160 SaveToBinary: input size = 1 SaveToBinary: output size = 1 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule::SaveToBinary2 Printing device code to device_code.cl... 
VsiNpuModule::LoadFromBinary LoadFromBinary: nbg size = 6884160 LoadFromBinary: input size = 1 LoadFromBinary: output size = 1 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 (1, 224, 224, 3) ############ VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0 Process Graph: 7 ms or 7232 us VsiNpuModule::GetFunction: size: 2 [[ 0 0 254 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] ```
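As a quick way to sanity-check a run like the one above, the `(1, 1001)` uint8 output can be reduced to a top-class index. The sketch below is only illustrative: it uses a hypothetical stand-in array with the single peak visible in the printed result, where a real run would take the tensor from `module.get_output(0).numpy()`.

```python
import numpy as np

# hypothetical stand-in for the (1, 1001) uint8 tensor printed above;
# in a real run it would come from module.get_output(0).numpy()
output = np.zeros((1, 1001), dtype="uint8")
output[0, 2] = 254  # the single peak visible in the printed result

scores = output.reshape(-1)
print("top class:", int(scores.argmax()), "score:", int(scores.max()))
print("all zeros:", not scores.any())  # an all-zero vector means no usable prediction
```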
@sunshinemyson https://drive.google.com/drive/folders/1iX8GyEWZbzoAdiuW0AcU-4pApwhtldFp?usp=share_link
So far, I have tried executing the MobileNetV2 and SqueezeNet TFLite models both over TVM RPC and with local compilation on the Khadas VIM3 Pro, and both produced normal output.
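For reference, a minimal sketch of the RPC path I use is below. It assumes the model was built into a graph-executor factory module and exported with `lib.export_library()`; the board address, port, and library file name are placeholders, and the input name/shape match the Relay IR above.

```python
import numpy as np
from tvm import rpc
from tvm.contrib import graph_executor

# placeholders: board address/port and the exported library name
remote = rpc.connect("192.168.137.100", 9090)
remote.upload("model_lib.so")
rlib = remote.load_module("model_lib.so")

# run on the remote device through the graph executor
dev = remote.cpu(0)
module = graph_executor.GraphModule(rlib["default"](dev))

# the quantized models above take a (1, 224, 224, 3) uint8 NHWC input named "input"
data = np.random.randint(0, 256, size=(1, 224, 224, 3), dtype="uint8")
module.set_input("input", data)
module.run()
out = module.get_output(0).numpy()  # (1, 1001) uint8 scores
print(out.shape, int(out.max()))
```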
Here is the output from running test_vsi_pytorch_model_all.py with the quantized TFLite model MobileNetV2.
``` #productname=VSI SIMULATOR, pid=0x88 1. press any key and continue... vsi_npu.py --> qnn.dequantize vsi_npu.py --> nn.softmax vsi_npu.py --> qnn.quantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.avg_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> reshape This is important----> name_node.value() == tvmgen_default_vsi_npu_0 GraphMakerImpl::Create graph gpuCount=1 interConnectRingCount=0 NN ring buffer is disabled TensorMakerImpl::InferCall: vsi_npu.qnn_softmax TensorMakerImpl::InferCall: reshape TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d 
TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d graph gpuCount=1 interConnectRingCount=0 NN ring buffer is disabled W [HandleLayoutInfer:268]Op 162: default layout inference pass. ---------------------------Begin VerifyTiling ------------------------- AXI-SRAM = 1048320 Bytes VIP-SRAM = 522240 Bytes SWTILING_PHASE_FEATURES[0, 0, 0] 0 TP [( 3 224 224 1, 150528, 0x0x2bedf70(0x0x2bedf70, 0x(nil)) -> 224 224 3 1, 150528, 0x0x2db3860(0x0x2db3860, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(1 1, 1 1)] C[ 1] 1 TP [( 224 224 3 1, 150528, 0x0x2db3860(0x0x2db3860, 0x(nil)) -> 113 113 12 1, 153228, 0x0x49dcd70(0x0x49dcd70, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(1 1, 1 1)] P[ 0] C[ 2] 2 NN [( 113 113 12 1, 153228, 0x0x49dcd70(0x0x49dcd70, 0x(nil)) -> 112 112 32 1, 401408, 0x0x2db4fe0(0x0x2db4fe0, 0x(nil))) k(2 2 12, 1792) pad(0 0) pool(1 1, 1 1)] P[ 1] C[ 3] 3 NN [( 112 112 32 1, 401408, 0x0x2db4fe0(0x0x2db4fe0, 0x(nil)) -> 112 112 32 1, 401408, 0x0x2db82b0(0x0x2db82b0, 0x(nil))) k(3 3 32, 11392) pad(1 1) pool(1 1, 1 1)] P[ 2] C[ 4] 4 NN [( 112 112 32 1, 401408, 0x0x2db82b0(0x0x2db82b0, 0x(nil)) -> 112 112 16 1, 200704, 0x0x2dbbc10(0x0x2dbbc10, 0x(nil))) k(1 1 32, 640) pad(0 0) pool(1 1, 1 1)] P[ 3] C[ 5] 5 NN [( 112 112 16 1, 200704, 0x0x2dbbc10(0x0x2dbbc10, 0x(nil)) -> 112 112 96 1, 1204224, 0x0x2dbf550(0x0x2dbf550, 0x(nil))) k(1 1 16, 2048) pad(0 0) pool(1 1, 1 1)] P[ 4] C[ 6] 6 NN [( 112 112 96 1, 1204224, 0x0x2dbf550(0x0x2dbf550, 0x(nil)) -> 56 56 96 1, 301056, 0x0x2dc2ed0(0x0x2dc2ed0, 0x(nil))) k(3 3 96, 103040) pad(0 0) pool(2 2, 2 2)] P[ 5] C[ 7] 7 NN [( 56 56 96 1, 301056, 0x0x2dc2ed0(0x0x2dc2ed0, 0x(nil)) -> 56 56 24 1, 75264, 0x0xa8053b0(0x0x30953b0, 0x0x12600)) k(1 1 96, 2560) pad(0 0) pool(1 1, 1 1)] P[ 6] C[ 8, 11] 8 NN [( 56 56 24 1, 75264, 0x0xa8053b0(0x0x30953b0, 0x0x12600) -> 56 56 144 1, 451584, 0x0x2dca7a0(0x0x2dca7a0, 0x(nil))) k(1 1 24, 4352) pad(0 0) pool(1 1, 1 1)] P[ 7] C[ 9] 9 NN [( 56 56 144 1, 451584, 0x0x2dca7a0(0x0x2dca7a0, 0x(nil)) -> 
56 56 144 1, 451584, 0x0x2dce100(0x0x2dce100, 0x(nil))) k(3 3 144, 231936) pad(1 1) pool(1 1, 1 1)] P[ 8] C[ 10] 10 NN [( 56 56 144 1, 451584, 0x0x2dce100(0x0x2dce100, 0x(nil)) -> 56 56 24 1, 75264, 0x0x30953b0(0x0x30953b0, 0x(nil))) k(1 1 144, 3840) pad(0 0) pool(1 1, 1 1)] P[ 9] C[ 11] 11 NN [( 128 588 2 1, 150528, 0x0x30953b0(0x0x30953b0, 0x(nil)) -> 128 588 1 1, 75264, 0x0x2dd53b0(0x0x2dd53b0, 0x(nil))) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 7, 10] C[ 12] 12 NN [( 56 56 24 1, 75264, 0x0x2dd53b0(0x0x2dd53b0, 0x(nil)) -> 56 56 144 1, 451584, 0x0x2dd63c0(0x0x2dd63c0, 0x(nil))) k(1 1 24, 4352) pad(0 0) pool(1 1, 1 1)] P[ 11] C[ 13] 13 NN [( 56 56 144 1, 451584, 0x0x2dd63c0(0x0x2dd63c0, 0x(nil)) -> 28 28 144 1, 112896, 0x0x2dd9e30(0x0x2dd9e30, 0x(nil))) k(3 3 144, 231808) pad(0 0) pool(2 2, 2 2)] P[ 12] C[ 14] 14 NN [( 28 28 144 1, 112896, 0x0x2dd9e30(0x0x2dd9e30, 0x(nil)) -> 28 28 32 1, 25088, 0x0x586b1b0(0x0x309b1b0, 0x0x6200)) k(1 1 144, 4992) pad(0 0) pool(1 1, 1 1)] P[ 13] C[ 15, 18] 15 NN [( 28 28 32 1, 25088, 0x0x586b1b0(0x0x309b1b0, 0x0x6200) -> 28 28 192 1, 150528, 0x0x2de1370(0x0x2de1370, 0x(nil))) k(1 1 32, 7296) pad(0 0) pool(1 1, 1 1)] P[ 14] C[ 16] 16 NN [( 28 28 192 1, 150528, 0x0x2de1370(0x0x2de1370, 0x(nil)) -> 28 28 192 1, 150528, 0x0x2de4e20(0x0x2de4e20, 0x(nil))) k(3 3 192, 412544) pad(1 1) pool(1 1, 1 1)] P[ 15] C[ 17] 17 NN [( 28 28 192 1, 150528, 0x0x2de4e20(0x0x2de4e20, 0x(nil)) -> 28 28 32 1, 25088, 0x0x309b1b0(0x0x309b1b0, 0x(nil))) k(1 1 192, 6656) pad(0 0) pool(1 1, 1 1)] P[ 16] C[ 18] 18 NN [( 128 196 2 1, 50176, 0x0x309b1b0(0x0x309b1b0, 0x(nil)) -> 128 196 1 1, 25088, 0x0x5871110(0x0x30a1110, 0x0x6200)) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 14, 17] C[ 19, 22] 19 NN [( 28 28 32 1, 25088, 0x0x5871110(0x0x30a1110, 0x0x6200) -> 28 28 192 1, 150528, 0x0x2ded700(0x0x2ded700, 0x(nil))) k(1 1 32, 7296) pad(0 0) pool(1 1, 1 1)] P[ 18] C[ 20] 20 NN [( 28 28 192 1, 150528, 0x0x2ded700(0x0x2ded700, 0x(nil)) -> 28 28 192 1, 150528, 0x0x2df0d50(0x0x2df0d50, 0x(nil))) k(3 3 192, 412544) pad(1 1) pool(1 1, 1 1)] P[ 19] C[ 21] 21 NN [( 28 28 192 1, 150528, 0x0x2df0d50(0x0x2df0d50, 0x(nil)) -> 28 28 32 1, 25088, 0x0x30a1110(0x0x30a1110, 0x(nil))) k(1 1 192, 6656) pad(0 0) pool(1 1, 1 1)] P[ 20] C[ 22] 22 NN [( 128 196 2 1, 50176, 0x0x30a1110(0x0x30a1110, 0x(nil)) -> 128 196 1 1, 25088, 0x0x2df7bd0(0x0x2df7bd0, 0x(nil))) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 18, 21] C[ 23] 23 NN [( 28 28 32 1, 25088, 0x0x2df7bd0(0x0x2df7bd0, 0x(nil)) -> 28 28 192 1, 150528, 0x0x2df8cc0(0x0x2df8cc0, 0x(nil))) k(1 1 32, 7296) pad(0 0) pool(1 1, 1 1)] P[ 22] C[ 24] 24 NN [( 28 28 192 1, 150528, 0x0x2df8cc0(0x0x2df8cc0, 0x(nil)) -> 14 14 192 1, 37632, 0x0x2dfcd30(0x0x2dfcd30, 0x(nil))) k(3 3 192, 412544) pad(0 0) pool(2 2, 2 2)] P[ 23] C[ 25] 25 NN [( 14 14 192 1, 37632, 0x0x2dfcd30(0x0x2dfcd30, 0x(nil)) -> 14 14 64 1, 12544, 0x0x448f070(0x0x30a7070, 0x0x3100)) k(1 1 192, 13184) pad(0 0) pool(1 1, 1 1)] P[ 24] C[ 26, 29] 26 NN [( 14 14 64 1, 12544, 0x0x448f070(0x0x30a7070, 0x0x3100) -> 14 14 384 1, 75264, 0x0x2e04ac0(0x0x2e04ac0, 0x(nil))) k(1 1 64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 25] C[ 27] 27 NN [( 14 14 384 1, 75264, 0x0x2e04ac0(0x0x2e04ac0, 0x(nil)) -> 14 14 384 1, 75264, 0x0x2e08a20(0x0x2e08a20, 0x(nil))) k(3 3 384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 26] C[ 28] 28 NN [( 14 14 384 1, 75264, 0x0x2e08a20(0x0x2e08a20, 0x(nil)) -> 14 14 64 1, 12544, 0x0x30a7070(0x0x30a7070, 0x(nil))) k(1 1 384, 26112) pad(0 0) pool(1 1, 1 1)] P[ 27] C[ 29] 29 NN [( 128 98 2 1, 25088, 
0x0x30a7070(0x0x30a7070, 0x(nil)) -> 128 98 1 1, 12544, 0x0x4494fd0(0x0x30acfd0, 0x0x3100)) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 25, 28] C[ 30, 33] 30 NN [( 14 14 64 1, 12544, 0x0x4494fd0(0x0x30acfd0, 0x0x3100) -> 14 14 384 1, 75264, 0x0x2e11970(0x0x2e11970, 0x(nil))) k(1 1 64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 31] 31 NN [( 14 14 384 1, 75264, 0x0x2e11970(0x0x2e11970, 0x(nil)) -> 14 14 384 1, 75264, 0x0x2e15b30(0x0x2e15b30, 0x(nil))) k(3 3 384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 30] C[ 32] 32 NN [( 14 14 384 1, 75264, 0x0x2e15b30(0x0x2e15b30, 0x(nil)) -> 14 14 64 1, 12544, 0x0x30acfd0(0x0x30acfd0, 0x(nil))) k(1 1 384, 26112) pad(0 0) pool(1 1, 1 1)] P[ 31] C[ 33] 33 NN [( 128 98 2 1, 25088, 0x0x30acfd0(0x0x30acfd0, 0x(nil)) -> 128 98 1 1, 12544, 0x0x449af30(0x0x30b2f30, 0x0x3100)) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 29, 32] C[ 34, 37] 34 NN [( 14 14 64 1, 12544, 0x0x449af30(0x0x30b2f30, 0x0x3100) -> 14 14 384 1, 75264, 0x0x2efd220(0x0x2efd220, 0x(nil))) k(1 1 64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 33] C[ 35] 35 NN [( 14 14 384 1, 75264, 0x0x2efd220(0x0x2efd220, 0x(nil)) -> 14 14 384 1, 75264, 0x0x2f013b0(0x0x2f013b0, 0x(nil))) k(3 3 384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 34] C[ 36] 36 NN [( 14 14 384 1, 75264, 0x0x2f013b0(0x0x2f013b0, 0x(nil)) -> 14 14 64 1, 12544, 0x0x30b2f30(0x0x30b2f30, 0x(nil))) k(1 1 384, 26112) pad(0 0) pool(1 1, 1 1)] P[ 35] C[ 37] 37 NN [( 128 98 2 1, 25088, 0x0x30b2f30(0x0x30b2f30, 0x(nil)) -> 128 98 1 1, 12544, 0x0x2f09240(0x0x2f09240, 0x(nil))) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 33, 36] C[ 38] 38 NN [( 14 14 64 1, 12544, 0x0x2f09240(0x0x2f09240, 0x(nil)) -> 14 14 384 1, 75264, 0x0x2f0a330(0x0x2f0a330, 0x(nil))) k(1 1 64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 37] C[ 39] 39 NN [( 14 14 384 1, 75264, 0x0x2f0a330(0x0x2f0a330, 0x(nil)) -> 14 14 384 1, 75264, 0x0x2f0e4c0(0x0x2f0e4c0, 0x(nil))) k(3 3 384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 38] C[ 40] 40 NN [( 14 14 384 1, 75264, 0x0x2f0e4c0(0x0x2f0e4c0, 0x(nil)) -> 14 14 96 1, 18816, 0x0x4e94eb0(0x0x30b8eb0, 0x0x4980)) k(1 1 384, 39168) pad(0 0) pool(1 1, 1 1)] P[ 39] C[ 41, 44] 41 NN [( 14 14 96 1, 18816, 0x0x4e94eb0(0x0x30b8eb0, 0x0x4980) -> 14 14 576 1, 112896, 0x0x2f16370(0x0x2f16370, 0x(nil))) k(1 1 96, 60544) pad(0 0) pool(1 1, 1 1)] P[ 40] C[ 42] 42 NN [( 14 14 576 1, 112896, 0x0x2f16370(0x0x2f16370, 0x(nil)) -> 14 14 576 1, 112896, 0x0x2f1a2a0(0x0x2f1a2a0, 0x(nil))) k(3 3 576, 3712896) pad(1 1) pool(1 1, 1 1)] P[ 41] C[ 43] 43 NN [( 14 14 576 1, 112896, 0x0x2f1a2a0(0x0x2f1a2a0, 0x(nil)) -> 14 14 96 1, 18816, 0x0x30b8eb0(0x0x30b8eb0, 0x(nil))) k(1 1 576, 58496) pad(0 0) pool(1 1, 1 1)] P[ 42] C[ 44] 44 NN [( 128 147 2 1, 37632, 0x0x30b8eb0(0x0x30b8eb0, 0x(nil)) -> 128 147 1 1, 18816, 0x0x4e9ae30(0x0x30bee30, 0x0x4980)) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 40, 43] C[ 45, 48] 45 NN [( 14 14 96 1, 18816, 0x0x4e9ae30(0x0x30bee30, 0x0x4980) -> 14 14 576 1, 112896, 0x0x2f23140(0x0x2f23140, 0x(nil))) k(1 1 96, 60544) pad(0 0) pool(1 1, 1 1)] P[ 44] C[ 46] 46 NN [( 14 14 576 1, 112896, 0x0x2f23140(0x0x2f23140, 0x(nil)) -> 14 14 576 1, 112896, 0x0x2f273a0(0x0x2f273a0, 0x(nil))) k(3 3 576, 3712896) pad(1 1) pool(1 1, 1 1)] P[ 45] C[ 47] 47 NN [( 14 14 576 1, 112896, 0x0x2f273a0(0x0x2f273a0, 0x(nil)) -> 14 14 96 1, 18816, 0x0x30bee30(0x0x30bee30, 0x(nil))) k(1 1 576, 58496) pad(0 0) pool(1 1, 1 1)] P[ 46] C[ 48] 48 NN [( 128 147 2 1, 37632, 0x0x30bee30(0x0x30bee30, 0x(nil)) -> 128 147 1 1, 18816, 0x0x2f2f230(0x0x2f2f230, 0x(nil))) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 44, 47] C[ 49] 
49 NN [( 14 14 96 1, 18816, 0x0x2f2f230(0x0x2f2f230, 0x(nil)) -> 14 14 576 1, 112896, 0x0x2f30240(0x0x2f30240, 0x(nil))) k(1 1 96, 60544) pad(0 0) pool(1 1, 1 1)] P[ 48] C[ 50] 50 NN [( 14 14 576 1, 112896, 0x0x2f30240(0x0x2f30240, 0x(nil)) -> 7 7 576 1, 28224, 0x0x2f344c0(0x0x2f344c0, 0x(nil))) k(3 3 576, 3712768) pad(0 0) pool(2 2, 2 2)] P[ 49] C[ 51] 51 NN [( 7 7 576 1, 28224, 0x0x2f344c0(0x0x2f344c0, 0x(nil)) -> 7 7 160 1, 7840, 0x0x3d35db0(0x0x30c4db0, 0x0x1ea0)) k(1 1 576, 97536) pad(0 0) pool(1 1, 1 1)] P[ 50] C[ 52, 55] 52 NN [( 7 7 160 1, 7840, 0x0x3d35db0(0x0x30c4db0, 0x0x1ea0) -> 7 7 960 1, 47040, 0x0x2f3c370(0x0x2f3c370, 0x(nil))) k(1 1 160, 165376) pad(0 0) pool(1 1, 1 1)] P[ 51] C[ 53] 53 NN [( 7 7 960 1, 47040, 0x0x2f3c370(0x0x2f3c370, 0x(nil)) -> 7 7 960 1, 47040, 0x0x2f402d0(0x0x2f402d0, 0x(nil))) k(3 3 960, 10315648) pad(1 1) pool(1 1, 1 1)] P[ 52] C[ 54] 54 NN [( 7 7 960 1, 47040, 0x0x2f402d0(0x0x2f402d0, 0x(nil)) -> 7 7 160 1, 7840, 0x0x30c4db0(0x0x30c4db0, 0x(nil))) k(1 1 960, 162048) pad(0 0) pool(1 1, 1 1)] P[ 53] C[ 55] 55 NN [( 32 245 2 1, 15680, 0x0x30c4db0(0x0x30c4db0, 0x(nil)) -> 32 245 1 1, 7840, 0x0x3d3bd20(0x0x30cad20, 0x0x1ea0)) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 51, 54] C[ 56, 59] 56 NN [( 7 7 160 1, 7840, 0x0x3d3bd20(0x0x30cad20, 0x0x1ea0) -> 7 7 960 1, 47040, 0x0x2f49260(0x0x2f49260, 0x(nil))) k(1 1 160, 165376) pad(0 0) pool(1 1, 1 1)] P[ 55] C[ 57] 57 NN [( 7 7 960 1, 47040, 0x0x2f49260(0x0x2f49260, 0x(nil)) -> 7 7 960 1, 47040, 0x0x2f4d400(0x0x2f4d400, 0x(nil))) k(3 3 960, 10315648) pad(1 1) pool(1 1, 1 1)] P[ 56] C[ 58] 58 NN [( 7 7 960 1, 47040, 0x0x2f4d400(0x0x2f4d400, 0x(nil)) -> 7 7 160 1, 7840, 0x0x30cad20(0x0x30cad20, 0x(nil))) k(1 1 960, 162048) pad(0 0) pool(1 1, 1 1)] P[ 57] C[ 59] 59 NN [( 32 245 2 1, 15680, 0x0x30cad20(0x0x30cad20, 0x(nil)) -> 32 245 1 1, 7840, 0x0x2f552b0(0x0x2f552b0, 0x(nil))) k(1 1 2, 128) pad(0 0) pool(1 1, 1 1)] P[ 55, 58] C[ 60] 60 NN [( 7 7 160 1, 7840, 0x0x2f552b0(0x0x2f552b0, 0x(nil)) -> 7 7 960 1, 47040, 0x0x2f562c0(0x0x2f562c0, 0x(nil))) k(1 1 160, 165376) pad(0 0) pool(1 1, 1 1)] P[ 59] C[ 61] 61 NN [( 7 7 960 1, 47040, 0x0x2f562c0(0x0x2f562c0, 0x(nil)) -> 7 7 960 1, 47040, 0x0x2f5a550(0x0x2f5a550, 0x(nil))) k(3 3 960, 10315648) pad(1 1) pool(1 1, 1 1)] P[ 60] C[ 62] 62 NN [( 7 7 960 1, 47040, 0x0x2f5a550(0x0x2f5a550, 0x(nil)) -> 7 7 320 1, 15680, 0x0x2f5e550(0x0x2f5e550, 0x(nil))) k(1 1 960, 323968) pad(0 0) pool(1 1, 1 1)] P[ 61] C[ 63] 63 NN [( 7 7 320 1, 15680, 0x0x2f5e550(0x0x2f5e550, 0x(nil)) -> 7 7 1280 1, 62720, 0x0x2f62400(0x0x2f62400, 0x(nil))) k(1 1 320, 435456) pad(0 0) pool(1 1, 1 1)] P[ 62] C[ 64] 64 SH [( 7 7 1280 1, 62720, 0x0x2f62400(0x0x2f62400, 0x(nil)) -> 1 1 1280 1, 1280, 0x0x2f66360(0x0x2f66360, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 63] C[ 65] 65 TP [(1280 1 1 1, 1280, 0x0x2f66360(0x0x2f66360, 0x(nil)) -> 1001 1 1 1, 1001, 0x0x2db2990(0x0x2db2990, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(1 1, 1 1)] P[ 64] C[ 66] 66 SH [(1001 1 1 1, 1001, 0x0x2db2990(0x0x2db2990, 0x(nil)) -> 1001 1 1 1, 1001, 0x0x2db1ae0(0x0x2db1ae0, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 65] id IN [ x y w h ] OUT [ x y w h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out) id | opid IN [ x y w h ] OUT [ x y w h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out) 0 | 0 TP DD 0x0 [ 0 0 3 224] -> DD 0x0 [ 0 0 224 224] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 1 | 1 TP DD 0x0 [ 0 0 224 224] -> DD 0x0 [ 0 0 113 113] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 2 | 2 NN 
DD 0x0 [ 0 0 113 113] -> DD 0x0 [ 0 0 112 112] ( 56, 2, 4) ( 2176, 1536, 100.00%, 85.71%, DD) ( 0, 0) 3 | 3 NN DD 0x0 [ 0 0 112 112] -> DD 0x0 [ 0 0 112 112] ( 56, 10, 4) ( 22528, 1536, 100.00%, 13.48%, DD) ( 0, 0) 4 | 4 NN DD 0x0 [ 0 0 112 112] -> DD 0x0 [ 0 0 112 112] ( 56, 2, 2) ( 3584, 1024, 100.00%, 160.00%, DD) ( 0, 0) 5 | 5 NN DD 0x0 [ 0 0 112 112] -> DD 0x0 [ 0 0 112 112] ( 32, 1, 12) ( 512, 2560, 100.00%, 125.00%, DD) ( 0, 0) 6 | 6 NN DD 0x0 [ 0 0 112 112] -> DD 0x0 [ 0 0 56 56] ( 56, 10, 6) ( 67584, 6656, 100.00%, 6.46%, DD) ( 0, 0) 7 | 7 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 3) ( 43008, 2560, 100.00%, 100.00%, DD) ( 0, 0) 8 | 8 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 6) ( 10752, 5120, 100.00%, 117.65%, DD) ( 0, 0) 9 | 9 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 6) ( 85248, 12800, 100.00%, 5.52%, DD) ( 0, 0) 10 | 10 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 3) ( 64512, 4096, 100.00%, 106.67%, DD) ( 0, 0) 11 | 11 NN DD 0x0 [ 0 0 128 588] -> DD 0x0 [ 0 0 128 588] ( 64, 12, 1) ( 1536, 512, 100.00%, 400.00%, DD) ( 0, 0) 12 | 12 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 6) ( 10752, 5120, 100.00%, 117.65%, DD) ( 0, 0) 13 | 13 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 28 28] ( 56, 10, 6) ( 101376, 13312, 100.00%, 5.74%, DD) ( 0, 0) 14 | 14 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 4) ( 57600, 5120, 100.00%, 102.56%, DD) ( 0, 0) 15 | 15 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 12800, 8192, 100.00%, 112.28%, DD) ( 0, 0) 16 | 16 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 92160, 21504, 100.00%, 5.23%, DD) ( 0, 0) 17 | 17 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 4) ( 76800, 6656, 100.00%, 100.00%, DD) ( 0, 0) 18 | 18 NN DD 0x0 [ 0 0 128 196] -> DD 0x0 [ 0 0 128 196] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) ( 0, 0) 19 | 19 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 12800, 8192, 100.00%, 112.28%, DD) ( 0, 0) 20 | 20 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 92160, 21504, 100.00%, 5.22%, DD) ( 0, 0) 21 | 21 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 4) ( 76800, 6656, 100.00%, 100.00%, DD) ( 0, 0) 22 | 22 NN DD 0x0 [ 0 0 128 196] -> DD 0x0 [ 0 0 128 196] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) ( 0, 0) 23 | 23 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 12800, 8192, 100.00%, 112.28%, DD) ( 0, 0) 24 | 24 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 14 14] ( 28, 16, 8) ( 104448, 21504, 100.00%, 5.23%, DD) ( 0, 0) 25 | 25 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 39936, 13312, 100.00%, 100.97%, DD) ( 0, 0) 26 | 26 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 13312, 28160, 100.00%, 102.33%, DD) ( 0, 0) 27 | 27 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 98304, 76800, 100.00%, 4.66%, DD) ( 0, 0) 28 | 28 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 79872, 25600, 100.00%, 98.04%, DD) ( 0, 0) 29 | 29 NN DD 0x0 [ 0 0 128 98] -> DD 0x0 [ 0 0 128 98] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) ( 0, 0) 30 | 30 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 13312, 28160, 100.00%, 102.33%, DD) ( 0, 0) 31 | 31 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 98304, 76800, 100.00%, 4.66%, DD) ( 0, 0) 32 | 32 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 79872, 25600, 100.00%, 98.04%, DD) ( 0, 0) 33 | 33 NN DD 0x0 [ 0 0 128 98] -> DD 0x0 [ 0 0 128 98] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) 
( 0, 0) 34 | 34 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 13312, 28160, 100.00%, 102.33%, DD) ( 0, 0) 35 | 35 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 98304, 76800, 100.00%, 4.66%, DD) ( 0, 0) 36 | 36 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 79872, 25600, 100.00%, 98.04%, DD) ( 0, 0) 37 | 37 NN DD 0x0 [ 0 0 128 98] -> DD 0x0 [ 0 0 128 98] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) ( 0, 0) 38 | 38 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 13312, 28160, 100.00%, 102.33%, DD) ( 0, 0) 39 | 39 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 98304, 76288, 100.00%, 4.63%, DD) ( 0, 0) 40 | 40 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 12) ( 79872, 37888, 100.00%, 96.73%, DD) ( 0, 0) 41 | 41 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 19968, 60416, 100.00%, 99.79%, DD) ( 0, 0) 42 | 42 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 147456, 165376, 100.00%, 4.46%, DD) ( 0, 0) 43 | 43 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 12) ( 119808, 56320, 100.00%, 96.28%, DD) ( 0, 0) 44 | 44 NN DD 0x0 [ 0 0 128 147] -> DD 0x0 [ 0 0 128 147] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) ( 0, 0) 45 | 45 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 19968, 60416, 100.00%, 99.79%, DD) ( 0, 0) 46 | 46 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 147456, 165888, 100.00%, 4.47%, DD) ( 0, 0) 47 | 47 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 12) ( 119808, 56320, 100.00%, 96.28%, DD) ( 0, 0) 48 | 48 NN DD 0x0 [ 0 0 128 147] -> DD 0x0 [ 0 0 128 147] ( 64, 7, 1) ( 896, 512, 100.00%, 400.00%, DD) ( 0, 0) 49 | 49 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 19968, 60416, 100.00%, 99.79%, DD) ( 0, 0) 50 | 50 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 7 7] ( 14, 14, 18) ( 147456, 165888, 100.00%, 4.47%, DD) ( 0, 0) 51 | 51 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 20) ( 36864, 93696, 100.00%, 96.06%, DD) ( 0, 0) 52 | 52 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 30) ( 10240, 161792, 100.00%, 97.83%, DD) ( 0, 0) 53 | 53 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 30) ( 92160, 430080, 96.89%, 4.31%, DD) ( 0, 0) 54 | 54 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 20) ( 61440, 155136, 100.00%, 95.73%, DD) ( 0, 0) 55 | 55 NN DD 0x0 [ 0 0 32 245] -> DD 0x0 [ 0 0 32 245] ( 32, 18, 1) ( 1152, 512, 100.00%, 400.00%, DD) ( 0, 0) 56 | 56 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 30) ( 10240, 161792, 100.00%, 97.83%, DD) ( 0, 0) 57 | 57 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 30) ( 92160, 430080, 96.89%, 4.31%, DD) ( 0, 0) 58 | 58 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 20) ( 61440, 155136, 100.00%, 95.73%, DD) ( 0, 0) 59 | 59 NN DD 0x0 [ 0 0 32 245] -> DD 0x0 [ 0 0 32 245] ( 32, 18, 1) ( 1152, 512, 100.00%, 400.00%, DD) ( 0, 0) 60 | 60 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 30) ( 10240, 161792, 100.00%, 97.83%, DD) ( 0, 0) 61 | 61 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 30) ( 92160, 430080, 96.77%, 4.31%, DD) ( 0, 0) 62 | 62 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 20) ( 61440, 310272, 100.00%, 95.77%, DD) ( 0, 0) 63 | 63 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 32) ( 20480, 420352, 100.00%, 96.53%, DD) ( 0, 0) 64 | 64 SH DD 0x0 [ 0 0 0 0] -> DD 0x0 [ 0 0 0 0] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 65 | 65 TP DD 0x0 [ 0 0 1280 1] -> DD 0x0 [ 0 0 1001 1] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 66 | 66 SH DD 0x0 [ 0 0 
0 0] -> DD 0x0 [ 0 0 0 0] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) PreLoadWeightBiases = 1048320 100.000000% ---------------------------End VerifyTiling ------------------------- KernelStreamSize: 0x583500, statesSize: 0x1c00, shShareMemSize: 0x0, shIntrSize: 0x700, shParaSize: 0x440, swParaSize: 0x0, lcdTensorSize: 0x0, shaderStatesSize: 0x9c0, tensorStatic: 0x0 NBG: operationSize: 0x86c, nnSize: 0x1f80, tpSize: 0x780, shSize: 0x10, swSize: 0x0, layerParamSize: 0x0, lcdtSize: 0x4a8, patchSize: 0x3de0, icdtSize: 0xe8 hwInitOpSize: 0x24, lcdSize 0x585d40 NBG: entranceSize: 0x208, nbIOSize: 0xe8, layeSize: 0x1398, sectionsSize: 0x7310, inputoutput size: 0x24fe9, InitCommands size: 0x1104 NBG: lcdSize: 0x585d40, headerSize : 0x8998 Calculate NBG size : 5830876 bytes generate NBG into memory start. vxoBinaryGraph_SaveBinaryEntrance[20461]: collect input count=0, output count=0 vxoBinaryGraph_SaveBinaryEntrance[20531]: total operation count=67 generate NBG, device count=1, core count per-device: 1, vxoBinaryGraph_RefineInputOutput:11143 input table address: 0x15469c0 vxoBinaryGraph_RefineInputOutput:11149 output table address: 0x18d4a80 vxoBinaryGraph_SaveBinaryEntranceExt[19524]: graph->inputCount=1, graph->outputCount=1, refine inputCount=1, outputCount=1 NBG network name field : dummy_network_name vxoBinaryGraph_SaveBinaryEntranceExt[20127]: header input count=1, output count=1 generate NBG, save initialize commands vxoBinaryGraph_ReSaveInputAndPatchTable[17202]: re-save operation count=74 Generate NBG in memory Actual NBG size : 5822976 bytes generate NBG into memory successfully. VsiNpuModule::GetFunction: get_symbol VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early =======imported_modules======== [Module(llvm, 4fb2c68), Module(vsi_npu, 5233ee8)] =======imported_modules[0]======== ; ModuleID = 'empty_module' source_filename = "empty_module" target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" target triple = "aarch64-linux-gnu" VsiNpuModule::SaveToBinary SaveToBinary: nbg size = 5822976 SaveToBinary: input size = 1 SaveToBinary: output size = 1 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule::SaveToBinary2 [[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 112 66 7 1 24 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] ```
``` khadas@Khadas:~$ python3 -m tvm.exec.rpc_server --host 0.0.0.0 --port=9090 INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork``` INFO:RPCServer:bind to 0.0.0.0:9090 INFO:RPCServer:connection from ('192.168.137.177', 34272) VsiNpuModule::LoadFromBinary LoadFromBinary: nbg size = 5822976 LoadFromBinary: input size = 1 LoadFromBinary: output size = 1 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 INFO:RPCServer:load_module /tmp/tmpum5rchg2/lib.so VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0 [ 1] PLS isn't existed Process Graph: 7 ms or 7670 us VsiNpuModule::GetFunction: size: 2 INFO:RPCServer:Finish serving ('192.168.137.177', 34272) ```
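For reference, the host-side half of this RPC run is assumed to look roughly like the sketch below. This is not the exact script used here: the device IP is a placeholder, while port 9090, `lib.so` and the `input` tensor name are taken from the logs above; everything else is illustrative, using standard TVM RPC APIs.

```
# Hedged sketch of the host-side RPC client (assumption, not the exact script).
# Device IP is a placeholder; port 9090, lib.so and the "input" name come from the logs above.
import numpy as np
import tvm
from tvm import rpc
from tvm.contrib import graph_executor

remote = rpc.connect("<khadas-ip>", 9090)        # device running: python3 -m tvm.exec.rpc_server
remote.upload("lib.so")                          # library produced by lib.export_library(...)
rlib = remote.load_module("lib.so")

dev = remote.cpu(0)
m = graph_executor.GraphModule(rlib["default"](dev))
m.set_input("input", np.zeros((1, 224, 224, 3), dtype="uint8"))  # quantized NHWC input
m.run()
print(m.get_output(0).numpy())                   # this is where the empty output shows up in the failing RPC case
```

The device-side log above shows the graph being processed ("Process Graph: 7 ms"), yet the output fetched on the host is empty, which is the behaviour being reported.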
This is the output of local compilation on Khadas VIM3 Pro, using the quantized TFLite model MobileNetV2.
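For context, a local compile-and-run like the one below usually boils down to something along these lines. This is a minimal sketch with stock TVM APIs and a hypothetical file name, not the actual test_vsi_tflite_model_all.py, which additionally partitions the graph for the vsi_npu BYOC codegen (that step is where the "vsi_npu.py --> ..." lines in the log come from).

```
# Hedged sketch of a local TFLite compile-and-run with TVM (assumption, not the
# exact test_vsi_tflite_model_all.py). File name and input data are illustrative.
import numpy as np
import tflite
import tvm
from tvm import relay
from tvm.contrib import graph_executor

with open("mobilenet_v2_quant.tflite", "rb") as f:   # hypothetical file name
    tfl_model = tflite.Model.GetRootAsModel(f.read(), 0)

mod, params = relay.frontend.from_tflite(
    tfl_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "uint8"},
)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)  # built directly on the board

m = graph_executor.GraphModule(lib["default"](tvm.cpu(0)))
m.set_input("input", np.zeros((1, 224, 224, 3), dtype="uint8"))
m.run()
print(m.get_output(0).numpy())                            # normal output in the local case
```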
``` khadas@Khadas:~/$ python3 test_vsi_tflite_model_all.py #[version = "0.0.5"] def @main(%input: Tensor[(1, 224, 224, 3), uint8], %v_param_1: Tensor[(3, 3, 3, 32), uint8], %v_param_2: Tensor[(32), int32], %v_param_3: Tensor[(3, 3, 32, 1), uint8], %v_param_4: Tensor[(32), int32], %v_param_5: Tensor[(1, 1, 32, 16), uint8], %v_param_6: Tensor[(16), int32], %v_param_7: Tensor[(1, 1, 16, 96), uint8], %v_param_8: Tensor[(96), int32], %v_param_9: Tensor[(3, 3, 96, 1), uint8], %v_param_10: Tensor[(96), int32], %v_param_11: Tensor[(1, 1, 96, 24), uint8], %v_param_12: Tensor[(24), int32], %v_param_13: Tensor[(1, 1, 24, 144), uint8], %v_param_14: Tensor[(144), int32], %v_param_15: Tensor[(3, 3, 144, 1), uint8], %v_param_16: Tensor[(144), int32], %v_param_17: Tensor[(1, 1, 144, 24), uint8], %v_param_18: Tensor[(24), int32], %v_param_19: Tensor[(1, 1, 24, 144), uint8], %v_param_20: Tensor[(144), int32], %v_param_21: Tensor[(3, 3, 144, 1), uint8], %v_param_22: Tensor[(144), int32], %v_param_23: Tensor[(1, 1, 144, 32), uint8], %v_param_24: Tensor[(32), int32], %v_param_25: Tensor[(1, 1, 32, 192), uint8], %v_param_26: Tensor[(192), int32], %v_param_27: Tensor[(3, 3, 192, 1), uint8], %v_param_28: Tensor[(192), int32], %v_param_29: Tensor[(1, 1, 192, 32), uint8], %v_param_30: Tensor[(32), int32], %v_param_31: Tensor[(1, 1, 32, 192), uint8], %v_param_32: Tensor[(192), int32], %v_param_33: Tensor[(3, 3, 192, 1), uint8], %v_param_34: Tensor[(192), int32], %v_param_35: Tensor[(1, 1, 192, 32), uint8], %v_param_36: Tensor[(32), int32], %v_param_37: Tensor[(1, 1, 32, 192), uint8], %v_param_38: Tensor[(192), int32], %v_param_39: Tensor[(3, 3, 192, 1), uint8], %v_param_40: Tensor[(192), int32], %v_param_41: Tensor[(1, 1, 192, 64), uint8], %v_param_42: Tensor[(64), int32], %v_param_43: Tensor[(1, 1, 64, 384), uint8], %v_param_44: Tensor[(384), int32], %v_param_45: Tensor[(3, 3, 384, 1), uint8], %v_param_46: Tensor[(384), int32], %v_param_47: Tensor[(1, 1, 384, 64), uint8], %v_param_48: Tensor[(64), int32], %v_param_49: Tensor[(1, 1, 64, 384), uint8], %v_param_50: Tensor[(384), int32], %v_param_51: Tensor[(3, 3, 384, 1), uint8], %v_param_52: Tensor[(384), int32], %v_param_53: Tensor[(1, 1, 384, 64), uint8], %v_param_54: Tensor[(64), int32], %v_param_55: Tensor[(1, 1, 64, 384), uint8], %v_param_56: Tensor[(384), int32], %v_param_57: Tensor[(3, 3, 384, 1), uint8], %v_param_58: Tensor[(384), int32], %v_param_59: Tensor[(1, 1, 384, 64), uint8], %v_param_60: Tensor[(64), int32], %v_param_61: Tensor[(1, 1, 64, 384), uint8], %v_param_62: Tensor[(384), int32], %v_param_63: Tensor[(3, 3, 384, 1), uint8], %v_param_64: Tensor[(384), int32], %v_param_65: Tensor[(1, 1, 384, 96), uint8], %v_param_66: Tensor[(96), int32], %v_param_67: Tensor[(1, 1, 96, 576), uint8], %v_param_68: Tensor[(576), int32], %v_param_69: Tensor[(3, 3, 576, 1), uint8], %v_param_70: Tensor[(576), int32], %v_param_71: Tensor[(1, 1, 576, 96), uint8], %v_param_72: Tensor[(96), int32], %v_param_73: Tensor[(1, 1, 96, 576), uint8], %v_param_74: Tensor[(576), int32], %v_param_75: Tensor[(3, 3, 576, 1), uint8], %v_param_76: Tensor[(576), int32], %v_param_77: Tensor[(1, 1, 576, 96), uint8], %v_param_78: Tensor[(96), int32], %v_param_79: Tensor[(1, 1, 96, 576), uint8], %v_param_80: Tensor[(576), int32], %v_param_81: Tensor[(3, 3, 576, 1), uint8], %v_param_82: Tensor[(576), int32], %v_param_83: Tensor[(1, 1, 576, 160), uint8], %v_param_84: Tensor[(160), int32], %v_param_85: Tensor[(1, 1, 160, 960), uint8], %v_param_86: Tensor[(960), int32], %v_param_87: Tensor[(3, 3, 
960, 1), uint8], %v_param_88: Tensor[(960), int32], %v_param_89: Tensor[(1, 1, 960, 160), uint8], %v_param_90: Tensor[(160), int32], %v_param_91: Tensor[(1, 1, 160, 960), uint8], %v_param_92: Tensor[(960), int32], %v_param_93: Tensor[(3, 3, 960, 1), uint8], %v_param_94: Tensor[(960), int32], %v_param_95: Tensor[(1, 1, 960, 160), uint8], %v_param_96: Tensor[(160), int32], %v_param_97: Tensor[(1, 1, 160, 960), uint8], %v_param_98: Tensor[(960), int32], %v_param_99: Tensor[(3, 3, 960, 1), uint8], %v_param_100: Tensor[(960), int32], %v_param_101: Tensor[(1, 1, 960, 320), uint8], %v_param_102: Tensor[(320), int32], %v_param_103: Tensor[(1, 1, 320, 1280), uint8], %v_param_104: Tensor[(1280), int32], %v_param_105: Tensor[(1, 1, 1280, 1001), uint8], %v_param_106: Tensor[(1001), int32]) { %0 = qnn.conv2d(%input, %v_param_1, 128, 115, 0.00787402f, 0.0287749f, strides=[2, 2], padding=[0, 0, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %1 = nn.bias_add(%0, %v_param_2, axis=3); %2 = qnn.requantize(%1, 0.000226574f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %3 = qnn.conv2d(%2, %v_param_3, 0, 165, 0.0235285f, 0.343696f, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %4 = nn.bias_add(%3, %v_param_4, axis=3); %5 = qnn.requantize(%4, 0.00808663f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %6 = qnn.conv2d(%5, %v_param_5, 0, 141, 0.0235285f, 0.0381986f, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %7 = nn.bias_add(%6, %v_param_6, axis=3); %8 = qnn.requantize(%7, 0.000898756f, 0, 0.362873f, 122, axis=3, out_dtype="uint8"); %9 = qnn.conv2d(%8, %v_param_7, 122, 127, 0.362873f, 0.00954309f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %10 = nn.bias_add(%9, %v_param_8, axis=3); %11 = qnn.requantize(%10, 0.00346293f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %12 = qnn.conv2d(%11, %v_param_9, 0, 109, 0.0235285f, 0.0194444f, strides=[2, 2], padding=[0, 0, 1, 1], groups=96, channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %13 = nn.bias_add(%12, %v_param_10, axis=3); %14 = qnn.requantize(%13, 0.000457496f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %15 = qnn.conv2d(%14, %v_param_11, 0, 152, 0.0235285f, 0.0225397f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %16 = nn.bias_add(%15, %v_param_12, axis=3); %17 = qnn.requantize(%16, 0.000530324f, 0, 0.282426f, 122, axis=3, out_dtype="uint8"); %18 = qnn.conv2d(%17, %v_param_13, 122, 145, 0.282426f, 0.00369501f, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %19 = nn.bias_add(%18, %v_param_14, axis=3); %20 = qnn.requantize(%19, 0.00104357f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %21 = qnn.conv2d(%20, %v_param_15, 0, 52, 0.0235285f, 0.169819f, padding=[1, 1, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %22 = nn.bias_add(%21, %v_param_16, axis=3); %23 = qnn.requantize(%22, 0.00399559f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %24 = qnn.conv2d(%23, %v_param_17, 0, 122, 0.0235285f, 0.026759f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %25 = 
nn.bias_add(%24, %v_param_18, axis=3); %26 = qnn.requantize(%25, 0.000629599f, 0, 0.410429f, 137, axis=3, out_dtype="uint8"); %27 = qnn.add(%26, %17, 0.410429f, 137, 0.282426f, 122, 0.448443f, 130); %28 = qnn.conv2d(%27, %v_param_19, 130, 104, 0.448443f, 0.0029434f, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %29 = nn.bias_add(%28, %v_param_20, axis=3); %30 = qnn.requantize(%29, 0.00131995f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %31 = qnn.conv2d(%30, %v_param_21, 0, 144, 0.0235285f, 0.0171147f, strides=[2, 2], padding=[0, 0, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %32 = nn.bias_add(%31, %v_param_22, axis=3); %33 = qnn.requantize(%32, 0.000402683f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %34 = qnn.conv2d(%33, %v_param_23, 0, 114, 0.0235285f, 0.016776f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %35 = nn.bias_add(%34, %v_param_24, axis=3); %36 = qnn.requantize(%35, 0.000394715f, 0, 0.224783f, 128, axis=3, out_dtype="uint8"); %37 = qnn.conv2d(%36, %v_param_25, 128, 122, 0.224783f, 0.00210703f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %38 = nn.bias_add(%37, %v_param_26, axis=3); %39 = qnn.requantize(%38, 0.000473626f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %40 = qnn.conv2d(%39, %v_param_27, 0, 111, 0.0235285f, 0.0671548f, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %41 = nn.bias_add(%40, %v_param_28, axis=3); %42 = qnn.requantize(%41, 0.00158005f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %43 = qnn.conv2d(%42, %v_param_29, 0, 148, 0.0235285f, 0.0199821f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %44 = nn.bias_add(%43, %v_param_30, axis=3); %45 = qnn.requantize(%44, 0.000470149f, 0, 0.231107f, 120, axis=3, out_dtype="uint8"); %46 = qnn.add(%45, %36, 0.231107f, 120, 0.224783f, 128, 0.271938f, 130); %47 = qnn.conv2d(%46, %v_param_31, 130, 119, 0.271938f, 0.00149126f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %48 = nn.bias_add(%47, %v_param_32, axis=3); %49 = qnn.requantize(%48, 0.00040553f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %50 = qnn.conv2d(%49, %v_param_33, 0, 89, 0.0235285f, 0.0805961f, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %51 = nn.bias_add(%50, %v_param_34, axis=3); %52 = qnn.requantize(%51, 0.0018963f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %53 = qnn.conv2d(%52, %v_param_35, 0, 127, 0.0235285f, 0.018966f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %54 = nn.bias_add(%53, %v_param_36, axis=3); %55 = qnn.requantize(%54, 0.00044624f, 0, 0.268485f, 124, axis=3, out_dtype="uint8"); %56 = qnn.add(%55, %46, 0.268485f, 124, 0.271938f, 130, 0.349583f, 124); %57 = qnn.conv2d(%56, %v_param_37, 124, 129, 0.349583f, 0.00188541f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %58 = nn.bias_add(%57, %v_param_38, axis=3); %59 = qnn.requantize(%58, 0.000659109f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); 
%60 = qnn.conv2d(%59, %v_param_39, 0, 129, 0.0235285f, 0.00993869f, strides=[2, 2], padding=[0, 0, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %61 = nn.bias_add(%60, %v_param_40, axis=3); %62 = qnn.requantize(%61, 0.000233842f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %63 = qnn.conv2d(%62, %v_param_41, 0, 144, 0.0235285f, 0.0145759f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %64 = nn.bias_add(%63, %v_param_42, axis=3); %65 = qnn.requantize(%64, 0.000342948f, 0, 0.193133f, 125, axis=3, out_dtype="uint8"); %66 = qnn.conv2d(%65, %v_param_43, 125, 126, 0.193133f, 0.00157124f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %67 = nn.bias_add(%66, %v_param_44, axis=3); %68 = qnn.requantize(%67, 0.000303459f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %69 = qnn.conv2d(%68, %v_param_45, 0, 105, 0.0235285f, 0.0612184f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %70 = nn.bias_add(%69, %v_param_46, axis=3); %71 = qnn.requantize(%70, 0.00144038f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %72 = qnn.conv2d(%71, %v_param_47, 0, 127, 0.0235285f, 0.0187498f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %73 = nn.bias_add(%72, %v_param_48, axis=3); %74 = qnn.requantize(%73, 0.000441155f, 0, 0.180298f, 108, axis=3, out_dtype="uint8"); %75 = qnn.add(%74, %65, 0.180298f, 108, 0.193133f, 125, 0.197618f, 120); %76 = qnn.conv2d(%75, %v_param_49, 120, 135, 0.197618f, 0.00145681f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %77 = nn.bias_add(%76, %v_param_50, axis=3); %78 = qnn.requantize(%77, 0.000287892f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %79 = qnn.conv2d(%78, %v_param_51, 0, 133, 0.0235285f, 0.0509263f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %80 = nn.bias_add(%79, %v_param_52, axis=3); %81 = qnn.requantize(%80, 0.00119822f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %82 = qnn.conv2d(%81, %v_param_53, 0, 126, 0.0235285f, 0.0130952f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %83 = nn.bias_add(%82, %v_param_54, axis=3); %84 = qnn.requantize(%83, 0.000308111f, 0, 0.152346f, 125, axis=3, out_dtype="uint8"); %85 = qnn.add(%84, %75, 0.152346f, 125, 0.197618f, 120, 0.209317f, 123); %86 = qnn.conv2d(%85, %v_param_55, 123, 127, 0.209317f, 0.00133576f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %87 = nn.bias_add(%86, %v_param_56, axis=3); %88 = qnn.requantize(%87, 0.000279598f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %89 = qnn.conv2d(%88, %v_param_57, 0, 156, 0.0235285f, 0.0404159f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %90 = nn.bias_add(%89, %v_param_58, axis=3); %91 = qnn.requantize(%90, 0.000950924f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %92 = qnn.conv2d(%91, %v_param_59, 0, 148, 0.0235285f, 0.0192269f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", 
out_dtype="int32"); %93 = nn.bias_add(%92, %v_param_60, axis=3); %94 = qnn.requantize(%93, 0.00045238f, 0, 0.16256f, 119, axis=3, out_dtype="uint8"); %95 = qnn.add(%94, %85, 0.16256f, 119, 0.209317f, 123, 0.227132f, 122); %96 = qnn.conv2d(%95, %v_param_61, 122, 132, 0.227132f, 0.00162901f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %97 = nn.bias_add(%96, %v_param_62, axis=3); %98 = qnn.requantize(%97, 0.000370001f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %99 = qnn.conv2d(%98, %v_param_63, 0, 142, 0.0235285f, 0.0308997f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %100 = nn.bias_add(%99, %v_param_64, axis=3); %101 = qnn.requantize(%100, 0.000727024f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %102 = qnn.conv2d(%101, %v_param_65, 0, 128, 0.0235285f, 0.00727967f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %103 = nn.bias_add(%102, %v_param_66, axis=3); %104 = qnn.requantize(%103, 0.000171279f, 0, 0.172015f, 128, axis=3, out_dtype="uint8"); %105 = qnn.conv2d(%104, %v_param_67, 128, 131, 0.172015f, 0.00161979f, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %106 = nn.bias_add(%105, %v_param_68, axis=3); %107 = qnn.requantize(%106, 0.000278629f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %108 = qnn.conv2d(%107, %v_param_69, 0, 66, 0.0235285f, 0.0708156f, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %109 = nn.bias_add(%108, %v_param_70, axis=3); %110 = qnn.requantize(%109, 0.00166618f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %111 = qnn.conv2d(%110, %v_param_71, 0, 135, 0.0235285f, 0.00841983f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %112 = nn.bias_add(%111, %v_param_72, axis=3); %113 = qnn.requantize(%112, 0.000198106f, 0, 0.128486f, 127, axis=3, out_dtype="uint8"); %114 = qnn.add(%113, %104, 0.128486f, 127, 0.172015f, 128, 0.179783f, 126); %115 = qnn.conv2d(%114, %v_param_73, 126, 138, 0.179783f, 0.00180177f, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %116 = nn.bias_add(%115, %v_param_74, axis=3); %117 = qnn.requantize(%116, 0.000323928f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %118 = qnn.conv2d(%117, %v_param_75, 0, 154, 0.0235285f, 0.0698695f, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %119 = nn.bias_add(%118, %v_param_76, axis=3); %120 = qnn.requantize(%119, 0.00164392f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %121 = qnn.conv2d(%120, %v_param_77, 0, 155, 0.0235285f, 0.0236749f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %122 = nn.bias_add(%121, %v_param_78, axis=3); %123 = qnn.requantize(%122, 0.000557034f, 0, 0.190479f, 127, axis=3, out_dtype="uint8"); %124 = qnn.add(%123, %114, 0.190479f, 127, 0.179783f, 126, 0.245143f, 126); %125 = qnn.conv2d(%124, %v_param_79, 126, 125, 0.245143f, 0.00139799f, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %126 = nn.bias_add(%125, %v_param_80, axis=3); %127 = 
qnn.requantize(%126, 0.000342707f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %128 = qnn.conv2d(%127, %v_param_81, 0, 92, 0.0235285f, 0.0148872f, strides=[2, 2], padding=[0, 0, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %129 = nn.bias_add(%128, %v_param_82, axis=3); %130 = qnn.requantize(%129, 0.000350273f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %131 = qnn.conv2d(%130, %v_param_83, 0, 139, 0.0235285f, 0.00922072f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %132 = nn.bias_add(%131, %v_param_84, axis=3); %133 = qnn.requantize(%132, 0.00021695f, 0, 0.131885f, 131, axis=3, out_dtype="uint8"); %134 = qnn.conv2d(%133, %v_param_85, 131, 141, 0.131885f, 0.00211018f, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %135 = nn.bias_add(%134, %v_param_86, axis=3); %136 = qnn.requantize(%135, 0.000278301f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %137 = qnn.conv2d(%136, %v_param_87, 0, 146, 0.0235285f, 0.0409658f, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %138 = nn.bias_add(%137, %v_param_88, axis=3); %139 = qnn.requantize(%138, 0.000963862f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %140 = qnn.conv2d(%139, %v_param_89, 0, 136, 0.0235285f, 0.00783742f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %141 = nn.bias_add(%140, %v_param_90, axis=3); %142 = qnn.requantize(%141, 0.000184403f, 0, 0.104162f, 130, axis=3, out_dtype="uint8"); %143 = qnn.add(%142, %133, 0.104162f, 130, 0.131885f, 131, 0.15034f, 133); %144 = qnn.conv2d(%143, %v_param_91, 133, 129, 0.15034f, 0.00163117f, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %145 = nn.bias_add(%144, %v_param_92, axis=3); %146 = qnn.requantize(%145, 0.00024523f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %147 = qnn.conv2d(%146, %v_param_93, 0, 102, 0.0235285f, 0.0439425f, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %148 = nn.bias_add(%147, %v_param_94, axis=3); %149 = qnn.requantize(%148, 0.0010339f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %150 = qnn.conv2d(%149, %v_param_95, 0, 132, 0.0235285f, 0.0380282f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %151 = nn.bias_add(%150, %v_param_96, axis=3); %152 = qnn.requantize(%151, 0.000894746f, 0, 0.179058f, 134, axis=3, out_dtype="uint8"); %153 = qnn.add(%152, %143, 0.179058f, 134, 0.15034f, 133, 0.220417f, 131); %154 = qnn.conv2d(%153, %v_param_97, 131, 131, 0.220417f, 0.00206415f, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %155 = nn.bias_add(%154, %v_param_98, axis=3); %156 = qnn.requantize(%155, 0.000454974f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %157 = qnn.conv2d(%156, %v_param_99, 0, 201, 0.0235285f, 0.158864f, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32"); %158 = nn.bias_add(%157, %v_param_100, axis=3); %159 = qnn.requantize(%158, 0.00373784f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %160 = qnn.conv2d(%159, 
%v_param_101, 0, 111, 0.0235285f, 0.00962106f, padding=[0, 0, 0, 0], channels=320, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %161 = nn.bias_add(%160, %v_param_102, axis=3); %162 = qnn.requantize(%161, 0.000226369f, 0, 0.131263f, 143, axis=3, out_dtype="uint8"); %163 = qnn.conv2d(%162, %v_param_103, 143, 128, 0.131263f, 0.00524072f, padding=[0, 0, 0, 0], channels=1280, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %164 = nn.bias_add(%163, %v_param_104, axis=3); %165 = qnn.requantize(%164, 0.000687913f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8"); %166 = cast(%165, dtype="int32"); %167 = nn.avg_pool2d(%166, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC"); %168 = cast(%167, dtype="uint8"); %169 = qnn.conv2d(%168, %v_param_105, 0, 114, 0.0235285f, 0.00168582f, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32"); %170 = nn.bias_add(%169, %v_param_106, axis=3); %171 = qnn.requantize(%170, 3.96648e-05f, 0, 0.0760416f, 72, axis=3, out_dtype="uint8"); %172 = reshape(%171, newshape=[1, 1001]); %173 = qnn.dequantize(%172, 0.0760416f, 72); %174 = nn.softmax(%173, axis=1); qnn.quantize(%174, 0.00390625f, 0, out_dtype="uint8") } vsi_npu.py --> qnn.dequantize vsi_npu.py --> nn.softmax vsi_npu.py --> qnn.quantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.avg_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> qnn.add vsi_npu.py --> reshape def @main(%input: Tensor[(1, 224, 224, 3), uint8]) -> Tensor[(1, 1001), uint8] { @tvmgen_default_vsi_npu_0(%input) /* ty=Tensor[(1, 1001), uint8] */ } def @tvmgen_default_vsi_npu_0(%vsi_npu_0_i0: Tensor[(1, 224, 224, 3), uint8], Inline=1, Compiler="vsi_npu", global_symbol="tvmgen_default_vsi_npu_0", Primitive=1) 
-> Tensor[(1, 1001), uint8] { %110 = fn (%FunctionVar_52_0: Tensor[(1, 224, 224, 3), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 32), uint8] { %108 = qnn.conv2d(%FunctionVar_52_0, meta[relay.Constant][104] /* ty=Tensor[(3, 3, 3, 32), uint8] */, 128 /* ty=int32 */, 115 /* ty=int32 */, 0.00787402f /* ty=float32 */, 0.0287749f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 32), int32] */; %109 = nn.bias_add(%108, meta[relay.Constant][105] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 32), int32] */; qnn.requantize(%109, 0.000226574f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 32), uint8] */ }; %111 = %110(%vsi_npu_0_i0) /* ty=Tensor[(1, 112, 112, 32), uint8] */; %112 = fn (%FunctionVar_51_0: Tensor[(1, 112, 112, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 32), uint8] { %106 = qnn.conv2d(%FunctionVar_51_0, meta[relay.Constant][102] /* ty=Tensor[(3, 3, 32, 1), uint8] */, 0 /* ty=int32 */, 165 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.343696f /* ty=float32 */, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 32), int32] */; %107 = nn.bias_add(%106, meta[relay.Constant][103] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 32), int32] */; qnn.requantize(%107, 0.00808663f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 32), uint8] */ }; %113 = %112(%111) /* ty=Tensor[(1, 112, 112, 32), uint8] */; %114 = fn (%FunctionVar_50_0: Tensor[(1, 112, 112, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 16), uint8] { %104 = qnn.conv2d(%FunctionVar_50_0, meta[relay.Constant][100] /* ty=Tensor[(1, 1, 32, 16), uint8] */, 0 /* ty=int32 */, 141 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0381986f /* ty=float32 */, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 16), int32] */; %105 = nn.bias_add(%104, meta[relay.Constant][101] /* ty=Tensor[(16), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 16), int32] */; qnn.requantize(%105, 0.000898756f /* ty=float32 */, 0 /* ty=int32 */, 0.362873f /* ty=float32 */, 122 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 16), uint8] */ }; %115 = %114(%113) /* ty=Tensor[(1, 112, 112, 16), uint8] */; %116 = fn (%FunctionVar_49_0: Tensor[(1, 112, 112, 16), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 96), uint8] { %102 = qnn.conv2d(%FunctionVar_49_0, meta[relay.Constant][98] /* ty=Tensor[(1, 1, 16, 96), uint8] */, 122 /* ty=int32 */, 127 /* ty=int32 */, 0.362873f /* ty=float32 */, 0.00954309f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 96), int32] */; %103 = nn.bias_add(%102, meta[relay.Constant][99] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 
96), int32] */; qnn.requantize(%103, 0.00346293f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 96), uint8] */ }; %117 = %116(%115) /* ty=Tensor[(1, 112, 112, 96), uint8] */; %118 = fn (%FunctionVar_48_0: Tensor[(1, 112, 112, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 96), uint8] { %100 = qnn.conv2d(%FunctionVar_48_0, meta[relay.Constant][96] /* ty=Tensor[(3, 3, 96, 1), uint8] */, 0 /* ty=int32 */, 109 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0194444f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=96, channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 96), int32] */; %101 = nn.bias_add(%100, meta[relay.Constant][97] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 96), int32] */; qnn.requantize(%101, 0.000457496f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 96), uint8] */ }; %119 = %118(%117) /* ty=Tensor[(1, 56, 56, 96), uint8] */; %120 = fn (%FunctionVar_47_0: Tensor[(1, 56, 56, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 24), uint8] { %98 = qnn.conv2d(%FunctionVar_47_0, meta[relay.Constant][94] /* ty=Tensor[(1, 1, 96, 24), uint8] */, 0 /* ty=int32 */, 152 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0225397f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 24), int32] */; %99 = nn.bias_add(%98, meta[relay.Constant][95] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 24), int32] */; qnn.requantize(%99, 0.000530324f /* ty=float32 */, 0 /* ty=int32 */, 0.282426f /* ty=float32 */, 122 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 24), uint8] */ }; %121 = %120(%119) /* ty=Tensor[(1, 56, 56, 24), uint8] */; %122 = fn (%FunctionVar_46_0: Tensor[(1, 56, 56, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 144), uint8] { %96 = qnn.conv2d(%FunctionVar_46_0, meta[relay.Constant][92] /* ty=Tensor[(1, 1, 24, 144), uint8] */, 122 /* ty=int32 */, 145 /* ty=int32 */, 0.282426f /* ty=float32 */, 0.00369501f /* ty=float32 */, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 144), int32] */; %97 = nn.bias_add(%96, meta[relay.Constant][93] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 144), int32] */; qnn.requantize(%97, 0.00104357f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 144), uint8] */ }; %123 = %122(%121) /* ty=Tensor[(1, 56, 56, 144), uint8] */; %124 = fn (%FunctionVar_45_0: Tensor[(1, 56, 56, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 144), uint8] { %94 = qnn.conv2d(%FunctionVar_45_0, meta[relay.Constant][90] /* ty=Tensor[(3, 3, 144, 1), uint8] */, 0 /* ty=int32 */, 52 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.169819f /* ty=float32 */, padding=[1, 1, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", 
kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 144), int32] */; %95 = nn.bias_add(%94, meta[relay.Constant][91] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 144), int32] */; qnn.requantize(%95, 0.00399559f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 144), uint8] */ }; %125 = %124(%123) /* ty=Tensor[(1, 56, 56, 144), uint8] */; %126 = fn (%FunctionVar_44_0: Tensor[(1, 56, 56, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 24), uint8] { %92 = qnn.conv2d(%FunctionVar_44_0, meta[relay.Constant][88] /* ty=Tensor[(1, 1, 144, 24), uint8] */, 0 /* ty=int32 */, 122 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.026759f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 24), int32] */; %93 = nn.bias_add(%92, meta[relay.Constant][89] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 24), int32] */; qnn.requantize(%93, 0.000629599f /* ty=float32 */, 0 /* ty=int32 */, 0.410429f /* ty=float32 */, 137 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 24), uint8] */ }; %127 = %126(%125) /* ty=Tensor[(1, 56, 56, 24), uint8] */; %128 = qnn.add(%127, %121, 0.410429f /* ty=float32 */, 137 /* ty=int32 */, 0.282426f /* ty=float32 */, 122 /* ty=int32 */, 0.448443f /* ty=float32 */, 130 /* ty=int32 */) /* ty=Tensor[(1, 56, 56, 24), uint8] */; %129 = fn (%FunctionVar_43_0: Tensor[(1, 56, 56, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 144), uint8] { %90 = qnn.conv2d(%FunctionVar_43_0, meta[relay.Constant][86] /* ty=Tensor[(1, 1, 24, 144), uint8] */, 130 /* ty=int32 */, 104 /* ty=int32 */, 0.448443f /* ty=float32 */, 0.0029434f /* ty=float32 */, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 144), int32] */; %91 = nn.bias_add(%90, meta[relay.Constant][87] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 144), int32] */; qnn.requantize(%91, 0.00131995f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 144), uint8] */ }; %130 = %129(%128) /* ty=Tensor[(1, 56, 56, 144), uint8] */; %131 = fn (%FunctionVar_42_0: Tensor[(1, 56, 56, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 144), uint8] { %88 = qnn.conv2d(%FunctionVar_42_0, meta[relay.Constant][84] /* ty=Tensor[(3, 3, 144, 1), uint8] */, 0 /* ty=int32 */, 144 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0171147f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 144), int32] */; %89 = nn.bias_add(%88, meta[relay.Constant][85] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 144), int32] */; qnn.requantize(%89, 0.000402683f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 144), uint8] */ }; %132 = %131(%130) /* ty=Tensor[(1, 28, 28, 144), uint8] */; %133 = fn (%FunctionVar_41_0: Tensor[(1, 28, 28, 144), uint8], 
PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] { %86 = qnn.conv2d(%FunctionVar_41_0, meta[relay.Constant][82] /* ty=Tensor[(1, 1, 144, 32), uint8] */, 0 /* ty=int32 */, 114 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.016776f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */; %87 = nn.bias_add(%86, meta[relay.Constant][83] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */; qnn.requantize(%87, 0.000394715f /* ty=float32 */, 0 /* ty=int32 */, 0.224783f /* ty=float32 */, 128 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */ }; %134 = %133(%132) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %135 = fn (%FunctionVar_40_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] { %84 = qnn.conv2d(%FunctionVar_40_0, meta[relay.Constant][80] /* ty=Tensor[(1, 1, 32, 192), uint8] */, 128 /* ty=int32 */, 122 /* ty=int32 */, 0.224783f /* ty=float32 */, 0.00210703f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */; %85 = nn.bias_add(%84, meta[relay.Constant][81] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */; qnn.requantize(%85, 0.000473626f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */ }; %136 = %135(%134) /* ty=Tensor[(1, 28, 28, 192), uint8] */; %137 = fn (%FunctionVar_39_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] { %82 = qnn.conv2d(%FunctionVar_39_0, meta[relay.Constant][78] /* ty=Tensor[(3, 3, 192, 1), uint8] */, 0 /* ty=int32 */, 111 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0671548f /* ty=float32 */, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */; %83 = nn.bias_add(%82, meta[relay.Constant][79] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */; qnn.requantize(%83, 0.00158005f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */ }; %138 = %137(%136) /* ty=Tensor[(1, 28, 28, 192), uint8] */; %139 = fn (%FunctionVar_38_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] { %80 = qnn.conv2d(%FunctionVar_38_0, meta[relay.Constant][76] /* ty=Tensor[(1, 1, 192, 32), uint8] */, 0 /* ty=int32 */, 148 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0199821f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */; %81 = nn.bias_add(%80, meta[relay.Constant][77] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */; qnn.requantize(%81, 0.000470149f /* ty=float32 */, 0 /* ty=int32 */, 0.231107f /* ty=float32 */, 120 /* ty=int32 */, axis=3, out_dtype="uint8") 
/* ty=Tensor[(1, 28, 28, 32), uint8] */ }; %140 = %139(%138) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %141 = qnn.add(%140, %134, 0.231107f /* ty=float32 */, 120 /* ty=int32 */, 0.224783f /* ty=float32 */, 128 /* ty=int32 */, 0.271938f /* ty=float32 */, 130 /* ty=int32 */) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %142 = fn (%FunctionVar_37_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] { %78 = qnn.conv2d(%FunctionVar_37_0, meta[relay.Constant][74] /* ty=Tensor[(1, 1, 32, 192), uint8] */, 130 /* ty=int32 */, 119 /* ty=int32 */, 0.271938f /* ty=float32 */, 0.00149126f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */; %79 = nn.bias_add(%78, meta[relay.Constant][75] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */; qnn.requantize(%79, 0.00040553f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */ }; %143 = %142(%141) /* ty=Tensor[(1, 28, 28, 192), uint8] */; %144 = fn (%FunctionVar_36_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] { %76 = qnn.conv2d(%FunctionVar_36_0, meta[relay.Constant][72] /* ty=Tensor[(3, 3, 192, 1), uint8] */, 0 /* ty=int32 */, 89 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0805961f /* ty=float32 */, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */; %77 = nn.bias_add(%76, meta[relay.Constant][73] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */; qnn.requantize(%77, 0.0018963f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */ }; %145 = %144(%143) /* ty=Tensor[(1, 28, 28, 192), uint8] */; %146 = fn (%FunctionVar_35_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] { %74 = qnn.conv2d(%FunctionVar_35_0, meta[relay.Constant][70] /* ty=Tensor[(1, 1, 192, 32), uint8] */, 0 /* ty=int32 */, 127 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.018966f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */; %75 = nn.bias_add(%74, meta[relay.Constant][71] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */; qnn.requantize(%75, 0.00044624f /* ty=float32 */, 0 /* ty=int32 */, 0.268485f /* ty=float32 */, 124 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */ }; %147 = %146(%145) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %148 = qnn.add(%147, %141, 0.268485f /* ty=float32 */, 124 /* ty=int32 */, 0.271938f /* ty=float32 */, 130 /* ty=int32 */, 0.349583f /* ty=float32 */, 124 /* ty=int32 */) /* ty=Tensor[(1, 28, 28, 32), uint8] */; %149 = fn (%FunctionVar_34_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] { %72 = qnn.conv2d(%FunctionVar_34_0, 
meta[relay.Constant][68] /* ty=Tensor[(1, 1, 32, 192), uint8] */, 124 /* ty=int32 */, 129 /* ty=int32 */, 0.349583f /* ty=float32 */, 0.00188541f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */; %73 = nn.bias_add(%72, meta[relay.Constant][69] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */; qnn.requantize(%73, 0.000659109f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */ }; %150 = %149(%148) /* ty=Tensor[(1, 28, 28, 192), uint8] */; %151 = fn (%FunctionVar_33_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 192), uint8] { %70 = qnn.conv2d(%FunctionVar_33_0, meta[relay.Constant][66] /* ty=Tensor[(3, 3, 192, 1), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00993869f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 192), int32] */; %71 = nn.bias_add(%70, meta[relay.Constant][67] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 192), int32] */; qnn.requantize(%71, 0.000233842f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 192), uint8] */ }; %152 = %151(%150) /* ty=Tensor[(1, 14, 14, 192), uint8] */; %153 = fn (%FunctionVar_32_0: Tensor[(1, 14, 14, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %68 = qnn.conv2d(%FunctionVar_32_0, meta[relay.Constant][64] /* ty=Tensor[(1, 1, 192, 64), uint8] */, 0 /* ty=int32 */, 144 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0145759f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %69 = nn.bias_add(%68, meta[relay.Constant][65] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%69, 0.000342948f /* ty=float32 */, 0 /* ty=int32 */, 0.193133f /* ty=float32 */, 125 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %154 = %153(%152) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %155 = fn (%FunctionVar_31_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %66 = qnn.conv2d(%FunctionVar_31_0, meta[relay.Constant][62] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 125 /* ty=int32 */, 126 /* ty=int32 */, 0.193133f /* ty=float32 */, 0.00157124f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %67 = nn.bias_add(%66, meta[relay.Constant][63] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%67, 0.000303459f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %156 = %155(%154) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %157 = fn (%FunctionVar_30_0: Tensor[(1, 
14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %64 = qnn.conv2d(%FunctionVar_30_0, meta[relay.Constant][60] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 105 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0612184f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %65 = nn.bias_add(%64, meta[relay.Constant][61] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%65, 0.00144038f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %158 = %157(%156) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %159 = fn (%FunctionVar_29_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %62 = qnn.conv2d(%FunctionVar_29_0, meta[relay.Constant][58] /* ty=Tensor[(1, 1, 384, 64), uint8] */, 0 /* ty=int32 */, 127 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0187498f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %63 = nn.bias_add(%62, meta[relay.Constant][59] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%63, 0.000441155f /* ty=float32 */, 0 /* ty=int32 */, 0.180298f /* ty=float32 */, 108 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %160 = %159(%158) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %161 = qnn.add(%160, %154, 0.180298f /* ty=float32 */, 108 /* ty=int32 */, 0.193133f /* ty=float32 */, 125 /* ty=int32 */, 0.197618f /* ty=float32 */, 120 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %162 = fn (%FunctionVar_28_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %60 = qnn.conv2d(%FunctionVar_28_0, meta[relay.Constant][56] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 120 /* ty=int32 */, 135 /* ty=int32 */, 0.197618f /* ty=float32 */, 0.00145681f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %61 = nn.bias_add(%60, meta[relay.Constant][57] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%61, 0.000287892f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %163 = %162(%161) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %164 = fn (%FunctionVar_27_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %58 = qnn.conv2d(%FunctionVar_27_0, meta[relay.Constant][54] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 133 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0509263f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %59 = nn.bias_add(%58, 
meta[relay.Constant][55] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%59, 0.00119822f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %165 = %164(%163) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %166 = fn (%FunctionVar_26_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] { %56 = qnn.conv2d(%FunctionVar_26_0, meta[relay.Constant][52] /* ty=Tensor[(1, 1, 384, 64), uint8] */, 0 /* ty=int32 */, 126 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0130952f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %57 = nn.bias_add(%56, meta[relay.Constant][53] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%57, 0.000308111f /* ty=float32 */, 0 /* ty=int32 */, 0.152346f /* ty=float32 */, 125 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %167 = %166(%165) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %168 = qnn.add(%167, %161, 0.152346f /* ty=float32 */, 125 /* ty=int32 */, 0.197618f /* ty=float32 */, 120 /* ty=int32 */, 0.209317f /* ty=float32 */, 123 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %169 = fn (%FunctionVar_25_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %54 = qnn.conv2d(%FunctionVar_25_0, meta[relay.Constant][50] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 123 /* ty=int32 */, 127 /* ty=int32 */, 0.209317f /* ty=float32 */, 0.00133576f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %55 = nn.bias_add(%54, meta[relay.Constant][51] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%55, 0.000279598f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %170 = %169(%168) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %171 = fn (%FunctionVar_24_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %52 = qnn.conv2d(%FunctionVar_24_0, meta[relay.Constant][48] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 156 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0404159f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %53 = nn.bias_add(%52, meta[relay.Constant][49] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%53, 0.000950924f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %172 = %171(%170) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %173 = fn (%FunctionVar_23_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] 
{ %50 = qnn.conv2d(%FunctionVar_23_0, meta[relay.Constant][46] /* ty=Tensor[(1, 1, 384, 64), uint8] */, 0 /* ty=int32 */, 148 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0192269f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */; %51 = nn.bias_add(%50, meta[relay.Constant][47] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */; qnn.requantize(%51, 0.00045238f /* ty=float32 */, 0 /* ty=int32 */, 0.16256f /* ty=float32 */, 119 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */ }; %174 = %173(%172) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %175 = qnn.add(%174, %168, 0.16256f /* ty=float32 */, 119 /* ty=int32 */, 0.209317f /* ty=float32 */, 123 /* ty=int32 */, 0.227132f /* ty=float32 */, 122 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 64), uint8] */; %176 = fn (%FunctionVar_22_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %48 = qnn.conv2d(%FunctionVar_22_0, meta[relay.Constant][44] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 122 /* ty=int32 */, 132 /* ty=int32 */, 0.227132f /* ty=float32 */, 0.00162901f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %49 = nn.bias_add(%48, meta[relay.Constant][45] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%49, 0.000370001f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %177 = %176(%175) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %178 = fn (%FunctionVar_21_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] { %46 = qnn.conv2d(%FunctionVar_21_0, meta[relay.Constant][42] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 142 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0308997f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */; %47 = nn.bias_add(%46, meta[relay.Constant][43] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */; qnn.requantize(%47, 0.000727024f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */ }; %179 = %178(%177) /* ty=Tensor[(1, 14, 14, 384), uint8] */; %180 = fn (%FunctionVar_20_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] { %44 = qnn.conv2d(%FunctionVar_20_0, meta[relay.Constant][40] /* ty=Tensor[(1, 1, 384, 96), uint8] */, 0 /* ty=int32 */, 128 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00727967f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */; %45 = nn.bias_add(%44, meta[relay.Constant][41] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */; qnn.requantize(%45, 0.000171279f /* ty=float32 */, 0 /* ty=int32 
*/, 0.172015f /* ty=float32 */, 128 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */ }; %181 = %180(%179) /* ty=Tensor[(1, 14, 14, 96), uint8] */; %182 = fn (%FunctionVar_19_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] { %42 = qnn.conv2d(%FunctionVar_19_0, meta[relay.Constant][38] /* ty=Tensor[(1, 1, 96, 576), uint8] */, 128 /* ty=int32 */, 131 /* ty=int32 */, 0.172015f /* ty=float32 */, 0.00161979f /* ty=float32 */, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */; %43 = nn.bias_add(%42, meta[relay.Constant][39] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */; qnn.requantize(%43, 0.000278629f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */ }; %183 = %182(%181) /* ty=Tensor[(1, 14, 14, 576), uint8] */; %184 = fn (%FunctionVar_18_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] { %40 = qnn.conv2d(%FunctionVar_18_0, meta[relay.Constant][36] /* ty=Tensor[(3, 3, 576, 1), uint8] */, 0 /* ty=int32 */, 66 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0708156f /* ty=float32 */, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */; %41 = nn.bias_add(%40, meta[relay.Constant][37] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */; qnn.requantize(%41, 0.00166618f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */ }; %185 = %184(%183) /* ty=Tensor[(1, 14, 14, 576), uint8] */; %186 = fn (%FunctionVar_17_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] { %38 = qnn.conv2d(%FunctionVar_17_0, meta[relay.Constant][34] /* ty=Tensor[(1, 1, 576, 96), uint8] */, 0 /* ty=int32 */, 135 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00841983f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */; %39 = nn.bias_add(%38, meta[relay.Constant][35] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */; qnn.requantize(%39, 0.000198106f /* ty=float32 */, 0 /* ty=int32 */, 0.128486f /* ty=float32 */, 127 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */ }; %187 = %186(%185) /* ty=Tensor[(1, 14, 14, 96), uint8] */; %188 = qnn.add(%187, %181, 0.128486f /* ty=float32 */, 127 /* ty=int32 */, 0.172015f /* ty=float32 */, 128 /* ty=int32 */, 0.179783f /* ty=float32 */, 126 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 96), uint8] */; %189 = fn (%FunctionVar_16_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] { %36 = qnn.conv2d(%FunctionVar_16_0, meta[relay.Constant][32] /* ty=Tensor[(1, 1, 96, 576), uint8] */, 126 /* ty=int32 */, 138 /* ty=int32 */, 0.179783f /* ty=float32 */, 
0.00180177f /* ty=float32 */, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */; %37 = nn.bias_add(%36, meta[relay.Constant][33] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */; qnn.requantize(%37, 0.000323928f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */ }; %190 = %189(%188) /* ty=Tensor[(1, 14, 14, 576), uint8] */; %191 = fn (%FunctionVar_15_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] { %34 = qnn.conv2d(%FunctionVar_15_0, meta[relay.Constant][30] /* ty=Tensor[(3, 3, 576, 1), uint8] */, 0 /* ty=int32 */, 154 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0698695f /* ty=float32 */, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */; %35 = nn.bias_add(%34, meta[relay.Constant][31] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */; qnn.requantize(%35, 0.00164392f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */ }; %192 = %191(%190) /* ty=Tensor[(1, 14, 14, 576), uint8] */; %193 = fn (%FunctionVar_14_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] { %32 = qnn.conv2d(%FunctionVar_14_0, meta[relay.Constant][28] /* ty=Tensor[(1, 1, 576, 96), uint8] */, 0 /* ty=int32 */, 155 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0236749f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */; %33 = nn.bias_add(%32, meta[relay.Constant][29] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */; qnn.requantize(%33, 0.000557034f /* ty=float32 */, 0 /* ty=int32 */, 0.190479f /* ty=float32 */, 127 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */ }; %194 = %193(%192) /* ty=Tensor[(1, 14, 14, 96), uint8] */; %195 = qnn.add(%194, %188, 0.190479f /* ty=float32 */, 127 /* ty=int32 */, 0.179783f /* ty=float32 */, 126 /* ty=int32 */, 0.245143f /* ty=float32 */, 126 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 96), uint8] */; %196 = fn (%FunctionVar_13_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] { %30 = qnn.conv2d(%FunctionVar_13_0, meta[relay.Constant][26] /* ty=Tensor[(1, 1, 96, 576), uint8] */, 126 /* ty=int32 */, 125 /* ty=int32 */, 0.245143f /* ty=float32 */, 0.00139799f /* ty=float32 */, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */; %31 = nn.bias_add(%30, meta[relay.Constant][27] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */; qnn.requantize(%31, 0.000342707f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */ }; %197 = %196(%195) /* ty=Tensor[(1, 14, 14, 
576), uint8] */; %198 = fn (%FunctionVar_12_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 576), uint8] { %28 = qnn.conv2d(%FunctionVar_12_0, meta[relay.Constant][24] /* ty=Tensor[(3, 3, 576, 1), uint8] */, 0 /* ty=int32 */, 92 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0148872f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 576), int32] */; %29 = nn.bias_add(%28, meta[relay.Constant][25] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 576), int32] */; qnn.requantize(%29, 0.000350273f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 576), uint8] */ }; %199 = %198(%197) /* ty=Tensor[(1, 7, 7, 576), uint8] */; %200 = fn (%FunctionVar_11_0: Tensor[(1, 7, 7, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] { %26 = qnn.conv2d(%FunctionVar_11_0, meta[relay.Constant][22] /* ty=Tensor[(1, 1, 576, 160), uint8] */, 0 /* ty=int32 */, 139 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00922072f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */; %27 = nn.bias_add(%26, meta[relay.Constant][23] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */; qnn.requantize(%27, 0.00021695f /* ty=float32 */, 0 /* ty=int32 */, 0.131885f /* ty=float32 */, 131 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */ }; %201 = %200(%199) /* ty=Tensor[(1, 7, 7, 160), uint8] */; %202 = fn (%FunctionVar_10_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] { %24 = qnn.conv2d(%FunctionVar_10_0, meta[relay.Constant][20] /* ty=Tensor[(1, 1, 160, 960), uint8] */, 131 /* ty=int32 */, 141 /* ty=int32 */, 0.131885f /* ty=float32 */, 0.00211018f /* ty=float32 */, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */; %25 = nn.bias_add(%24, meta[relay.Constant][21] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */; qnn.requantize(%25, 0.000278301f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */ }; %203 = %202(%201) /* ty=Tensor[(1, 7, 7, 960), uint8] */; %204 = fn (%FunctionVar_9_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] { %22 = qnn.conv2d(%FunctionVar_9_0, meta[relay.Constant][18] /* ty=Tensor[(3, 3, 960, 1), uint8] */, 0 /* ty=int32 */, 146 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0409658f /* ty=float32 */, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */; %23 = nn.bias_add(%22, meta[relay.Constant][19] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */; qnn.requantize(%23, 0.000963862f /* ty=float32 */, 0 /* ty=int32 
*/, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */ }; %205 = %204(%203) /* ty=Tensor[(1, 7, 7, 960), uint8] */; %206 = fn (%FunctionVar_8_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] { %20 = qnn.conv2d(%FunctionVar_8_0, meta[relay.Constant][16] /* ty=Tensor[(1, 1, 960, 160), uint8] */, 0 /* ty=int32 */, 136 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00783742f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */; %21 = nn.bias_add(%20, meta[relay.Constant][17] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */; qnn.requantize(%21, 0.000184403f /* ty=float32 */, 0 /* ty=int32 */, 0.104162f /* ty=float32 */, 130 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */ }; %207 = %206(%205) /* ty=Tensor[(1, 7, 7, 160), uint8] */; %208 = qnn.add(%207, %201, 0.104162f /* ty=float32 */, 130 /* ty=int32 */, 0.131885f /* ty=float32 */, 131 /* ty=int32 */, 0.15034f /* ty=float32 */, 133 /* ty=int32 */) /* ty=Tensor[(1, 7, 7, 160), uint8] */; %209 = fn (%FunctionVar_7_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] { %18 = qnn.conv2d(%FunctionVar_7_0, meta[relay.Constant][14] /* ty=Tensor[(1, 1, 160, 960), uint8] */, 133 /* ty=int32 */, 129 /* ty=int32 */, 0.15034f /* ty=float32 */, 0.00163117f /* ty=float32 */, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */; %19 = nn.bias_add(%18, meta[relay.Constant][15] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */; qnn.requantize(%19, 0.00024523f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */ }; %210 = %209(%208) /* ty=Tensor[(1, 7, 7, 960), uint8] */; %211 = fn (%FunctionVar_6_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] { %16 = qnn.conv2d(%FunctionVar_6_0, meta[relay.Constant][12] /* ty=Tensor[(3, 3, 960, 1), uint8] */, 0 /* ty=int32 */, 102 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0439425f /* ty=float32 */, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */; %17 = nn.bias_add(%16, meta[relay.Constant][13] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */; qnn.requantize(%17, 0.0010339f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */ }; %212 = %211(%210) /* ty=Tensor[(1, 7, 7, 960), uint8] */; %213 = fn (%FunctionVar_5_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] { %14 = qnn.conv2d(%FunctionVar_5_0, meta[relay.Constant][10] /* ty=Tensor[(1, 1, 960, 160), uint8] */, 0 /* ty=int32 */, 132 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0380282f /* ty=float32 */, padding=[0, 0, 
0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */; %15 = nn.bias_add(%14, meta[relay.Constant][11] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */; qnn.requantize(%15, 0.000894746f /* ty=float32 */, 0 /* ty=int32 */, 0.179058f /* ty=float32 */, 134 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */ }; %214 = %213(%212) /* ty=Tensor[(1, 7, 7, 160), uint8] */; %215 = qnn.add(%214, %208, 0.179058f /* ty=float32 */, 134 /* ty=int32 */, 0.15034f /* ty=float32 */, 133 /* ty=int32 */, 0.220417f /* ty=float32 */, 131 /* ty=int32 */) /* ty=Tensor[(1, 7, 7, 160), uint8] */; %216 = fn (%FunctionVar_4_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] { %12 = qnn.conv2d(%FunctionVar_4_0, meta[relay.Constant][8] /* ty=Tensor[(1, 1, 160, 960), uint8] */, 131 /* ty=int32 */, 131 /* ty=int32 */, 0.220417f /* ty=float32 */, 0.00206415f /* ty=float32 */, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */; %13 = nn.bias_add(%12, meta[relay.Constant][9] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */; qnn.requantize(%13, 0.000454974f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */ }; %217 = %216(%215) /* ty=Tensor[(1, 7, 7, 960), uint8] */; %218 = fn (%FunctionVar_3_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] { %10 = qnn.conv2d(%FunctionVar_3_0, meta[relay.Constant][6] /* ty=Tensor[(3, 3, 960, 1), uint8] */, 0 /* ty=int32 */, 201 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.158864f /* ty=float32 */, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */; %11 = nn.bias_add(%10, meta[relay.Constant][7] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */; qnn.requantize(%11, 0.00373784f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */ }; %219 = %218(%217) /* ty=Tensor[(1, 7, 7, 960), uint8] */; %220 = fn (%FunctionVar_2_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 320), uint8] { %8 = qnn.conv2d(%FunctionVar_2_0, meta[relay.Constant][4] /* ty=Tensor[(1, 1, 960, 320), uint8] */, 0 /* ty=int32 */, 111 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00962106f /* ty=float32 */, padding=[0, 0, 0, 0], channels=320, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 320), int32] */; %9 = nn.bias_add(%8, meta[relay.Constant][5] /* ty=Tensor[(320), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 320), int32] */; qnn.requantize(%9, 0.000226369f /* ty=float32 */, 0 /* ty=int32 */, 0.131263f /* ty=float32 */, 143 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 320), uint8] */ }; %221 = %220(%219) /* ty=Tensor[(1, 7, 7, 320), uint8] */; %222 = fn (%FunctionVar_1_0: Tensor[(1, 7, 7, 320), uint8], 
PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 1280), uint8] { %6 = qnn.conv2d(%FunctionVar_1_0, meta[relay.Constant][2] /* ty=Tensor[(1, 1, 320, 1280), uint8] */, 143 /* ty=int32 */, 128 /* ty=int32 */, 0.131263f /* ty=float32 */, 0.00524072f /* ty=float32 */, padding=[0, 0, 0, 0], channels=1280, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 1280), int32] */; %7 = nn.bias_add(%6, meta[relay.Constant][3] /* ty=Tensor[(1280), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 1280), int32] */; qnn.requantize(%7, 0.000687913f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 1280), uint8] */ }; %223 = %222(%221) /* ty=Tensor[(1, 7, 7, 1280), uint8] */; %224 = fn (%FunctionVar_0_02: Tensor[(1, 7, 7, 1280), uint8], PartitionedFromPattern="cast_nn.avg_pool2d_cast_", Composite="vsi_npu.qnn_avgpool2d") -> Tensor[(1, 1, 1, 1280), uint8] { %4 = cast(%FunctionVar_0_02, dtype="int32") /* ty=Tensor[(1, 7, 7, 1280), int32] */; %5 = nn.avg_pool2d(%4, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC") /* ty=Tensor[(1, 1, 1, 1280), int32] */; cast(%5, dtype="uint8") /* ty=Tensor[(1, 1, 1, 1280), uint8] */ }; %225 = %224(%223) /* ty=Tensor[(1, 1, 1, 1280), uint8] */; %226 = fn (%FunctionVar_0_01: Tensor[(1, 1, 1, 1280), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 1, 1, 1001), uint8] { %2 = qnn.conv2d(%FunctionVar_0_01, meta[relay.Constant][0] /* ty=Tensor[(1, 1, 1280, 1001), uint8] */, 0 /* ty=int32 */, 114 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00168582f /* ty=float32 */, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 1, 1, 1001), int32] */; %3 = nn.bias_add(%2, meta[relay.Constant][1] /* ty=Tensor[(1001), int32] */, axis=3) /* ty=Tensor[(1, 1, 1, 1001), int32] */; qnn.requantize(%3, 3.96648e-05f /* ty=float32 */, 0 /* ty=int32 */, 0.0760416f /* ty=float32 */, 72 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 1, 1, 1001), uint8] */ }; %227 = %226(%225) /* ty=Tensor[(1, 1, 1, 1001), uint8] */; %228 = reshape(%227, newshape=[1, 1001]) /* ty=Tensor[(1, 1001), uint8] */; %229 = fn (%FunctionVar_0_0: Tensor[(1, 1001), uint8], PartitionedFromPattern="qnn.dequantize_nn.softmax_qnn.quantize_", Composite="vsi_npu.qnn_softmax") -> Tensor[(1, 1001), uint8] { %0 = qnn.dequantize(%FunctionVar_0_0, 0.0760416f /* ty=float32 */, 72 /* ty=int32 */) /* ty=Tensor[(1, 1001), float32] */; %1 = nn.softmax(%0, axis=1) /* ty=Tensor[(1, 1001), float32] */; qnn.quantize(%1, 0.00390625f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="uint8") /* ty=Tensor[(1, 1001), uint8] */ }; %229(%228) /* ty=Tensor[(1, 1001), uint8] */ } This is important----> name_node.value() == tvmgen_default_vsi_npu_0 GraphMakerImpl::Create TensorMakerImpl::InferCall: vsi_npu.qnn_softmax TensorMakerImpl::InferCall: reshape TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d 
TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.add TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d W [HandleLayoutInfer:268]Op 162: default layout inference pass. VsiNpuModule::GetFunction: get_symbol VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early MBNtest_vsi_tflite_model_all.py:120: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the new recommended usage. graph, lib, params = relay.build(mod, target, params=params) VsiNpuModule::SaveToBinary SaveToBinary: nbg size = 5832768 SaveToBinary: input size = 1 SaveToBinary: output size = 1 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule::SaveToBinary2 Printing device code to device_code.cl... 
VsiNpuModule::LoadFromBinary LoadFromBinary: nbg size = 5832768 LoadFromBinary: input size = 1 LoadFromBinary: output size = 1 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 (1, 224, 224, 3) ############ VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0 Process Graph: 6 ms or 6253 us VsiNpuModule::GetFunction: size: 2 [[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 112 66 7 1 24 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] ```
https://github.com/VeriSilicon/tvm/blob/vsi_npu/tests/python/contrib/test_vsi_npu/test_vsi_pytorch_model_all.py
I built TVM with TIM-VX 1.1.42 on the x86_64 simulator and cross-compiled for the Khadas VIM3 Pro. Running the test_vsi_tflite_model_all.py script with mobilenet_v1_1.0_224_quant.tflite works normally, but I get empty outputs when switching to the other TFLite models.
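For reference, the cross-compile + RPC flow I am describing looks roughly like the sketch below. This is a simplified sketch, not the exact test script: the `partition_for_vsi_npu` helper name, the board address, and the cross-compiler are placeholders based on the fork's test utilities, and the exact `relay.build` / graph executor API shape depends on the TVM version in the vsi_npu branch.

```python
import numpy as np
import tflite
import tvm
from tvm import relay, rpc
from tvm.contrib import cc, graph_executor
from tvm.relay.op.contrib import vsi_npu  # partition helper from the fork (assumed module path)

# Load the quantized TFLite model.
with open("mobilenet_v1_1.0_224_quant.tflite", "rb") as f:
    tflite_model = tflite.Model.GetRootAsModel(f.read(), 0)

shape_dict = {"input": (1, 224, 224, 3)}
dtype_dict = {"input": "uint8"}
mod, params = relay.frontend.from_tflite(tflite_model, shape_dict, dtype_dict)

# Offload supported patterns to the VSI NPU, then build for the aarch64 board.
mod = vsi_npu.partition_for_vsi_npu(mod, params)  # assumed helper name
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm -mtriple=aarch64-linux-gnu", params=params)

# Cross-compile the shared library and run it on the board over RPC.
lib.export_library("model_lib.so", cc.cross_compiler("aarch64-linux-gnu-g++"))
remote = rpc.connect("192.168.1.100", 9090)  # board address is a placeholder
remote.upload("model_lib.so")
rlib = remote.load_module("model_lib.so")

dev = remote.cpu(0)
module = graph_executor.GraphModule(rlib["default"](dev))
module.set_input("input", np.random.randint(0, 256, size=(1, 224, 224, 3)).astype("uint8"))
module.run()
print(module.get_output(0).asnumpy())  # all-zero output is the symptom described above
```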
TIM-VX version: 1.1.42, using the aarch64_A311D_6.4.10.2 prebuilt SDK on the Khadas VIM3 Pro; x86_64_linux prebuilt SDK version: 6.4.10.2.
TVM branch commit id: b822ec32702e2676dce1e430221e8efc05c98935
Here is the Galcore version on the Khadas VIM3 Pro (6.4.6.2):
``` khadas@Khadas:~$ sudo dmesg |grep -i galcore [sudo] password for khadas: [ 0.000000] OF: reserved mem: initialized node linux,galcore, compatible id shared-dma-pool [ 15.953889] galcore irq number is 36. [ 15.953891] Galcore version 6.4.0.229426 [ 36.656753] galcore: no symbol version for module_layout [ 36.656799] galcore: loading out-of-tree module taints kernel. [ 36.682762] galcore irq number is 36. [ 36.682770] Galcore version 6.4.6.2 [ 795.670707] [galcore]: GPU[0] hang, automatic recovery. [ 795.675268] [galcore]: recovery done [ 857.110592] [galcore]: GPU[0] hang, automatic recovery. [ 857.115242] [galcore]: recovery done [ 959.510416] [galcore]: GPU[0] hang, automatic recovery. [ 959.515050] [galcore]: recovery done [ 1020.950307] [galcore]: GPU[0] hang, automatic recovery. [ 1020.954999] [galcore]: recovery done [30594.015764] [galcore]: GPU[0] hang, automatic recovery. [30594.020508] [galcore]: recovery done [30655.455503] [galcore]: GPU[0] hang, automatic recovery. [30655.460118] [galcore]: recovery done ```
The TIM-VX unit tests run successfully on the Khadas VIM3 Pro.
``` khadas@Khadas:~/TIM-VX-1.1.42/build/install/bin$ ./unit_test Running main() from /home/niuniu/TIM-VX-1.1.42/build/_deps/googletest-src/googletest/src/gtest_main.cc [==========] Running 175 tests from 59 test suites. [----------] Global test environment set-up. [----------] 1 test from compile_option [ RUN ] compile_option.relax_mode [ OK ] compile_option.relax_mode (1 ms) [----------] 1 test from compile_option (1 ms total) [----------] 1 test from Context [ RUN ] Context.create [ OK ] Context.create (43 ms) [----------] 1 test from Context (44 ms total) [----------] 2 tests from graph [ RUN ] graph.gen_binary_graph_with_empty_graph E [/home/niuniu/TIM-VX-1.1.42/src/tim/vx/internal/src/vsi_nn_graph_optimization.c:_graph_optimization_convert_int8_to_uint8:837]CHECK STATUS(-1:A generic error code, used when no other describes the error.) E [/home/niuniu/TIM-VX-1.1.42/src/tim/vx/internal/src/vsi_nn_graph_optimization.c:vsi_nn_OptimizeGraph:872]CHECK STATUS(-1:A generic error code, used when no other describes the error.) [ OK ] graph.gen_binary_graph_with_empty_graph (7 ms) [ RUN ] graph.gen_binary_graph_with_simple_add [ OK ] graph.gen_binary_graph_with_simple_add (69 ms) [----------] 2 tests from graph (77 ms total) [----------] 2 tests from Linear [ RUN ] Linear.shape_5_1_fp32 [ OK ] Linear.shape_5_1_fp32 (17 ms) [ RUN ] Linear.shape_5_1_fp32_omit_b [ OK ] Linear.shape_5_1_fp32_omit_b (32 ms) [----------] 2 tests from Linear (51 ms total) [----------] 2 tests from Gelu [ RUN ] Gelu.shape_5_1_fp32_approximate [ OK ] Gelu.shape_5_1_fp32_approximate (225 ms) [ RUN ] Gelu.shape_5_1_uint8_Quantized [ OK ] Gelu.shape_5_1_uint8_Quantized (11 ms) [----------] 2 tests from Gelu (237 ms total) [----------] 1 test from HardSigmoid [ RUN ] HardSigmoid.shape_5_1_uint8_Quantized [ OK ] HardSigmoid.shape_5_1_uint8_Quantized (7 ms) [----------] 1 test from HardSigmoid (7 ms total) [----------] 1 test from Elu [ RUN ] Elu.shape_5_1_fp32 [ OK ] Elu.shape_5_1_fp32 (74 ms) [----------] 1 test from Elu (74 ms total) [----------] 3 tests from AddN [ RUN ] AddN.shape_2_2_int32 [ OK ] AddN.shape_2_2_int32 (22 ms) [ RUN ] AddN.shape_3_1_float32 [ OK ] AddN.shape_3_1_float32 (12 ms) [ RUN ] AddN.shape_2_2_uint8_Quantized [ OK ] AddN.shape_2_2_uint8_Quantized (70 ms) [----------] 3 tests from AddN (104 ms total) [----------] 2 tests from ArgMax [ RUN ] ArgMax.shape_2_2_axis_0 [ OK ] ArgMax.shape_2_2_axis_0 (58 ms) [ RUN ] ArgMax.shape_2_2_axis_1 [ OK ] ArgMax.shape_2_2_axis_1 (56 ms) [----------] 2 tests from ArgMax (115 ms total) [----------] 2 tests from ArgMin [ RUN ] ArgMin.shape_2_2_axis_0 [ OK ] ArgMin.shape_2_2_axis_0 (68 ms) [ RUN ] ArgMin.shape_2_2_axis_1 [ OK ] ArgMin.shape_2_2_axis_1 (56 ms) [----------] 2 tests from ArgMin (124 ms total) [----------] 4 tests from AVG [ RUN ] AVG.shape_3_3_1_2_fp32_kernel_2_stride_1 [ OK ] AVG.shape_3_3_1_2_fp32_kernel_2_stride_1 (69 ms) [ RUN ] AVG.shape_3_3_1_1_fp32_kernel_2_stride_1 [ OK ] AVG.shape_3_3_1_1_fp32_kernel_2_stride_1 (12 ms) [ RUN ] AVG.shape_3_3_1_1_uint8_kernel_2_stride_1 [ OK ] AVG.shape_3_3_1_1_uint8_kernel_2_stride_1 (7 ms) [ RUN ] AVG.shape_60_52_3_5_fp32_kernel_35_stride_5 [ OK ] AVG.shape_60_52_3_5_fp32_kernel_35_stride_5 (45 ms) [----------] 4 tests from AVG (135 ms total) [----------] 2 tests from AVG_ANDROID [ RUN ] AVG_ANDROID.shape_60_52_3_5_fp32_kernel_35_stride_5 [ OK ] AVG_ANDROID.shape_60_52_3_5_fp32_kernel_35_stride_5 (54 ms) [ RUN ] AVG_ANDROID.shape_60_52_3_5_uint8_kernel_35_stride_5 [ OK ] 
AVG_ANDROID.shape_60_52_3_5_uint8_kernel_35_stride_5 (59 ms) [----------] 2 tests from AVG_ANDROID (113 ms total) [----------] 1 test from Batch2Space [ RUN ] Batch2Space.shape_1_1_3_4_fp32_whcn [ OK ] Batch2Space.shape_1_1_3_4_fp32_whcn (7 ms) [----------] 1 test from Batch2Space (7 ms total) [----------] 2 tests from BatchNorm [ RUN ] BatchNorm.shape_3_3_2_1_fp32_cwhn [ OK ] BatchNorm.shape_3_3_2_1_fp32_cwhn (103 ms) [ RUN ] BatchNorm.shape_3_3_2_1_fp32_whcn [ OK ] BatchNorm.shape_3_3_2_1_fp32_whcn (10 ms) [----------] 2 tests from BatchNorm (114 ms total) [----------] 3 tests from Conv1d [ RUN ] Conv1d.shape_3_6_1_float_ksize_1_stride_1_weights_3_no_bias_wcn [ OK ] Conv1d.shape_3_6_1_float_ksize_1_stride_1_weights_3_no_bias_wcn (58 ms) [ RUN ] Conv1d.shape_6_2_1_uint8_ksize_6_stride_1_weights_2_wcn [ OK ] Conv1d.shape_6_2_1_uint8_ksize_6_stride_1_weights_2_wcn (11 ms) [ RUN ] Conv1d.shape_6_2_1_uint8_ksize_3_stride_1_pad_1_weights_2_no_bias_wcn [ OK ] Conv1d.shape_6_2_1_uint8_ksize_3_stride_1_pad_1_weights_2_no_bias_wcn (8 ms) [----------] 3 tests from Conv1d (77 ms total) [----------] 20 tests from Conv2d [ RUN ] Conv2d.shape_4_2_1_1_float32_PaddingTest [ OK ] Conv2d.shape_4_2_1_1_float32_PaddingTest (60 ms) [ RUN ] Conv2d.shape_4_2_2_2_float32_PointwiseTest [ OK ] Conv2d.shape_4_2_2_2_float32_PointwiseTest (61 ms) [ RUN ] Conv2d.shape_4_2_1_2_float32_SimpleTest [ OK ] Conv2d.shape_4_2_1_2_float32_SimpleTest (22 ms) [ RUN ] Conv2d.shape_4_2_2_2_float32_SimpleChannelsTest [ OK ] Conv2d.shape_4_2_2_2_float32_SimpleChannelsTest (32 ms) [ RUN ] Conv2d.shape_6_3_1_1_float32_SimpleAnisotropicStridesTest [ OK ] Conv2d.shape_6_3_1_1_float32_SimpleAnisotropicStridesTest (20 ms) [ RUN ] Conv2d.shape_4_3_1_1_float32_HandCalculatedTest [ OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedTest (16 ms) [ RUN ] Conv2d.shape_4_3_1_1_float32_HandCalculatedConstFilterTest [ OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedConstFilterTest (16 ms) [ RUN ] Conv2d.shape_4_3_1_1_float32_HandCalculatedBiasTest [ OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedBiasTest (24 ms) [ RUN ] Conv2d.shape_4_3_1_1_float32_HandCalculatedValidTest [ OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedValidTest (26 ms) [ RUN ] Conv2d.shape_4_2_2_2_float32_DisabledPointwiseMultifilterTest [ OK ] Conv2d.shape_4_2_2_2_float32_DisabledPointwiseMultifilterTest (13 ms) [ RUN ] Conv2d.shape_9_9_1_1_float32_SimpleDilationTest [ OK ] Conv2d.shape_9_9_1_1_float32_SimpleDilationTest (19 ms) [ RUN ] Conv2d.shape_4_2_1_2_float32_StrideTest [ OK ] Conv2d.shape_4_2_1_2_float32_StrideTest (16 ms) [ RUN ] Conv2d.shape_4_2_1_2_float32_InputAndFilterSameWidthHeightTest [ OK ] Conv2d.shape_4_2_1_2_float32_InputAndFilterSameWidthHeightTest (14 ms) [ RUN ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest1 [ OK ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest1 (10 ms) [ RUN ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest2 [ OK ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest2 (28 ms) [ RUN ] Conv2d.shape_6_3_1_1_uint8_AnisotropicStridesQuantizedTest [ OK ] Conv2d.shape_6_3_1_1_uint8_AnisotropicStridesQuantizedTest (14 ms) [ RUN ] Conv2d.shape_9_9_1_1_uint8_DilationQuantizedTest [ OK ] Conv2d.shape_9_9_1_1_uint8_DilationQuantizedTest (12 ms) [ RUN ] Conv2d.shape_3_2_2_1_int8_QuantizedPerTensorTest [ OK ] Conv2d.shape_3_2_2_1_int8_QuantizedPerTensorTest (136 ms) [ RUN ] Conv2d.shape_3_2_2_1_int8_QuantizedPerChannelTest [ OK ] Conv2d.shape_3_2_2_1_int8_QuantizedPerChannelTest (488 ms) [ RUN ] Conv2d.shape_w_h_128_1_ksize_1_1_stride_2_int8_QuantizedPerChannelTest [ OK ] 
Conv2d.shape_w_h_128_1_ksize_1_1_stride_2_int8_QuantizedPerChannelTest (8720 ms) [----------] 20 tests from Conv2d (9751 ms total) [----------] 2 tests from Conv3d [ RUN ] Conv3d.shape_1_1_2_3_3_float32_simple_whdcn [ OK ] Conv3d.shape_1_1_2_3_3_float32_simple_whdcn (61 ms) [ RUN ] Conv3d.shape_1_1_2_3_3_float32_simple_cwhdn [ OK ] Conv3d.shape_1_1_2_3_3_float32_simple_cwhdn (29 ms) [----------] 2 tests from Conv3d (90 ms total) [----------] 2 tests from DeConv1d [ RUN ] DeConv1d.no_bias_layout_whcn_depthwise_shape_3_2_1 [ OK ] DeConv1d.no_bias_layout_whcn_depthwise_shape_3_2_1 (87 ms) [ RUN ] DeConv1d.layout_whcn_shape_3_1_1 [ OK ] DeConv1d.layout_whcn_shape_3_1_1 (83 ms) [----------] 2 tests from DeConv1d (170 ms total) [----------] 2 tests from DeConv2d [ RUN ] DeConv2d.shape_3_3_2_1_float_depthwise [ OK ] DeConv2d.shape_3_3_2_1_float_depthwise (17 ms) [ RUN ] DeConv2d.shape_3_3_1_1_float [ OK ] DeConv2d.shape_3_3_1_1_float (14 ms) [----------] 2 tests from DeConv2d (31 ms total) [----------] 16 tests from DepthwiseConv [ RUN ] DepthwiseConv.shape_2_3_2_1_float32_SimpleTest [ OK ] DepthwiseConv.shape_2_3_2_1_float32_SimpleTest (34 ms) [ RUN ] DepthwiseConv.shape_2_3_2_1_float32_StrideValidTest [ OK ] DepthwiseConv.shape_2_3_2_1_float32_StrideValidTest (28 ms) [ RUN ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameTest [ OK ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameTest (39 ms) [ RUN ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameDilationTest [ OK ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameDilationTest (26 ms) [ RUN ] DepthwiseConv.shape_2_3_2_1_float32_PaddingTest [ OK ] DepthwiseConv.shape_2_3_2_1_float32_PaddingTest (19 ms) [ RUN ] DepthwiseConv.shape_9_9_1_1_float32_DilationValidTest [ OK ] DepthwiseConv.shape_9_9_1_1_float32_DilationValidTest (17 ms) [ RUN ] DepthwiseConv.shape_3_3_1_1_float32_DilationSameTest [ OK ] DepthwiseConv.shape_3_3_1_1_float32_DilationSameTest (18 ms) [ RUN ] DepthwiseConv.shape_3_3_4_2_float32_BatchValidTest [ OK ] DepthwiseConv.shape_3_3_4_2_float32_BatchValidTest (42 ms) [ RUN ] DepthwiseConv.shape_2_2_1_4_float32_BatchSameTest [ OK ] DepthwiseConv.shape_2_2_1_4_float32_BatchSameTest (28 ms) [ RUN ] DepthwiseConv.shape_2_3_2_1_uint8_QuantizedTest [ OK ] DepthwiseConv.shape_2_3_2_1_uint8_QuantizedTest (24 ms) [ RUN ] DepthwiseConv.shape_9_9_1_1_uint8_QuantizedDilationdValidTest [ OK ] DepthwiseConv.shape_9_9_1_1_uint8_QuantizedDilationdValidTest (17 ms) [ RUN ] DepthwiseConv.shape_3_3_1_1_uint8_QuantizedDilationdSameTest [ OK ] DepthwiseConv.shape_3_3_1_1_uint8_QuantizedDilationdSameTest (15 ms) [ RUN ] DepthwiseConv.shape_3_2_2_1_int8_PerTensorTest [ OK ] DepthwiseConv.shape_3_2_2_1_int8_PerTensorTest (28 ms) [ RUN ] DepthwiseConv.shape_3_2_2_1_int8_PerAxisTest [ OK ] DepthwiseConv.shape_3_2_2_1_int8_PerAxisTest (141 ms) [ RUN ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelValidTest [ OK ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelValidTest (37 ms) [ RUN ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelSameTest [ OK ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelSameTest (42 ms) [----------] 16 tests from DepthwiseConv (558 ms total) [----------] 3 tests from FloorDiv [ RUN ] FloorDiv.shape_1_fp32 [ OK ] FloorDiv.shape_1_fp32 (245 ms) [ RUN ] FloorDiv.shape_5_1_broadcast_float32 [ OK ] FloorDiv.shape_5_1_broadcast_float32 (78 ms) [ RUN ] FloorDiv.shape_5_1_broadcast_uint8 [ OK ] FloorDiv.shape_5_1_broadcast_uint8 (343 ms) [----------] 3 tests from FloorDiv (666 ms total) [----------] 4 tests from Div [ RUN ] Div.shape_1_fp32 [ OK ] Div.shape_1_fp32 
(26 ms) [ RUN ] Div.shape_5_1_broadcast_uint8 [ OK ] Div.shape_5_1_broadcast_uint8 (206 ms) [ RUN ] Div.shape_5_1_broadcast_scale_uint8 [ OK ] Div.shape_5_1_broadcast_scale_uint8 (43 ms) [ RUN ] Div.Div_uint8 [ OK ] Div.Div_uint8 (62 ms) [----------] 4 tests from Div (338 ms total) [----------] 2 tests from Erf [ RUN ] Erf.shape_3_2_fp32 [ OK ] Erf.shape_3_2_fp32 (79 ms) [ RUN ] Erf.shape_3_2_uint8_Quantized [ OK ] Erf.shape_3_2_uint8_Quantized (15 ms) [----------] 2 tests from Erf (95 ms total) [----------] 2 tests from GroupedConv1d [ RUN ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn [ OK ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn (15 ms) [ RUN ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn_PaddingTest [ OK ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn_PaddingTest (17 ms) [----------] 2 tests from GroupedConv1d (32 ms total) [----------] 3 tests from GroupedConv2d [ RUN ] GroupedConv2d.shape_3_3_6_1_float_group_1_no_bias_whcn [ OK ] GroupedConv2d.shape_3_3_6_1_float_group_1_no_bias_whcn (13 ms) [ RUN ] GroupedConv2d.shape_3_3_6_1_float_group_2_whcn [ OK ] GroupedConv2d.shape_3_3_6_1_float_group_2_whcn (14 ms) [ RUN ] GroupedConv2d.shape_3_3_6_1_uint8_group_6_whcn [ OK ] GroupedConv2d.shape_3_3_6_1_uint8_group_6_whcn (31 ms) [----------] 3 tests from GroupedConv2d (59 ms total) [----------] 2 tests from InstanceNorm [ RUN ] InstanceNorm.shape_3_6_1_float [ OK ] InstanceNorm.shape_3_6_1_float (131 ms) [ RUN ] InstanceNorm.shape_3_3_6_1_float [ OK ] InstanceNorm.shape_3_3_6_1_float (124 ms) [----------] 2 tests from InstanceNorm (256 ms total) [----------] 2 tests from LayerNorm [ RUN ] LayerNorm.axis_0_shape_3_6_1_float [ OK ] LayerNorm.axis_0_shape_3_6_1_float (83 ms) [ RUN ] LayerNorm.axis_0_shape_2_3_6_1_float [ OK ] LayerNorm.axis_0_shape_2_3_6_1_float (86 ms) [----------] 2 tests from LayerNorm (169 ms total) [----------] 3 tests from LogSoftmax [ RUN ] LogSoftmax.shape_6_1_float_axis_0 [ OK ] LogSoftmax.shape_6_1_float_axis_0 (86 ms) [ RUN ] LogSoftmax.shape_3_6_1_float_axis_1 [ OK ] LogSoftmax.shape_3_6_1_float_axis_1 (72 ms) [ RUN ] LogSoftmax.shape_3_6_1_uint8_axis_1 [ OK ] LogSoftmax.shape_3_6_1_uint8_axis_1 (996 ms) [----------] 3 tests from LogSoftmax (1154 ms total) [----------] 4 tests from Matmul [ RUN ] Matmul.shape_2_6_shape_6_2_float [ OK ] Matmul.shape_2_6_shape_6_2_float (50 ms) [ RUN ] Matmul.shape_3_1_shape_1_3_float [ OK ] Matmul.shape_3_1_shape_1_3_float (73 ms) [ RUN ] Matmul.shape_2_3_2_shape_2_3_2_float_transpose_b [ OK ] Matmul.shape_2_3_2_shape_2_3_2_float_transpose_b (69 ms) [ RUN ] Matmul.shape_2_3_2_shape_2_3_2_uint8_transpose_a [ OK ] Matmul.shape_2_3_2_shape_2_3_2_uint8_transpose_a (235 ms) [----------] 4 tests from Matmul (427 ms total) [----------] 2 tests from MaxpoolWithArgmax [ RUN ] MaxpoolWithArgmax.shape_3_3_1_fp32_kernel_2_stride_2 [ OK ] MaxpoolWithArgmax.shape_3_3_1_fp32_kernel_2_stride_2 (93 ms) [ RUN ] MaxpoolWithArgmax.shape_4_4_1_uint8_kernel_2_stride_2 [ OK ] MaxpoolWithArgmax.shape_4_4_1_uint8_kernel_2_stride_2 (198 ms) [----------] 2 tests from MaxpoolWithArgmax (292 ms total) [----------] 2 tests from MaxUnpool2d [ RUN ] MaxUnpool2d.shape_2_2_1_fp32_kernel_2_stride_2 [ OK ] MaxUnpool2d.shape_2_2_1_fp32_kernel_2_stride_2 (77 ms) [ RUN ] MaxUnpool2d.shape_2_2_1_uint8_kernel_2_stride_2 [ OK ] MaxUnpool2d.shape_2_2_1_uint8_kernel_2_stride_2 (223 ms) [----------] 2 tests from MaxUnpool2d (300 ms total) [----------] 2 tests from Moments [ RUN 
] Moments.shape_6_3_1_float_axes_0_1 [ OK ] Moments.shape_6_3_1_float_axes_0_1 (112 ms) [ RUN ] Moments.shape_3_6_1_float_axes_1_keepdims [ OK ] Moments.shape_3_6_1_float_axes_1_keepdims (61 ms) [----------] 2 tests from Moments (173 ms total) [----------] 9 tests from OneHot [ RUN ] OneHot.shape_3_out_flaot_depth_3 [ OK ] OneHot.shape_3_out_flaot_depth_3 (52 ms) [ RUN ] OneHot.shape_3_out_int32_depth_3 [ OK ] OneHot.shape_3_out_int32_depth_3 (66 ms) [ RUN ] OneHot.shape_3_out_int8_depth_3 [ OK ] OneHot.shape_3_out_int8_depth_3 (51 ms) [ RUN ] OneHot.shape_3_out_uint8_depth_3 [ OK ] OneHot.shape_3_out_uint8_depth_3 (58 ms) [ RUN ] OneHot.shape_3_out_int32_depth_1 [ OK ] OneHot.shape_3_out_int32_depth_1 (94 ms) [ RUN ] OneHot.shape_3_out_int32_depth_4 [ OK ] OneHot.shape_3_out_int32_depth_4 (51 ms) [ RUN ] OneHot.shape_3_out_int32_depth_3_on_6_off_N1 [ OK ] OneHot.shape_3_out_int32_depth_3_on_6_off_N1 (54 ms) [ RUN ] OneHot.shape_3_out_int32_depth_3_on_5_off_0_axis_1 [ OK ] OneHot.shape_3_out_int32_depth_3_on_5_off_0_axis_1 (98 ms) [ RUN ] OneHot.shape_2_2_out_int32_depth_3_on_2_off_0 [ OK ] OneHot.shape_2_2_out_int32_depth_3_on_2_off_0 (51 ms) [----------] 9 tests from OneHot (578 ms total) [----------] 1 test from Equal [ RUN ] Equal.shape_1_uint8 [ OK ] Equal.shape_1_uint8 (634 ms) [----------] 1 test from Equal (634 ms total) [----------] 1 test from NotEqual [ RUN ] NotEqual.shape_5_fp32 [ OK ] NotEqual.shape_5_fp32 (80 ms) [----------] 1 test from NotEqual (80 ms total) [----------] 1 test from Less [ RUN ] Less.shape_5_1_fp32 [ OK ] Less.shape_5_1_fp32 (91 ms) [----------] 1 test from Less (91 ms total) [----------] 1 test from GreaterOrEqual [ RUN ] GreaterOrEqual.shape_5_2_1_fp32 [ OK ] GreaterOrEqual.shape_5_2_1_fp32 (134 ms) [----------] 1 test from GreaterOrEqual (134 ms total) [----------] 1 test from Greater [ RUN ] Greater.shape_5_2_1_1_fp32 [ OK ] Greater.shape_5_2_1_1_fp32 (98 ms) [----------] 1 test from Greater (98 ms total) [----------] 1 test from LessOrEqual [ RUN ] LessOrEqual.shape_1_5_2_1_1_fp32 [ OK ] LessOrEqual.shape_1_5_2_1_1_fp32 (107 ms) [----------] 1 test from LessOrEqual (108 ms total) [----------] 2 tests from Reorg [ RUN ] Reorg.shape_4_4_4_1_u8 [ OK ] Reorg.shape_4_4_4_1_u8 (12 ms) [ RUN ] Reorg.shape_4_4_4_1_fp32 [ OK ] Reorg.shape_4_4_4_1_fp32 (34 ms) [----------] 2 tests from Reorg (47 ms total) [----------] 3 tests from Resize1d [ RUN ] Resize1d.shape_4_2_1_float_nearest_whcn [ OK ] Resize1d.shape_4_2_1_float_nearest_whcn (61 ms) [ RUN ] Resize1d.shape_4_2_1_uint8_nearest_whcn [ OK ] Resize1d.shape_4_2_1_uint8_nearest_whcn (155 ms) [ RUN ] Resize1d.shape_5_1_1_float_bilinear_align_corners_whcn [ OK ] Resize1d.shape_5_1_1_float_bilinear_align_corners_whcn (53 ms) [----------] 3 tests from Resize1d (270 ms total) [----------] 2 tests from RNNCell [ RUN ] RNNCell.shape_3_2_4_float [ OK ] RNNCell.shape_3_2_4_float (178 ms) [ RUN ] RNNCell.seperate [ OK ] RNNCell.seperate (79 ms) [----------] 2 tests from RNNCell (257 ms total) [----------] 2 tests from ScatterND [ RUN ] ScatterND.shape_4_4_4 [ OK ] ScatterND.shape_4_4_4 (97 ms) [ RUN ] ScatterND.shape_9 [ OK ] ScatterND.shape_9 (154 ms) [----------] 2 tests from ScatterND (251 ms total) [----------] 5 tests from ShuffleChannel [ RUN ] ShuffleChannel.shape_3_6_groupnum2_dim1_float32 [ OK ] ShuffleChannel.shape_3_6_groupnum2_dim1_float32 (26 ms) [ RUN ] ShuffleChannel.shape_4_2_2_groupnum2_dim0_float32 [ OK ] ShuffleChannel.shape_4_2_2_groupnum2_dim0_float32 (15 ms) [ RUN ] 
ShuffleChannel.shape_1_4_2_2_groupnum2_dim1_float32 [ OK ] ShuffleChannel.shape_1_4_2_2_groupnum2_dim1_float32 (23 ms) [ RUN ] ShuffleChannel.shape_4_1_2_2_groupnum4_dim0_float32 [ OK ] ShuffleChannel.shape_4_1_2_2_groupnum4_dim0_float32 (32 ms) [ RUN ] ShuffleChannel.shape_4_1_2_2_groupnum1_dim3_float32 [ OK ] ShuffleChannel.shape_4_1_2_2_groupnum1_dim3_float32 (26 ms) [----------] 5 tests from ShuffleChannel (123 ms total) [----------] 1 test from SignalFrame [ RUN ] SignalFrame.shape_10_3_float_step_2_windows_4 [ OK ] SignalFrame.shape_10_3_float_step_2_windows_4 (59 ms) [----------] 1 test from SignalFrame (60 ms total) [----------] 1 test from Floor [ RUN ] Floor.shape_5_1_fp32 [ OK ] Floor.shape_5_1_fp32 (12 ms) [----------] 1 test from Floor (12 ms total) [----------] 1 test from Cast [ RUN ] Cast.shape_5_1_fp32_to_int32 [ OK ] Cast.shape_5_1_fp32_to_int32 (58 ms) [----------] 1 test from Cast (58 ms total) [----------] 3 tests from DataConvert [ RUN ] DataConvert.quantize_shape_2_3_fp32_to_asym_u8 [ OK ] DataConvert.quantize_shape_2_3_fp32_to_asym_u8 (28 ms) [ RUN ] DataConvert.dequantize_shape_2_3_asym_u8_to_fp32 [ OK ] DataConvert.dequantize_shape_2_3_asym_u8_to_fp32 (22 ms) [ RUN ] DataConvert.requantize_shape_2_3_asym_u8 [ OK ] DataConvert.requantize_shape_2_3_asym_u8 (13 ms) [----------] 3 tests from DataConvert (63 ms total) [----------] 6 tests from Softmax [ RUN ] Softmax.shape_3_1_float_axis_0 [ OK ] Softmax.shape_3_1_float_axis_0 (30 ms) [ RUN ] Softmax.shape_3_4_float_axis_0 [ OK ] Softmax.shape_3_4_float_axis_0 (22 ms) [ RUN ] Softmax.shape_3_4_float_axis_1 [ OK ] Softmax.shape_3_4_float_axis_1 (23 ms) [ RUN ] Softmax.shape_3_3_2_float_axis_0 [ OK ] Softmax.shape_3_3_2_float_axis_0 (10 ms) [ RUN ] Softmax.shape_3_3_2_float_axis_1 [ OK ] Softmax.shape_3_3_2_float_axis_1 (13 ms) [ RUN ] Softmax.shape_3_3_2_float_axis_2 [ OK ] Softmax.shape_3_3_2_float_axis_2 (10 ms) [----------] 6 tests from Softmax (108 ms total) [----------] 1 test from Space2Batch [ RUN ] Space2Batch.shape_2_2_3_1_fp32_whcn [ OK ] Space2Batch.shape_2_2_3_1_fp32_whcn (9 ms) [----------] 1 test from Space2Batch (9 ms total) [----------] 1 test from SpatialTransformer [ RUN ] SpatialTransformer.shape_1_3_3_1_u8 [ OK ] SpatialTransformer.shape_1_3_3_1_u8 (336 ms) [----------] 1 test from SpatialTransformer (336 ms total) [----------] 6 tests from Stack [ RUN ] Stack.shape_3_4_axis_2 [ OK ] Stack.shape_3_4_axis_2 (50 ms) [ RUN ] Stack.shape_3_4_axis_1 [ OK ] Stack.shape_3_4_axis_1 (19 ms) [ RUN ] Stack.shape_3_4_axis_0 [ OK ] Stack.shape_3_4_axis_0 (18 ms) [ RUN ] Stack.LayoutinferernceTest_1 [ OK ] Stack.LayoutinferernceTest_1 (27 ms) [ RUN ] Stack.LayoutinferernceTest_2 [ OK ] Stack.LayoutinferernceTest_2 (94 ms) [ RUN ] Stack.LayoutinferernceTest_3 [ OK ] Stack.LayoutinferernceTest_3 (138 ms) [----------] 6 tests from Stack (349 ms total) [----------] 1 test from StridedSlice [ RUN ] StridedSlice.shape_ [ OK ] StridedSlice.shape_ (45 ms) [----------] 1 test from StridedSlice (45 ms total) [----------] 3 tests from Svdf [ RUN ] Svdf.shape_3_2_10_1_4_float [ OK ] Svdf.shape_3_2_10_1_4_float (20 ms) [ RUN ] Svdf.shape_3_2_10_2_4_float [ OK ] Svdf.shape_3_2_10_2_4_float (24 ms) [ RUN ] Svdf.shape_3_2_10_3_4_float [ OK ] Svdf.shape_3_2_10_3_4_float (21 ms) [----------] 3 tests from Svdf (65 ms total) [----------] 2 tests from Tile [ RUN ] Tile.shape_3_2_float_multiples_2_1 [ OK ] Tile.shape_3_2_float_multiples_2_1 (70 ms) [ RUN ] Tile.shape_3_2_1_int8_multiples_2_2_1 [ OK ] 
Tile.shape_3_2_1_int8_multiples_2_2_1 (390 ms) [----------] 2 tests from Tile (461 ms total) [----------] 14 tests from TransposeConv2d [ RUN ] TransposeConv2d.shape_4_4_1_1_float32_SimpleTest [ OK ] TransposeConv2d.shape_4_4_1_1_float32_SimpleTest (22 ms) [ RUN ] TransposeConv2d.shape_4_4_2_1_float32_SameTest [ OK ] TransposeConv2d.shape_4_4_2_1_float32_SameTest (17 ms) [ RUN ] TransposeConv2d.shape_4_4_2_1_float32_ValidTest [ OK ] TransposeConv2d.shape_4_4_2_1_float32_ValidTest (20 ms) [ RUN ] TransposeConv2d.shape_2_2_1_1_float32_StrideTest [ OK ] TransposeConv2d.shape_2_2_1_1_float32_StrideTest (28 ms) [ RUN ] TransposeConv2d.shape_2_2_1_1_float32_ChannelTest [ OK ] TransposeConv2d.shape_2_2_1_1_float32_ChannelTest (16 ms) [ RUN ] TransposeConv2d.shape_2_1_1_1_float32_AccuracyTest [ OK ] TransposeConv2d.shape_2_1_1_1_float32_AccuracyTest (26 ms) [ RUN ] TransposeConv2d.shape_2_2_1_1_float32_BiasChannelTest [ OK ] TransposeConv2d.shape_2_2_1_1_float32_BiasChannelTest (36 ms) [ RUN ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedTest [ OK ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedTest (23 ms) [ RUN ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedTwoFiltersTest [ OK ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedTwoFiltersTest (25 ms) [ RUN ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedValidTest [ OK ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedValidTest (26 ms) [ RUN ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedBiasTest [ OK ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedBiasTest (17 ms) [ RUN ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedPerChannelOneTest [ OK ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedPerChannelOneTest (102 ms) [ RUN ] TransposeConv2d.shape_2_2_1_1_int8_QuantizedPerChannelTwoTest [ OK ] TransposeConv2d.shape_2_2_1_1_int8_QuantizedPerChannelTwoTest (120 ms) [ RUN ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedBiasPerChannelTest [ OK ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedBiasPerChannelTest (101 ms) [----------] 14 tests from TransposeConv2d (582 ms total) [----------] 1 test from LSTM_CELL [ RUN ] LSTM_CELL.shape_in_2_cell_4_out_4_float32 W [downcast_act_type:46]Not supported activition type for LSTM = 0 [ OK ] LSTM_CELL.shape_in_2_cell_4_out_4_float32 (152 ms) [----------] 1 test from LSTM_CELL (152 ms total) [----------] 2 tests from Unstack [ RUN ] Unstack.shape_4_3_axis_0 [ OK ] Unstack.shape_4_3_axis_0 (15 ms) [ RUN ] Unstack.shape_4_3_axis_1 [ OK ] Unstack.shape_4_3_axis_1 (10 ms) [----------] 2 tests from Unstack (26 ms total) [----------] 1 test from LayoutInference [ RUN ] LayoutInference.simple_conv2d [ OK ] LayoutInference.simple_conv2d (23 ms) [----------] 1 test from LayoutInference (24 ms total) [----------] Global test environment tear-down [==========] 175 tests from 59 test suites ran. (20865 ms total) [ PASSED ] 175 tests. ```
Here is the output when running test_vsi_pytorch_model_all.py over TVM RPC with the quantized tflite model InceptionNetV1; a rough sketch of the corresponding client-side flow is given after the device-side log below.
x86_64 Host
``` #productname=VSI SIMULATOR, pid=0x88 1. press any key and continue... vsi_npu.py --> qnn.dequantize vsi_npu.py --> nn.softmax vsi_npu.py --> qnn.quantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.avg_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> qnn.requantize vsi_npu.py --> nn.max_pool2d vsi_npu.py --> qnn.concatenate vsi_npu.py --> reshape This is important----> name_node.value() == tvmgen_default_vsi_npu_0 GraphMakerImpl::Create graph gpuCount=1 interConnectRingCount=0 NN ring buffer is disabled TensorMakerImpl::InferCall: vsi_npu.qnn_softmax TensorMakerImpl::InferCall: reshape TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d 
TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: qnn.concatenate TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d TensorMakerImpl::InferCall: nn.max_pool2d GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 
GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 GraphMakerImpl::VisitExpr_(TupleNode): 4 graph gpuCount=1 interConnectRingCount=0 NN ring buffer is disabled W [HandleLayoutInfer:268]Op 162: default layout inference pass. ---------------------------Begin VerifyTiling ------------------------- AXI-SRAM = 1048320 Bytes VIP-SRAM = 522240 Bytes SWTILING_PHASE_FEATURES[0, 0, 0] 0 TP [( 3 224 224 1, 150528, 0x0x327d810(0x0x327d810, 0x(nil)) -> 224 224 3 1, 150528, 0x0x3390940(0x0x3390940, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(1 1, 1 1)] C[ 1] 1 TP [( 224 224 3 1, 150528, 0x0x3390940(0x0x3390940, 0x(nil)) -> 115 115 12 1, 158700, 0x0x722ad10(0x0x722ad10, 0x(nil))) k(0 0 0, 0) pad(2 2) pool(1 1, 1 1)] P[ 0] C[ 2] 2 NN [( 115 115 12 1, 158700, 0x0x722ad10(0x0x722ad10, 0x(nil)) -> 112 112 64 1, 802816, 0x0x33920c0(0x0x33920c0, 0x(nil))) k(4 4 12, 13440) pad(0 0) pool(1 1, 1 1)] P[ 1] C[ 3] 3 TP [( 112 112 64 1, 802816, 0x0x33920c0(0x0x33920c0, 0x(nil)) -> 56 56 64 1, 200704, 0x0x3395390(0x0x3395390, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(3 3, 2 2)] P[ 2] C[ 4] 4 NN [( 56 56 64 1, 200704, 0x0x3395390(0x0x3395390, 0x(nil)) -> 56 56 64 1, 200704, 0x0x3397ec0(0x0x3397ec0, 0x(nil))) k(1 1 64, 4608) pad(0 0) pool(1 1, 1 1)] P[ 3] C[ 5] 5 NN [( 56 56 64 1, 200704, 0x0x3397ec0(0x0x3397ec0, 0x(nil)) -> 56 56 192 1, 602112, 0x0x339b7e0(0x0x339b7e0, 0x(nil))) k(3 3 64, 116992) pad(1 1) pool(1 1, 1 1)] P[ 4] C[ 6] 6 TP [( 56 56 192 1, 602112, 0x0x339b7e0(0x0x339b7e0, 0x(nil)) -> 28 28 192 1, 150528, 0x0x339f100(0x0x339f100, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(3 3, 2 2)] P[ 5] C[ 7, 9, 10, 12] 7 TP [( 28 28 192 1, 150528, 0x0x339f100(0x0x339f100, 0x(nil)) -> 28 28 192 1, 150528, 0x0x33ac730(0x0x33ac730, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 6] C[ 8] 8 NN [( 28 28 192 1, 150528, 0x0x33ac730(0x0x33ac730, 0x(nil)) -> 28 28 32 1, 25088, 0x0x33b6550(0x0x33b6550, 0x(nil))) k(1 1 192, 6656) pad(0 0) pool(1 1, 1 1)] P[ 7] C[ 17] 9 NN [( 28 28 192 1, 150528, 0x0x339f100(0x0x339f100, 0x(nil)) -> 28 28 64 1, 50176, 0x0x33a1b70(0x0x33a1b70, 0x(nil))) k(1 1 192, 13184) pad(0 0) pool(1 1, 1 1)] P[ 6] C[ 14] 10 NN [( 28 28 192 1, 150528, 0x0x339f100(0x0x339f100, 0x(nil)) -> 28 28 96 1, 75264, 0x0x33a54b0(0x0x33a54b0, 0x(nil))) k(1 1 192, 19840) pad(0 0) pool(1 1, 1 1)] P[ 6] C[ 11] 11 NN [( 28 28 96 1, 75264, 0x0x33a54b0(0x0x33a54b0, 0x(nil)) -> 28 28 128 1, 100352, 0x0x33af180(0x0x33af180, 0x(nil))) k(3 3 96, 116736) pad(1 1) pool(1 1, 1 1)] P[ 10] C[ 15] 12 NN [( 28 28 192 1, 150528, 0x0x339f100(0x0x339f100, 0x(nil)) -> 28 28 16 1, 12544, 0x0x33a8df0(0x0x33a8df0, 0x(nil))) k(1 1 192, 3328) pad(0 0) pool(1 1, 1 1)] P[ 6] C[ 13] 13 NN [( 28 28 16 1, 12544, 0x0x33a8df0(0x0x33a8df0, 0x(nil)) -> 28 28 32 1, 25088, 0x0x33b2ae0(0x0x33b2ae0, 0x(nil))) k(3 3 16, 4992) pad(1 1) pool(1 1, 1 1)] P[ 12] C[ 16] 14 TP [( 28 28 64 1, 50176, 0x0x33a1b70(0x0x33a1b70, 0x(nil)) -> 28 28 64 1, 200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 9] C[ 18, 20, 21, 23] 15 TP [( 28 28 128 1, 
100352, 0x0x33af180(0x0x33af180, 0x(nil)) -> 28 28 128 1, 200704, 0x0x8359fb0(0x0x33b9fb0, 0x0xc400)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 11] C[ 18, 20, 21, 23] 16 TP [( 28 28 32 1, 25088, 0x0x33b2ae0(0x0x33b2ae0, 0x(nil)) -> 28 28 32 1, 200704, 0x0x12299fb0(0x0x33b9fb0, 0x0x24c00)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 13] C[ 18, 20, 21, 23] 17 TP [( 28 28 32 1, 25088, 0x0x33b6550(0x0x33b6550, 0x(nil)) -> 28 28 32 1, 200704, 0x0x14a69fb0(0x0x33b9fb0, 0x0x2ae00)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 8] C[ 18, 20, 21, 23] 18 TP [( 28 28 256 1, 200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) -> 28 28 256 1, 200704, 0x0x33d6130(0x0x33d6130, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 14, 15, 16, 17] C[ 19] 19 NN [( 28 28 256 1, 200704, 0x0x33d6130(0x0x33d6130, 0x(nil)) -> 28 28 64 1, 50176, 0x0x33e01a0(0x0x33e01a0, 0x(nil))) k(1 1 256, 17536) pad(0 0) pool(1 1, 1 1)] P[ 18] C[ 28] 20 NN [( 28 28 256 1, 200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) -> 28 28 128 1, 100352, 0x0x33cb160(0x0x33cb160, 0x(nil))) k(1 1 256, 34944) pad(0 0) pool(1 1, 1 1)] P[ 14, 15, 16, 17] C[ 25] 21 NN [( 28 28 256 1, 200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) -> 28 28 128 1, 100352, 0x0x33cebf0(0x0x33cebf0, 0x(nil))) k(1 1 256, 34944) pad(0 0) pool(1 1, 1 1)] P[ 14, 15, 16, 17] C[ 22] 22 NN [( 28 28 128 1, 100352, 0x0x33cebf0(0x0x33cebf0, 0x(nil)) -> 28 28 192 1, 150528, 0x0x33d8c60(0x0x33d8c60, 0x(nil))) k(3 3 128, 233088) pad(1 1) pool(1 1, 1 1)] P[ 21] C[ 26] 23 NN [( 28 28 256 1, 200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) -> 28 28 32 1, 25088, 0x0x33d2680(0x0x33d2680, 0x(nil))) k(1 1 256, 8832) pad(0 0) pool(1 1, 1 1)] P[ 14, 15, 16, 17] C[ 24] 24 NN [( 28 28 32 1, 25088, 0x0x33d2680(0x0x33d2680, 0x(nil)) -> 28 28 96 1, 75264, 0x0x33dc6f0(0x0x33dc6f0, 0x(nil))) k(3 3 32, 29440) pad(1 1) pool(1 1, 1 1)] P[ 23] C[ 27] 25 TP [( 28 28 128 1, 100352, 0x0x33cb160(0x0x33cb160, 0x(nil)) -> 28 28 128 1, 376320, 0x0x33e3c30(0x0x33e3c30, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 20] C[ 29] 26 TP [( 28 28 192 1, 150528, 0x0x33d8c60(0x0x33d8c60, 0x(nil)) -> 28 28 192 1, 376320, 0x0xd323c30(0x0x33e3c30, 0x0x18800)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 22] C[ 29] 27 TP [( 28 28 96 1, 75264, 0x0x33dc6f0(0x0x33dc6f0, 0x(nil)) -> 28 28 96 1, 376320, 0x0x1c203c30(0x0x33e3c30, 0x0x3d400)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 24] C[ 29] 28 TP [( 28 28 64 1, 50176, 0x0x33e01a0(0x0x33e01a0, 0x(nil)) -> 28 28 64 1, 376320, 0x0x23973c30(0x0x33e3c30, 0x0x4fa00)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 19] C[ 29] 29 TP [( 28 28 480 1, 376320, 0x0x33e3c30(0x0x33e3c30, 0x(nil)) -> 14 14 480 1, 94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(3 3, 2 2)] P[ 25, 26, 27, 28] C[ 30, 32, 33, 35] 30 TP [( 14 14 480 1, 94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) -> 14 14 480 1, 94080, 0x0x3402b20(0x0x3402b20, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 29] C[ 31] 31 NN [( 14 14 480 1, 94080, 0x0x3402b20(0x0x3402b20, 0x(nil)) -> 14 14 64 1, 12544, 0x0x340d850(0x0x340d850, 0x(nil))) k(1 1 480, 32640) pad(0 0) pool(1 1, 1 1)] P[ 30] C[ 40] 32 NN [( 14 14 480 1, 94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) -> 14 14 192 1, 37632, 0x0x33f7b10(0x0x33f7b10, 0x(nil))) k(1 1 480, 97664) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 37] 33 NN [( 14 14 480 1, 94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) -> 14 14 96 1, 18816, 0x0x33fb5c0(0x0x33fb5c0, 0x(nil))) k(1 1 480, 48896) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 34] 34 NN [( 14 14 96 1, 18816, 0x0x33fb5c0(0x0x33fb5c0, 0x(nil)) -> 14 14 208 1, 40768, 0x0x3405670(0x0x3405670, 0x(nil))) k(3 3 
96, 189696) pad(1 1) pool(1 1, 1 1)] P[ 33] C[ 38] 35 NN [( 14 14 480 1, 94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) -> 14 14 16 1, 3136, 0x0x33ff070(0x0x33ff070, 0x(nil))) k(1 1 480, 8192) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 36] 36 NN [( 14 14 16 1, 3136, 0x0x33ff070(0x0x33ff070, 0x(nil)) -> 14 14 48 1, 9408, 0x0x34098f0(0x0x34098f0, 0x(nil))) k(3 3 16, 7552) pad(1 1) pool(1 1, 1 1)] P[ 35] C[ 39] 37 TP [( 14 14 192 1, 37632, 0x0x33f7b10(0x0x33f7b10, 0x(nil)) -> 14 14 192 1, 100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 32] C[ 41, 43, 44, 46] 38 TP [( 14 14 208 1, 40768, 0x0x3405670(0x0x3405670, 0x(nil)) -> 14 14 208 1, 100352, 0x0x6fc9ec0(0x0x3411ec0, 0x0x9300)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 34] C[ 41, 43, 44, 46] 39 TP [( 14 14 48 1, 9408, 0x0x34098f0(0x0x34098f0, 0x(nil)) -> 14 14 48 1, 100352, 0x0xb07bec0(0x0x3411ec0, 0x0x13240)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 36] C[ 41, 43, 44, 46] 40 TP [( 14 14 64 1, 12544, 0x0x340d850(0x0x340d850, 0x(nil)) -> 14 14 64 1, 100352, 0x0xbf69ec0(0x0x3411ec0, 0x0x15700)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 31] C[ 41, 43, 44, 46] 41 TP [( 14 14 512 1, 100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) -> 14 14 512 1, 100352, 0x0x342f090(0x0x342f090, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 37, 38, 39, 40] C[ 42] 42 NN [( 14 14 512 1, 100352, 0x0x342f090(0x0x342f090, 0x(nil)) -> 14 14 64 1, 12544, 0x0x343a0e0(0x0x343a0e0, 0x(nil))) k(1 1 512, 34688) pad(0 0) pool(1 1, 1 1)] P[ 41] C[ 51] 43 NN [( 14 14 512 1, 100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) -> 14 14 160 1, 31360, 0x0x3423090(0x0x3423090, 0x(nil))) k(1 1 512, 86784) pad(0 0) pool(1 1, 1 1)] P[ 37, 38, 39, 40] C[ 48] 44 NN [( 14 14 512 1, 100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) -> 14 14 112 1, 21952, 0x0x3427220(0x0x3427220, 0x(nil))) k(1 1 512, 60800) pad(0 0) pool(1 1, 1 1)] P[ 37, 38, 39, 40] C[ 45] 45 NN [( 14 14 112 1, 21952, 0x0x3427220(0x0x3427220, 0x(nil)) -> 14 14 224 1, 43904, 0x0x3431bd0(0x0x3431bd0, 0x(nil))) k(3 3 112, 238080) pad(1 1) pool(1 1, 1 1)] P[ 44] C[ 49] 46 NN [( 14 14 512 1, 100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) -> 14 14 24 1, 4704, 0x0x342b180(0x0x342b180, 0x(nil))) k(1 1 512, 13056) pad(0 0) pool(1 1, 1 1)] P[ 37, 38, 39, 40] C[ 47] 47 NN [( 14 14 24 1, 4704, 0x0x342b180(0x0x342b180, 0x(nil)) -> 14 14 64 1, 12544, 0x0x3436150(0x0x3436150, 0x(nil))) k(3 3 24, 14848) pad(1 1) pool(1 1, 1 1)] P[ 46] C[ 50] 48 TP [( 14 14 160 1, 31360, 0x0x3423090(0x0x3423090, 0x(nil)) -> 14 14 160 1, 100352, 0x0x343e020(0x0x343e020, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 43] C[ 52, 54, 55, 57] 49 TP [( 14 14 224 1, 43904, 0x0x3431bd0(0x0x3431bd0, 0x(nil)) -> 14 14 224 1, 100352, 0x0x6602020(0x0x343e020, 0x0x7a80)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 45] C[ 52, 54, 55, 57] 50 TP [( 14 14 64 1, 12544, 0x0x3436150(0x0x3436150, 0x(nil)) -> 14 14 64 1, 100352, 0x0xabae020(0x0x343e020, 0x0x12600)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 47] C[ 52, 54, 55, 57] 51 TP [( 14 14 64 1, 12544, 0x0x343a0e0(0x0x343a0e0, 0x(nil)) -> 14 14 64 1, 100352, 0x0xbf96020(0x0x343e020, 0x0x15700)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 42] C[ 52, 54, 55, 57] 52 TP [( 14 14 512 1, 100352, 0x0x343e020(0x0x343e020, 0x(nil)) -> 14 14 512 1, 100352, 0x0x353d330(0x0x353d330, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 48, 49, 50, 51] C[ 53] 53 NN [( 14 14 512 1, 100352, 0x0x353d330(0x0x353d330, 0x(nil)) -> 14 14 64 1, 12544, 0x0x353fdc0(0x0x353fdc0, 0x(nil))) k(1 1 512, 34688) pad(0 0) pool(1 1, 1 1)] P[ 52] C[ 62] 54 NN [( 14 14 512 1, 
100352, 0x0x343e020(0x0x343e020, 0x(nil)) -> 14 14 128 1, 25088, 0x0x344f1d0(0x0x344f1d0, 0x(nil))) k(1 1 512, 69376) pad(0 0) pool(1 1, 1 1)] P[ 48, 49, 50, 51] C[ 59] 55 NN [( 14 14 512 1, 100352, 0x0x343e020(0x0x343e020, 0x(nil)) -> 14 14 128 1, 25088, 0x0x3453360(0x0x3453360, 0x(nil))) k(1 1 512, 69376) pad(0 0) pool(1 1, 1 1)] P[ 48, 49, 50, 51] C[ 56] 56 NN [( 14 14 128 1, 25088, 0x0x3453360(0x0x3453360, 0x(nil)) -> 14 14 256 1, 50176, 0x0x3539400(0x0x3539400, 0x(nil))) k(3 3 128, 310784) pad(1 1) pool(1 1, 1 1)] P[ 55] C[ 60] 57 NN [( 14 14 512 1, 100352, 0x0x343e020(0x0x343e020, 0x(nil)) -> 14 14 24 1, 4704, 0x0x35354a0(0x0x35354a0, 0x(nil))) k(1 1 512, 13056) pad(0 0) pool(1 1, 1 1)] P[ 48, 49, 50, 51] C[ 58] 58 NN [( 14 14 24 1, 4704, 0x0x35354a0(0x0x35354a0, 0x(nil)) -> 14 14 64 1, 12544, 0x0x35443d0(0x0x35443d0, 0x(nil))) k(3 3 24, 14848) pad(1 1) pool(1 1, 1 1)] P[ 57] C[ 61] 59 TP [( 14 14 128 1, 25088, 0x0x344f1d0(0x0x344f1d0, 0x(nil)) -> 14 14 128 1, 100352, 0x0x3548340(0x0x3548340, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 54] C[ 63, 65, 66, 68] 60 TP [( 14 14 256 1, 50176, 0x0x3539400(0x0x3539400, 0x(nil)) -> 14 14 256 1, 100352, 0x0x5d18340(0x0x3548340, 0x0x6200)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 56] C[ 63, 65, 66, 68] 61 TP [( 14 14 64 1, 12544, 0x0x35443d0(0x0x35443d0, 0x(nil)) -> 14 14 64 1, 100352, 0x0xacb8340(0x0x3548340, 0x0x12600)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 58] C[ 63, 65, 66, 68] 62 TP [( 14 14 64 1, 12544, 0x0x353fdc0(0x0x353fdc0, 0x(nil)) -> 14 14 64 1, 100352, 0x0xc0a0340(0x0x3548340, 0x0x15700)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 53] C[ 63, 65, 66, 68] 63 TP [( 14 14 512 1, 100352, 0x0x3548340(0x0x3548340, 0x(nil)) -> 14 14 512 1, 100352, 0x0x3559410(0x0x3559410, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 59, 60, 61, 62] C[ 64] 64 NN [( 14 14 512 1, 100352, 0x0x3559410(0x0x3559410, 0x(nil)) -> 14 14 64 1, 12544, 0x0x3568670(0x0x3568670, 0x(nil))) k(1 1 512, 34688) pad(0 0) pool(1 1, 1 1)] P[ 63] C[ 73] 65 NN [( 14 14 512 1, 100352, 0x0x3548340(0x0x3548340, 0x(nil)) -> 14 14 112 1, 21952, 0x0x355c190(0x0x355c190, 0x(nil))) k(1 1 512, 60800) pad(0 0) pool(1 1, 1 1)] P[ 59, 60, 61, 62] C[ 70] 66 NN [( 14 14 512 1, 100352, 0x0x3548340(0x0x3548340, 0x(nil)) -> 14 14 144 1, 28224, 0x0x35607a0(0x0x35607a0, 0x(nil))) k(1 1 512, 78080) pad(0 0) pool(1 1, 1 1)] P[ 59, 60, 61, 62] C[ 67] 67 NN [( 14 14 144 1, 28224, 0x0x35607a0(0x0x35607a0, 0x(nil)) -> 14 14 288 1, 56448, 0x0x356c5d0(0x0x356c5d0, 0x(nil))) k(3 3 144, 393216) pad(1 1) pool(1 1, 1 1)] P[ 66] C[ 71] 68 NN [( 14 14 512 1, 100352, 0x0x3548340(0x0x3548340, 0x(nil)) -> 14 14 32 1, 6272, 0x0x3564710(0x0x3564710, 0x(nil))) k(1 1 512, 17408) pad(0 0) pool(1 1, 1 1)] P[ 59, 60, 61, 62] C[ 69] 69 NN [( 14 14 32 1, 6272, 0x0x3564710(0x0x3564710, 0x(nil)) -> 14 14 64 1, 12544, 0x0x3570500(0x0x3570500, 0x(nil))) k(3 3 32, 19712) pad(1 1) pool(1 1, 1 1)] P[ 68] C[ 72] 70 TP [( 14 14 112 1, 21952, 0x0x355c190(0x0x355c190, 0x(nil)) -> 14 14 112 1, 103488, 0x0x3574460(0x0x3574460, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 65] C[ 74, 76, 77, 79] 71 TP [( 14 14 288 1, 56448, 0x0x356c5d0(0x0x356c5d0, 0x(nil)) -> 14 14 288 1, 103488, 0x0x584a460(0x0x3574460, 0x0x55c0)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 67] C[ 74, 76, 77, 79] 72 TP [( 14 14 64 1, 12544, 0x0x3570500(0x0x3570500, 0x(nil)) -> 14 14 64 1, 103488, 0x0xb1de460(0x0x3574460, 0x0x13240)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 69] C[ 74, 76, 77, 79] 73 TP [( 14 14 64 1, 12544, 0x0x3568670(0x0x3568670, 0x(nil)) -> 14 
14 64 1, 103488, 0x0xc5c6460(0x0x3574460, 0x0x16340)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 64] C[ 74, 76, 77, 79] 74 TP [( 14 14 528 1, 103488, 0x0x3574460(0x0x3574460, 0x(nil)) -> 14 14 528 1, 103488, 0x0x3585530(0x0x3585530, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 70, 71, 72, 73] C[ 75] 75 NN [( 14 14 528 1, 103488, 0x0x3585530(0x0x3585530, 0x(nil)) -> 14 14 128 1, 25088, 0x0x35947b0(0x0x35947b0, 0x(nil))) k(1 1 528, 71552) pad(0 0) pool(1 1, 1 1)] P[ 74] C[ 84] 76 NN [( 14 14 528 1, 103488, 0x0x3574460(0x0x3574460, 0x(nil)) -> 14 14 256 1, 50176, 0x0x35882d0(0x0x35882d0, 0x(nil))) k(1 1 528, 143104) pad(0 0) pool(1 1, 1 1)] P[ 70, 71, 72, 73] C[ 81] 77 NN [( 14 14 528 1, 103488, 0x0x3574460(0x0x3574460, 0x(nil)) -> 14 14 160 1, 31360, 0x0x358c8e0(0x0x358c8e0, 0x(nil))) k(1 1 528, 89472) pad(0 0) pool(1 1, 1 1)] P[ 70, 71, 72, 73] C[ 78] 78 NN [( 14 14 160 1, 31360, 0x0x358c8e0(0x0x358c8e0, 0x(nil)) -> 14 14 320 1, 62720, 0x0x3598710(0x0x3598710, 0x(nil))) k(3 3 160, 485248) pad(1 1) pool(1 1, 1 1)] P[ 77] C[ 82] 79 NN [( 14 14 528 1, 103488, 0x0x3574460(0x0x3574460, 0x(nil)) -> 14 14 32 1, 6272, 0x0x3590850(0x0x3590850, 0x(nil))) k(1 1 528, 17920) pad(0 0) pool(1 1, 1 1)] P[ 70, 71, 72, 73] C[ 80] 80 NN [( 14 14 32 1, 6272, 0x0x3590850(0x0x3590850, 0x(nil)) -> 14 14 128 1, 25088, 0x0x359c640(0x0x359c640, 0x(nil))) k(3 3 32, 39296) pad(1 1) pool(1 1, 1 1)] P[ 79] C[ 83] 81 TP [( 14 14 256 1, 50176, 0x0x35882d0(0x0x35882d0, 0x(nil)) -> 14 14 256 1, 163072, 0x0x35a05a0(0x0x35a05a0, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 76] C[ 85] 82 TP [( 14 14 320 1, 62720, 0x0x3598710(0x0x3598710, 0x(nil)) -> 14 14 320 1, 163072, 0x0x85405a0(0x0x35a05a0, 0x0xc400)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 78] C[ 85] 83 TP [( 14 14 128 1, 25088, 0x0x359c640(0x0x359c640, 0x(nil)) -> 14 14 128 1, 163072, 0x0xe8c85a0(0x0x35a05a0, 0x0x1b900)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 80] C[ 85] 84 TP [( 14 14 128 1, 25088, 0x0x35947b0(0x0x35947b0, 0x(nil)) -> 14 14 128 1, 163072, 0x0x110985a0(0x0x35a05a0, 0x0x21b00)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 75] C[ 85] 85 TP [( 14 14 832 1, 163072, 0x0x35a05a0(0x0x35a05a0, 0x(nil)) -> 7 7 832 1, 40768, 0x0x35b1670(0x0x35b1670, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(2 2, 2 2)] P[ 81, 82, 83, 84] C[ 86, 88, 89, 91] 86 TP [( 7 7 832 1, 40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) -> 7 7 832 1, 40768, 0x0x35b4410(0x0x35b4410, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 85] C[ 87] 87 NN [( 7 7 832 1, 40768, 0x0x35b4410(0x0x35b4410, 0x(nil)) -> 7 7 128 1, 6272, 0x0x35c3a20(0x0x35c3a20, 0x(nil))) k(1 1 832, 112384) pad(0 0) pool(1 1, 1 1)] P[ 86] C[ 96] 88 NN [( 7 7 832 1, 40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) -> 7 7 256 1, 12544, 0x0x35b7540(0x0x35b7540, 0x(nil))) k(1 1 832, 224768) pad(0 0) pool(1 1, 1 1)] P[ 85] C[ 93] 89 NN [( 7 7 832 1, 40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) -> 7 7 160 1, 7840, 0x0x35bbb50(0x0x35bbb50, 0x(nil))) k(1 1 832, 140544) pad(0 0) pool(1 1, 1 1)] P[ 85] C[ 90] 90 NN [( 7 7 160 1, 7840, 0x0x35bbb50(0x0x35bbb50, 0x(nil)) -> 7 7 320 1, 15680, 0x0x35c7980(0x0x35c7980, 0x(nil))) k(3 3 160, 485248) pad(1 1) pool(1 1, 1 1)] P[ 89] C[ 94] 91 NN [( 7 7 832 1, 40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) -> 7 7 32 1, 1568, 0x0x35bfac0(0x0x35bfac0, 0x(nil))) k(1 1 832, 28160) pad(0 0) pool(1 1, 1 1)] P[ 85] C[ 92] 92 NN [( 7 7 32 1, 1568, 0x0x35bfac0(0x0x35bfac0, 0x(nil)) -> 7 7 128 1, 6272, 0x0x35cb8b0(0x0x35cb8b0, 0x(nil))) k(3 3 32, 39296) pad(1 1) pool(1 1, 1 1)] P[ 91] C[ 95] 93 TP [( 7 7 256 1, 12544, 
0x0x35b7540(0x0x35b7540, 0x(nil)) -> 7 7 256 1, 40768, 0x0x35cf810(0x0x35cf810, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 88] C[ 97, 99,100,102] 94 TP [( 7 7 320 1, 15680, 0x0x35c7980(0x0x35c7980, 0x(nil)) -> 7 7 320 1, 40768, 0x0x49b7810(0x0x35cf810, 0x0x3100)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 90] C[ 97, 99,100,102] 95 TP [( 7 7 128 1, 6272, 0x0x35cb8b0(0x0x35cb8b0, 0x(nil)) -> 7 7 128 1, 40768, 0x0x6299810(0x0x35cf810, 0x0x6e40)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 92] C[ 97, 99,100,102] 96 TP [( 7 7 128 1, 6272, 0x0x35c3a20(0x0x35c3a20, 0x(nil)) -> 7 7 128 1, 40768, 0x0x6c8d810(0x0x35cf810, 0x0x86c0)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 87] C[ 97, 99,100,102] 97 TP [( 7 7 832 1, 40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) -> 7 7 832 1, 40768, 0x0x35e08e0(0x0x35e08e0, 0x(nil))) k(0 0 0, 0) pad(1 1) pool(3 3, 1 1)] P[ 93, 94, 95, 96] C[ 98] 98 NN [( 7 7 832 1, 40768, 0x0x35e08e0(0x0x35e08e0, 0x(nil)) -> 7 7 128 1, 6272, 0x0x35efb60(0x0x35efb60, 0x(nil))) k(1 1 832, 112384) pad(0 0) pool(1 1, 1 1)] P[ 97] C[107] 99 NN [( 7 7 832 1, 40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) -> 7 7 384 1, 18816, 0x0x35e3680(0x0x35e3680, 0x(nil))) k(1 1 832, 337152) pad(0 0) pool(1 1, 1 1)] P[ 93, 94, 95, 96] C[104] 100 NN [( 7 7 832 1, 40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) -> 7 7 192 1, 9408, 0x0x35e7c90(0x0x35e7c90, 0x(nil))) k(1 1 832, 168576) pad(0 0) pool(1 1, 1 1)] P[ 93, 94, 95, 96] C[101] 101 NN [( 7 7 192 1, 9408, 0x0x35e7c90(0x0x35e7c90, 0x(nil)) -> 7 7 384 1, 18816, 0x0x35f3ac0(0x0x35f3ac0, 0x(nil))) k(3 3 192, 698368) pad(1 1) pool(1 1, 1 1)] P[100] C[105] 102 NN [( 7 7 832 1, 40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) -> 7 7 48 1, 2352, 0x0x35ebc00(0x0x35ebc00, 0x(nil))) k(1 1 832, 42240) pad(0 0) pool(1 1, 1 1)] P[ 93, 94, 95, 96] C[103] 103 NN [( 7 7 48 1, 2352, 0x0x35ebc00(0x0x35ebc00, 0x(nil)) -> 7 7 128 1, 6272, 0x0x35f79f0(0x0x35f79f0, 0x(nil))) k(3 3 48, 58624) pad(1 1) pool(1 1, 1 1)] P[102] C[106] 104 TP [( 7 7 384 1, 18816, 0x0x35e3680(0x0x35e3680, 0x(nil)) -> 7 7 384 1, 50176, 0x0x35fb950(0x0x35fb950, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 99] C[108] 105 TP [( 7 7 384 1, 18816, 0x0x35f3ac0(0x0x35f3ac0, 0x(nil)) -> 7 7 384 1, 50176, 0x0x53d7950(0x0x35fb950, 0x0x4980)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[101] C[108] 106 TP [( 7 7 128 1, 6272, 0x0x35f79f0(0x0x35f79f0, 0x(nil)) -> 7 7 128 1, 50176, 0x0x71b3950(0x0x35fb950, 0x0x9300)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[103] C[108] 107 TP [( 7 7 128 1, 6272, 0x0x35efb60(0x0x35efb60, 0x(nil)) -> 7 7 128 1, 50176, 0x0x7ba7950(0x0x35fb950, 0x0xab80)) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[ 98] C[108] 108 SH [( 7 7 1024 1, 50176, 0x0x35fb950(0x0x35fb950, 0x(nil)) -> 1 1 1024 1, 1024, 0x0x360ca20(0x0x360ca20, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[104,105,106,107] C[109] 109 TP [(1024 1 1 1, 1024, 0x0x360ca20(0x0x360ca20, 0x(nil)) -> 1001 1 1 1, 1001, 0x0x338fa70(0x0x338fa70, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(1 1, 1 1)] P[108] C[110] 110 SH [(1001 1 1 1, 1001, 0x0x338fa70(0x0x338fa70, 0x(nil)) -> 1001 1 1 1, 1001, 0x0x338ee30(0x0x338ee30, 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 1 1)] P[109] id IN [ x y w h ] OUT [ x y w h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out) id | opid IN [ x y w h ] OUT [ x y w h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out) 0 | 0 TP DD 0x0 [ 0 0 3 224] -> DD 0x0 [ 0 0 224 224] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 1 | 1 TP DD 0x0 [ 0 0 224 224] -> DD 0x0 [ 0 0 115 115] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 2 | 2 
NN DD 0x0 [ 0 0 115 115] -> DD 0x0 [ 0 0 112 112] ( 56, 8, 8) ( 7936, 11776, 100.00%, 87.62%, DD) ( 0, 0) 3 | 3 TP DD 0x0 [ 0 0 112 112] -> DD 0x0 [ 0 0 56 56] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 4 | 4 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 8) ( 28672, 4608, 100.00%, 100.00%, DD) ( 0, 0) 5 | 5 NN DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 56 56] ( 56, 8, 8) ( 37888, 112640, 100.00%, 96.28%, DD) ( 0, 0) 6 | 6 TP DD 0x0 [ 0 0 56 56] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 7 | 7 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 8 | 8 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 4) ( 76800, 6656, 100.00%, 100.00%, DD) ( 0, 0) 9 | 9 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 76800, 13312, 100.00%, 100.97%, DD) ( 0, 0) 10 | 10 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 10, 12) ( 55296, 19456, 100.00%, 98.06%, DD) ( 0, 0) 11 | 11 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 16, 8) ( 52224, 112128, 100.00%, 96.05%, DD) ( 0, 0) 12 | 12 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 2) ( 76800, 3584, 100.00%, 107.69%, DD) ( 0, 0) 13 | 13 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 4) ( 7680, 5120, 100.00%, 102.56%, DD) ( 0, 0) 14 | 14 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 15 | 15 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 16 | 16 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 17 | 17 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 18 | 18 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 19 | 19 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 102400, 17408, 100.00%, 99.27%, DD) ( 0, 0) 20 | 20 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 102400, 34304, 100.00%, 98.17%, DD) ( 0, 0) 21 | 21 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 8) ( 102400, 34304, 100.00%, 98.17%, DD) ( 0, 0) 22 | 22 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 20, 6) ( 86016, 223232, 100.00%, 95.77%, DD) ( 0, 0) 23 | 23 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 14, 4) ( 102400, 8704, 100.00%, 98.55%, DD) ( 0, 0) 24 | 24 NN DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 28, 10, 12) ( 11776, 28672, 100.00%, 97.39%, DD) ( 0, 0) 25 | 25 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 26 | 26 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 27 | 27 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 28 | 28 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 28 28] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 29 | 29 TP DD 0x0 [ 0 0 28 28] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 30 | 30 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 31 | 31 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 99840, 31744, 100.00%, 97.25%, DD) ( 0, 0) 32 | 32 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 12) ( 99840, 94208, 100.00%, 96.46%, DD) ( 0, 0) 33 | 33 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 12) ( 99840, 47104, 100.00%, 96.34%, DD) ( 0, 0) 34 | 34 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 13) ( 24576, 181760, 100.00%, 95.82%, DD) ( 0, 0) 35 | 35 NN DD 0x0 [ 
0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 2) ( 99840, 8192, 100.00%, 100.00%, DD) ( 0, 0) 36 | 36 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 6) ( 4096, 7680, 100.00%, 101.69%, DD) ( 0, 0) 37 | 37 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 38 | 38 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 39 | 39 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 40 | 40 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 41 | 41 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 42 | 42 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 106496, 33792, 100.00%, 97.42%, DD) ( 0, 0) 43 | 43 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 10) ( 106496, 83456, 100.00%, 96.17%, DD) ( 0, 0) 44 | 44 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 14) ( 106496, 58368, 100.00%, 96.00%, DD) ( 0, 0) 45 | 45 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 14) ( 28672, 227840, 100.00%, 95.70%, DD) ( 0, 0) 46 | 46 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 3) ( 106496, 12800, 100.00%, 98.04%, DD) ( 0, 0) 47 | 47 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 6144, 14848, 100.00%, 100.00%, DD) ( 0, 0) 48 | 48 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 49 | 49 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 50 | 50 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 51 | 51 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 52 | 52 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 53 | 53 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 106496, 33792, 100.00%, 97.42%, DD) ( 0, 0) 54 | 54 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 106496, 67072, 100.00%, 96.68%, DD) ( 0, 0) 55 | 55 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 106496, 67072, 100.00%, 96.68%, DD) ( 0, 0) 56 | 56 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 32768, 297472, 100.00%, 95.72%, DD) ( 0, 0) 57 | 57 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 3) ( 106496, 12800, 100.00%, 98.04%, DD) ( 0, 0) 58 | 58 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 6144, 14848, 100.00%, 100.00%, DD) ( 0, 0) 59 | 59 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 60 | 60 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 61 | 61 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 62 | 62 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 63 | 63 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 64 | 64 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 106496, 33792, 100.00%, 97.42%, DD) ( 0, 0) 65 | 65 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 14) ( 106496, 58368, 100.00%, 96.00%, DD) ( 0, 0) 66 | 66 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 106496, 75264, 100.00%, 96.39%, DD) ( 0, 0) 67 | 67 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 18) ( 36864, 375808, 100.00%, 95.57%, DD) ( 0, 0) 68 | 68 NN DD 0x0 [ 0 
0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 4) ( 106496, 16896, 100.00%, 97.06%, DD) ( 0, 0) 69 | 69 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 8) ( 8192, 19456, 100.00%, 98.70%, DD) ( 0, 0) 70 | 70 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 71 | 71 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 72 | 72 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 73 | 73 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 74 | 74 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 75 | 75 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 109824, 69120, 100.00%, 96.60%, DD) ( 0, 0) 76 | 76 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 109824, 137728, 100.00%, 96.24%, DD) ( 0, 0) 77 | 77 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 10) ( 109824, 86016, 100.00%, 96.14%, DD) ( 0, 0) 78 | 78 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 14) ( 40960, 463872, 100.00%, 95.59%, DD) ( 0, 0) 79 | 79 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 4) ( 109824, 17408, 100.00%, 97.14%, DD) ( 0, 0) 80 | 80 NN DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 14, 14, 16) ( 8192, 38400, 100.00%, 97.72%, DD) ( 0, 0) 81 | 81 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 82 | 82 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 83 | 83 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 84 | 84 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 14 14] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 85 | 85 TP DD 0x0 [ 0 0 14 14] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 86 | 86 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 87 | 87 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 4, 16) ( 26624, 108032, 100.00%, 96.13%, DD) ( 0, 0) 88 | 88 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 32) ( 53248, 215552, 100.00%, 95.90%, DD) ( 0, 0) 89 | 89 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 20) ( 53248, 134656, 100.00%, 95.81%, DD) ( 0, 0) 90 | 90 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 20) ( 15360, 463872, 100.00%, 95.59%, DD) ( 0, 0) 91 | 91 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 4, 4) ( 26624, 27136, 100.00%, 96.36%, DD) ( 0, 0) 92 | 92 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 16) ( 3072, 38400, 100.00%, 97.72%, DD) ( 0, 0) 93 | 93 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 94 | 94 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 95 | 95 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 96 | 96 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 97 | 97 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 98 | 98 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 4, 16) ( 26624, 108032, 100.00%, 96.13%, DD) ( 0, 0) 99 | 99 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 24) ( 53248, 323072, 100.00%, 95.82%, DD) ( 0, 0) 100 | 100 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 24) ( 53248, 161792, 100.00%, 95.98%, DD) ( 0, 0) 101 | 101 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 24) ( 18432, 503808, 75.52%, 95.53%, DD) ( 0, 0) 
102 | 102 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 4, 6) ( 26624, 40448, 100.00%, 95.76%, DD) ( 0, 0) 103 | 103 NN DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 7, 7, 16) ( 4608, 56832, 100.00%, 96.94%, DD) ( 0, 0) 104 | 104 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 105 | 105 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 106 | 106 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 107 | 107 TP DD 0x0 [ 0 0 7 7] -> DD 0x0 [ 0 0 7 7] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 108 | 108 SH DD 0x0 [ 0 0 0 0] -> DD 0x0 [ 0 0 0 0] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 109 | 109 TP DD 0x0 [ 0 0 1024 1] -> DD 0x0 [ 0 0 1001 1] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) 110 | 110 SH DD 0x0 [ 0 0 0 0] -> DD 0x0 [ 0 0 0 0] ( 0, 0, 0) ( 0, 0, 0.00%, 0.00%, NONE) ( 0, 0) PreLoadWeightBiases = 1048320 100.000000% ---------------------------End VerifyTiling ------------------------- KernelStreamSize: 0x66b540, statesSize: 0x4bc0, shShareMemSize: 0x0, shIntrSize: 0x700, shParaSize: 0x440, swParaSize: 0x0, lcdTensorSize: 0x0, shaderStatesSize: 0x9c0, tensorStatic: 0x0 NBG: operationSize: 0x1d50, nnSize: 0x1d00, tpSize: 0x9a80, shSize: 0x10, swSize: 0x0, layerParamSize: 0x0, lcdtSize: 0xa78, patchSize: 0xb478, icdtSize: 0xe8 hwInitOpSize: 0x24, lcdSize 0x670d40 NBG: entranceSize: 0x208, nbIOSize: 0xe8, layeSize: 0x18a4, sectionsSize: 0x194dc, inputoutput size: 0x24fe9, InitCommands size: 0x1104 NBG: lcdSize: 0x670d40, headerSize : 0x1b070 Calculate NBG size : 6868916 bytes generate NBG into memory start. vxoBinaryGraph_SaveBinaryEntrance[20461]: collect input count=0, output count=0 vxoBinaryGraph_SaveBinaryEntrance[20531]: total operation count=111 generate NBG, device count=1, core count per-device: 1, vxoBinaryGraph_RefineInputOutput:11143 input table address: 0x183c500 vxoBinaryGraph_RefineInputOutput:11149 output table address: 0x1ec0100 vxoBinaryGraph_SaveBinaryEntranceExt[19524]: graph->inputCount=1, graph->outputCount=1, refine inputCount=1, outputCount=1 NBG network name field : dummy_network_name vxoBinaryGraph_SaveBinaryEntranceExt[20127]: header input count=1, output count=1 generate NBG, save initialize commands vxoBinaryGraph_ReSaveInputAndPatchTable[17202]: re-save operation count=265 Generate NBG in memory Actual NBG size : 6862080 bytes generate NBG into memory successfully. 
Releasing object array 0x33ba640 Releasing object array 0x33e42c0 Releasing object array 0x3412550 Releasing object array 0x343e6b0 Releasing object array 0x35489d0 Releasing object array 0x3574af0 Releasing object array 0x35a0c30 Releasing object array 0x35cfea0 Releasing object array 0x35fbfe0 VsiNpuModule::GetFunction: get_symbol VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: get_const_vars VsiNpuModule::GetFunction: return early source_filename = "empty_module" target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" target triple = "aarch64-linux-gnu" VsiNpuModule::SaveToBinary SaveToBinary: nbg size = 6862080 SaveToBinary: input size = 1 SaveToBinary: output size = 1 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule : SerializeTensorSpec VsiNpuModule : SerializeTensorSpec2 VsiNpuModule::SaveToBinary2 ['aarch64-linux-gnu-g++', '-shared', '-fPIC', '-o', 'lib.so', '/tmp/tmp0u18x4nw/lib0.o', '/tmp/tmp0u18x4nw/devc.o', '-L/home/niuniu/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu/bin'] ============ [[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]] ```
Khadas VIM3 Pro
``` python3 -m tvm.exec.rpc_server --host 0.0.0.0 --port=9090 INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork``` INFO:RPCServer:bind to 0.0.0.0:9090 INFO:RPCServer:connection from ('192.168.137.177', 41982) VsiNpuModule::LoadFromBinary LoadFromBinary: nbg size = 6862080 LoadFromBinary: input size = 1 LoadFromBinary: output size = 1 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 VsiNpuModule : DeSerializeTensorSpec VsiNpuModule : DeSerializeTensorSpec2 INFO:RPCServer:load_module /tmp/tmp2vjrwe64/lib.so VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: _lookup_linked_param VsiNpuModule::GetFunction: return early VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0 [ 1] PLS isn't existed Process Graph: 61439 ms or 61439858 us VsiNpuModule::GetFunction: size: 2 INFO:RPCServer:Finish serving ('192.168.137.177', 41982) ```
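For reference, the two logs above roughly correspond to the client-side flow sketched below. This is a minimal sketch, not the actual contents of test_vsi_pytorch_model_all.py: the model file name, the board address, and the vsi_npu partitioning helper are placeholders/assumptions, the input is a dummy tensor rather than a real image, and it assumes a TVM version that provides tvm.contrib.graph_executor and the standard RPC API.
```
import numpy as np
import tflite
import tvm
from tvm import relay, rpc
from tvm.contrib import graph_executor

# Load the quantized TFLite model and convert it to Relay. The input name,
# shape and dtype match the Relay main() printed earlier: (1, 224, 224, 3) uint8.
with open("inception_v1_224_quant.tflite", "rb") as f:  # path is illustrative
    tfl_model = tflite.Model.GetRootAsModel(f.read(), 0)
mod, params = relay.frontend.from_tflite(
    tfl_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "uint8"},
)

# The actual script also partitions the graph for the VSI NPU before building,
# e.g. via some helper like the (hypothetical) line below:
# mod = vsi_npu.partition_for_vsi_npu(mod, params)

# Cross-compile for the aarch64 board.
target = "llvm -mtriple=aarch64-linux-gnu"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# Produces the aarch64-linux-gnu-g++ link command visible at the end of the
# x86_64 host log above.
lib.export_library("lib.so", cc="aarch64-linux-gnu-g++")

# Upload to the RPC server started on the Khadas with
#   python3 -m tvm.exec.rpc_server --host 0.0.0.0 --port=9090
# and run the model remotely.
remote = rpc.connect("<khadas-ip>", 9090)  # board address is a placeholder
remote.upload("lib.so")
rlib = remote.load_module("lib.so")

dev = remote.cpu(0)
m = graph_executor.GraphModule(rlib["default"](dev))
m.set_input("input", np.zeros((1, 224, 224, 3), dtype="uint8"))  # dummy image
m.run()
print(m.get_output(0).asnumpy())  # this is where the all-zero output shows up
```
Run end to end, this reproduces the behaviour shown in the logs: the device-side server reports the graph executing ("Process Graph: 61439 ms"), yet the host receives an all-zero output tensor.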