kendryte / nncase

Open deep learning compiler stack for Kendryte AI accelerators ✨
Apache License 2.0

nncase 2.2: converting the YOLOv5 7.0 s model from ONNX to kmodel fails #1054

Closed wozwdaqian closed 1 year ago

wozwdaqian commented 1 year ago

Describe the bug With nncase 2.2, converting the YOLOv5 7.0 s model (yolov5s) from ONNX to kmodel fails.

To Reproduce Running the code below produces this error:

/root/k230/k230_sdk-main/src/big/nncase/my_example/onnx2kmodel.py:23: DeprecationWarning: `mapping.TENSOR_TYPE_TO_NP_TYPE` is now deprecated and will be removed in a future release.To silence this warning, please use `helper.tensor_dtype_to_np_dtype` instead.
  input_dict['dtype'] = onnx.mapping.TENSOR_TYPE_TO_NP_TYPE[onnx_type.elem_type]
WARNING: The argument `input_shapes` is deprecated. Please use `overwrite_input_shapes` and/or `test_input_shapes` instead. An error will be raised in the future.
Unhandled exception. System.AggregateException: One or more errors occurred. (Number was less than the array's lower bound in the first dimension. (Parameter 'destinationIndex'))
 ---> System.ArgumentOutOfRangeException: Number was less than the array's lower bound in the first dimension. (Parameter 'destinationIndex')
   at System.Array.Copy(Array sourceArray, Int32 sourceIndex, Array destinationArray, Int32 destinationIndex, Int32 length, Boolean reliable)
   at Nncase.Passes.Rules.K230.BinaryToFakeActivation.GetReplace(Binary bn, Call call, Expr lhs, Expr rhs, Tensor`1 inputRangeL, Expr inputMarkerL, Tensor`1 inputRangeR, Expr inputMarkerR, Tensor`1 outputRange, Expr outputMarker)
   at Nncase.Passes.Rules.K230.BinaryToFakeActivation.GetReplace(IMatchResult __result, RunPassContext __context)
   at Nncase.Passes.DataFlowRewriter.DefaultRewriteLeaf(Expr expr)
   at Nncase.IR.ExprRewriter.RewriteLeafMarker(Marker expr)
   at Nncase.IR.ExprRewriter.RewriteLeafMarker(Marker expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitLeafMarker(Marker expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitMarker(Marker expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitCall(Call expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitMarker(Marker expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitCall(Call expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitMarker(Marker expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitTuple(Tuple expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitCall(Call expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitCall(Call expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitTuple(Tuple expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitCall(Call expr, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.VisitOperands(Expr expr, TContext context)
   at Nncase.IR.ExprVisitor`3.VisitFunction(Function expr, TContext context)
   at Nncase.IR.Function.Accept[TExprResult,TTypeResult,TContext](ExprFunctor`3 functor, TContext context)
   at Nncase.IR.ExprVisitor`3.DispatchVisit(Expr expr, TContext context)
   at Nncase.Passes.DataFlowRewriter.DispatchVisit(Expr expr, Unit context)
   at Nncase.IR.ExprRewriter`1.Rewrite(Expr expr, TContext context)
   at Nncase.IR.ExprRewriter.Rewrite(Expr expr)
   at Nncase.Passes.RewriteProvider.Rewrite(Expr expr, IEnumerable`1 rules, RunPassContext context)
   at Nncase.CompilerServicesProvider.Rewrite(Expr expr, IEnumerable`1 rules, RunPassContext options)
   at Nncase.CompilerServices.Rewrite(Expr expr, IEnumerable`1 rules, RunPassContext options)
   at Nncase.Passes.DataflowPass.RunCoreAsync(BaseFunction function, RunPassContext options)
   at Nncase.Passes.Pass`2.RunAsync(TInput input, RunPassContext context)
   at Nncase.Passes.PassManager.FunctionPassGroup.Runner.RunAsync()
   at Nncase.Passes.PassManager.FunctionPassGroup.RunAsync(IRModule module)
   at Nncase.Passes.PassManager.RunAsync(IRModule module)
   at Nncase.Compiler.Compiler.RunPassAsync(Action`1 register, String name)
   at Nncase.Compiler.Compiler.CompileAsync()
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at Nncase.Compiler.Interop.CApi.CompilerCompile(IntPtr compilerHandle)
Aborted

Expected behavior The model converts successfully.

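Origin model and code This is the code I use to convert the model. The yolov5s.onnx provided in k230_sdk converts without problems, but the official 7.0 version does not. I also tested without quantization, and in that case the conversion succeeds.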

import os
import argparse
import numpy as np
from PIL import Image
import onnxsim
import onnx
import nncase

def parse_model_input_output(model_file):
    onnx_model = onnx.load(model_file)
    input_all = [node.name for node in onnx_model.graph.input]
    input_initializer = [node.name for node in onnx_model.graph.initializer]
    input_names = list(set(input_all) - set(input_initializer))
    input_tensors = [
        node for node in onnx_model.graph.input if node.name in input_names]

    # input
    inputs = []
    for _, e in enumerate(input_tensors):
        onnx_type = e.type.tensor_type
        input_dict = {}
        input_dict['name'] = e.name
        input_dict['dtype'] = onnx.mapping.TENSOR_TYPE_TO_NP_TYPE[onnx_type.elem_type]
        input_dict['shape'] = [(i.dim_value if i.dim_value != 0 else d) for i, d in zip(
            onnx_type.shape.dim, [1, 3, 224, 224])]
        inputs.append(input_dict)

    return onnx_model, inputs

def onnx_simplify(model_file, dump_dir):
    onnx_model, inputs = parse_model_input_output(model_file)
    onnx_model = onnx.shape_inference.infer_shapes(onnx_model)
    input_shapes = {}
    for input in inputs:
        input_shapes[input['name']] = input['shape']

    onnx_model, check = onnxsim.simplify(onnx_model, input_shapes=input_shapes)
    assert check, "Simplified ONNX model could not be validated"

    model_file = os.path.join(dump_dir, 'simplified.onnx')
    onnx.save_model(onnx_model, model_file)
    return model_file

def read_model_file(model_file):
    with open(model_file, 'rb') as f:
        model_content = f.read()
    return model_content

def generate_data_random(shape, batch):
    data = []
    for i in range(batch):
        data.append([np.random.randint(0, 256, shape).astype(np.uint8)])
    return data

def generate_data(shape, batch, calib_dir):
    img_paths = [os.path.join(calib_dir, p) for p in os.listdir(calib_dir)]
    data = []
    for i in range(batch):
        assert i < len(img_paths), "calibration images not enough."
        img_data = Image.open(img_paths[i]).convert('RGB')
        img_data = img_data.resize((shape[3], shape[2]), Image.BILINEAR)
        img_data = np.asarray(img_data, dtype=np.uint8)
        img_data = np.transpose(img_data, (2, 0, 1))
        data.append([img_data[np.newaxis, ...]])
    return data

def main():
    parser = argparse.ArgumentParser(prog="nncase")
    parser.add_argument("--target", type=str, default="k230", help='target platform to deploy to (k230)')
    parser.add_argument("--model", type=str, default="yolov5/yolov5s.onnx", help='model file')
    parser.add_argument("--dataset", type=str, default="calibration_dataset", help='calibration_dataset')

    args = parser.parse_args()

    input_shape = [1, 3, 160, 160]

    # intermediate output
    dump_dir = 'tmp/yolov5s_onnx'
    if not os.path.exists(dump_dir):
        os.makedirs(dump_dir)

    # onnx simplify
    model_file = onnx_simplify(args.model, dump_dir)

    # compile_options
    compile_options = nncase.CompileOptions()
    compile_options.target = args.target                                            # compilation target, e.g. 'k210', 'k510', 'k230'
    compile_options.preprocess = True                                               # whether to enable preprocessing; default False
    compile_options.swapRB = False                                                  # whether to swap the red and blue channels of the input (RGB-->BGR or BGR-->RGB); default False
    compile_options.input_shape = input_shape                                       # shape of the input data; its layout must match input_layout. If it differs from the model's input shape, a letterbox operation (resize/pad, etc.) is applied
    compile_options.input_type = 'uint8'                                            # input data type; default 'float32'
    compile_options.input_range = [0, 1]                                            # floating-point range of the input data after dequantization; default [0, 1]
    compile_options.mean = [0, 0, 0]                                                # preprocessing normalization mean
    compile_options.std = [1, 1, 1]                                                 # preprocessing normalization standard deviation
    compile_options.input_layout = 'NCHW'                                           # layout of the input data
    compile_options.output_layout = 'NCHW'                                          # layout of the output data
    compile_options.dump_ir = True                                                  # whether to dump IR; default False
    compile_options.dump_asm = True                                                 # whether to dump asm files; default True
    compile_options.dump_dir = dump_dir                                             # directory for the dumps enabled above; default "tmp"

    # compiler
    compiler = nncase.Compiler(compile_options)

    # import
    model_content = read_model_file(model_file)                                     # read the ONNX model
    import_options = nncase.ImportOptions()
    compiler.import_onnx(model_content, import_options)

    # ptq_options
    ptq_options = nncase.PTQTensorOptions()

    ptq_options.quant_type = "uint8" # datatype : "float32", "int8", "int16"
    ptq_options.w_quant_type = "uint8"  # datatype : "float32", "int8", "int16"
    ptq_options.calibrate_method = "NoClip" # "Kld"
    ptq_options.finetune_weights_method = "NoFineTuneWeights"
    ptq_options.dump_quant_error = False
    ptq_options.dump_quant_error_symmetric_for_signed = False

    # detail in docs/MixQuant.md
    ptq_options.quant_scheme = ""
    ptq_options.export_quant_scheme = False
    ptq_options.export_weight_range_by_channel = False

    ptq_options.samples_count = 6
    ptq_options.set_tensor_data(generate_data(input_shape, ptq_options.samples_count, args.dataset))

    # ptq_options = nncase.PTQTensorOptions()
    # ptq_options.samples_count = 6                                                   # number of samples
    # ptq_options.set_tensor_data(generate_data(input_shape, ptq_options.samples_count, args.dataset)) # set the tensor data
    compiler.use_ptq(ptq_options)

    # compile
    compiler.compile()

    # kmodel
    kmodel = compiler.gencode_tobytes()                                             # generate the kmodel byte stream
    with open('test.kmodel', 'wb') as f:                    # save the kmodel
        f.write(kmodel)

if __name__ == '__main__':
    main()
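
The shape handling in parse_model_input_output above can be looked at in isolation. The following is a minimal sketch of that one expression, assuming (as in ONNX) that a dynamic dimension reads back as dim_value == 0; the helper name resolve_shape and the SimpleNamespace stand-ins are illustrative only and not part of the script or of nncase.

```python
from types import SimpleNamespace


def resolve_shape(dims, default_shape):
    """Replace dynamic dimensions (dim_value == 0) with the matching default."""
    return [d.dim_value if d.dim_value != 0 else fallback
            for d, fallback in zip(dims, default_shape)]


# Stand-ins for ONNX dimension entries; only `dim_value` is read.
dynamic_batch = [SimpleNamespace(dim_value=v) for v in (0, 3, 320, 320)]
print(resolve_shape(dynamic_batch, [1, 3, 224, 224]))  # [1, 3, 320, 320]

fully_dynamic = [SimpleNamespace(dim_value=0) for _ in range(4)]
print(resolve_shape(fully_dynamic, [1, 3, 224, 224]))  # [1, 3, 224, 224]
```

Fixed dimensions in the model are kept, and only dynamic ones fall back to the hard-coded [1, 3, 224, 224]. Note that this fallback is separate from compile_options.input_shape, which the script sets to [1, 3, 160, 160]; whether that difference plays any role in the failure is not established in this thread.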

Environment (please complete the following information):

curioyang commented 1 year ago

@wozwdaqian Please upload the ONNX model.

wozwdaqian commented 1 year ago

@wozwdaqian Please upload the ONNX model.

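@curioyang yolov5s.zip yolov5s_pt2onnx.zip
yolov5s.onnx is the official ONNX model that I downloaded. yolov5s_pt2onnx.onnx is the model I generated from the official yolov5s.pt with python3 export.py --weights yolov5s_pt2onnx.pt --include onnx --img 320. yolov5s_pt2onnx.onnx fails with the error reported above; yolov5s.onnx fails with the following error: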

root@RSH-NUC0021:~/k230/k230_sdk-main/src/big/nncase/my_example# python onnx2kmodel.py --target k230 --model yolov5/yolov5s.onnx --dataset calibration_dataset/
warn: Nncase.Hosting.PluginLoader[0]
      NNCASE_PLUGIN_PATH is not set.
/root/k230/k230_sdk-main/src/big/nncase/my_example/onnx2kmodel.py:23: DeprecationWarning: `mapping.TENSOR_TYPE_TO_NP_TYPE` is now deprecated and will be removed in a future release.To silence this warning, please use `helper.tensor_dtype_to_np_dtype` instead.
  input_dict['dtype'] = onnx.mapping.TENSOR_TYPE_TO_NP_TYPE[onnx_type.elem_type]
WARNING: The argument `input_shapes` is deprecated. Please use `overwrite_input_shapes` and/or `test_input_shapes` instead. An error will be raised in the future.
Process terminated. Assertion Failed
   at Nncase.Passes.RewriteProvider.Rewrite(Expr expr, IEnumerable`1 rules, RunPassContext context)
   at Nncase.CompilerServicesProvider.Rewrite(Expr expr, IEnumerable`1 rules, RunPassContext options)
   at Nncase.CompilerServices.Rewrite(Expr expr, IEnumerable`1 rules, RunPassContext options)
   at Nncase.Passes.DataflowPass.RunCoreAsync(BaseFunction function, RunPassContext options)
   at Nncase.Passes.Pass`2.RunAsync(TInput input, RunPassContext context)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Nncase.Passes.Pass`2.RunAsync(TInput input, RunPassContext context)
   at Nncase.Passes.PassManager.FunctionPassGroup.Runner.RunAsync()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Nncase.Passes.PassManager.FunctionPassGroup.Runner.RunAsync()
   at Nncase.Passes.PassManager.FunctionPassGroup.RunAsync(IRModule module)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Nncase.Passes.PassManager.FunctionPassGroup.RunAsync(IRModule module)
   at Nncase.Passes.PassManager.RunAsync(IRModule module)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Nncase.Passes.PassManager.RunAsync(IRModule module)
   at Nncase.Compiler.Compiler.RunPassAsync(Action`1 register, String name)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Nncase.Compiler.Compiler.RunPassAsync(Action`1 register, String name)
   at Nncase.Compiler.Compiler.CompileAsync()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at Nncase.Compiler.Compiler.CompileAsync()
   at Nncase.Compiler.Interop.CApi.CompilerCompile(IntPtr compilerHandle)

Aborted

curioyang commented 1 year ago

@wozwdaqian yolov5s.onnx contains float16 tensors, which are not supported. The problem with yolov5s_pt2onnx.onnx has been fixed.
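
For readers who want to check a model for float16 before converting it, here is a minimal sketch. It is not part of nncase: the helper name find_float16_tensors is hypothetical, the constant 10 is the numeric value of onnx.TensorProto.FLOAT16, and the function only inspects initializers, graph inputs and outputs, and value_info entries (it does not look at Cast nodes or Constant attributes).

```python
FLOAT16 = 10  # numeric value of onnx.TensorProto.FLOAT16


def find_float16_tensors(model):
    """Return (kind, name) pairs for float16 tensors declared in the graph."""
    graph = model.graph
    # Weights and other constants stored in the model.
    found = [("initializer", t.name) for t in graph.initializer
             if t.data_type == FLOAT16]
    # Typed graph inputs, outputs, and recorded intermediate values.
    for kind, values in (("input", graph.input),
                         ("output", graph.output),
                         ("value_info", graph.value_info)):
        found += [(kind, v.name) for v in values
                  if v.type.tensor_type.elem_type == FLOAT16]
    return found
```

With the onnx package installed, calling find_float16_tensors(onnx.load("yolov5/yolov5s.onnx")) should list the offending tensors; an empty list means none were found in the places this sketch checks.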