Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

ncnnoptimize Segmentation fault #2673

Open chingi071 opened 3 years ago

chingi071 commented 3 years ago

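Hello, I followed your steps to convert the yolact model: https://zhuanlan.zhihu.com/p/128974102 At step 0x3 (removing the post-processing and exporting to ONNX), the following error appeared. How can I resolve it? Thank you.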

Multiple GPUs detected! Turning off JIT.
Config not specified. Parsed yolact_resnet50_config from the file name.

Loading model... Done.
/home/joy/yolact/yolact.py:222: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if self.last_img_size != (cfg._tmp_img_w, cfg._tmp_img_h):
Traceback (most recent call last):
  File "eval.py", line 1107, in <module>
    evaluate(net, dataset)
  File "eval.py", line 883, in evaluate
    evalimage(net, args.image)
  File "eval.py", line 601, in evalimage
    torch.onnx._export(net, batch, "yolact.onnx", export_params=True, keep_initializers_as_inputs=True, opset_version=11)
  File "/home/aderd/anaconda3/envs/pytorch_v37/lib/python3.7/site-packages/torch/onnx/__init__.py", line 28, in _export
    result = utils._export(*args, **kwargs)
  File "/home/aderd/anaconda3/envs/pytorch_v37/lib/python3.7/site-packages/torch/onnx/utils.py", line 530, in _export
    fixed_batch_size=fixed_batch_size)
  File "/home/aderd/anaconda3/envs/pytorch_v37/lib/python3.7/site-packages/torch/onnx/utils.py", line 409, in _model_to_graph
    _export_onnx_opset_version)
RuntimeError: All input tensors must be on the same device. Received cuda:0 and cpu
chingi071 commented 3 years ago

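I changed torch to 1.5.0 and torchvision to 0.6.0, and the ONNX conversion now succeeds (the original versions were torch 1.6.0 and torchvision 0.7.0). In section 0x5 (ncnn model conversion and optimization), running onnx2ncnn successfully produces yolact.param and yolact.bin, but running ./ncnnoptimize yolact.param yolact.bin yolact-opt.param yolact-opt.bin 0 fails with "Segmentation fault (core dumped)". How can I resolve this? Thank you.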
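
To confirm from a script that ncnnoptimize was killed by a signal rather than exiting with an ordinary error code, the return code can be inspected. The following is a minimal sketch, not part of ncnn: the helper names describe_exit and run_tool are hypothetical, and it relies on the POSIX convention that Python's subprocess module reports a signal-killed child as a negative return code.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with code %d" % returncode


def run_tool(argv):
    """Run a command line (list of strings) and return (status, stderr text)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return describe_exit(proc.returncode), proc.stderr
```

For example, run_tool(["./ncnnoptimize", "yolact.param", "yolact.bin", "yolact-opt.param", "yolact-opt.bin", "0"]) would be expected to report "killed by SIGSEGV" for the crash described above, which makes it easier to compare the behavior of different ncnn builds automatically.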

chingi071 commented 3 years ago

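@nihui Hello, I followed your steps to convert the yolact model but ran into the problem below. The output below is what is printed during the ONNX export; could it be the reason ncnnoptimize fails? The command below exports the ONNX file successfully, but running ncnnoptimize afterwards fails with a segmentation fault. The environments I have tried:
- torch 1.5.0, torchvision 0.7.0, (python 3.7, python 3.6), onnx 1.8.1
- torch 1.6.0, torchvision 0.7.0 (error: RuntimeError: All input tensors must be on the same device. Received cuda:0 and cpu)
- torch 1.7.0, torchvision 0.8.2, python 3.7, onnx 1.8.1
I also downgraded pillow to below 7.0 (6.2.2), but the problem is the same every time. Which versions did you use originally? Many thanks to the ncnn team for all their work.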

$ python eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.15 --top_k=15 --image=test.jpg
Multiple GPUs detected! Turning off JIT.
Config not specified. Parsed yolact_resnet50_config from the file name.

Loading model... Done.
/home/joy/yolact_v2/yolact.py:221: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if self.last_img_size != (cfg._tmp_img_w, cfg._tmp_img_h):
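
When comparing environments like the ones listed above, it can help to print the installed versions from the interpreter that actually runs the export. This is a minimal sketch under some assumptions: it needs Python 3.8 or newer for importlib.metadata, the helper names package_versions and format_report are hypothetical, and the package list is only an example.

```python
import platform
from importlib import metadata  # standard library from Python 3.8 onwards


def package_versions(names):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def format_report(names):
    """Build a short text report: the Python version, then one line per package."""
    lines = ["python " + platform.python_version()]
    for name, version in package_versions(names).items():
        lines.append("%s %s" % (name, version))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(["torch", "torchvision", "onnx", "Pillow"]))
```

The onnx2ncnn and ncnnoptimize tools are not Python packages, so the ncnn build they come from is not captured here and has to be noted separately.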
john-lihn commented 3 years ago

Would you like to try the official PyTorch version 1.0.1?

nihui commented 3 years ago

Please try updating the onnx2ncnn and ncnnoptimize code; several bugs were fixed recently.

john-lihn commented 3 years ago

@nihui Thank you, I will try again. Thanks for your hard work.

chingi071 commented 3 years ago

@nihui Thank you very much! I will try again.

chingi071 commented 3 years ago

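@nihui Hello, thank you very much for your help. However, ncnnoptimize still fails with a segmentation fault, and my steps are exactly the same as before. My environment is torch 1.5.0, torchvision 0.7.0, python 3.7, onnx 1.8.1. Could the environment be the cause? Sorry to trouble you again, and thank you very much.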

FeiMiBa commented 3 years ago

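@nihui I have a similar problem:
pytorch -> onnx -> latest onnx sim -> latest onnx2ncnn succeeds -> ncnnoptimize from the same version fails: segmentation fault (core dumped)
pytorch -> onnx -> latest onnx sim -> onnx2ncnn from a 2020 version succeeds -> ncnnoptimize from the same version succeeds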

The model produced after ncnnoptimize finally succeeded appears to work correctly so far.
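
The two pipelines above differ only in which ncnn build supplies onnx2ncnn and ncnnoptimize, so the comparison can be scripted by generating the same command lines for two tool directories. A minimal sketch, with several assumptions: conversion_commands is a hypothetical helper, the "-sim.onnx" file name is made up for illustration, the simplifier is assumed to be invoked as python3 -m onnxsim, and the trailing "0" simply mirrors the ncnnoptimize command quoted earlier in this thread.

```python
def conversion_commands(stem, tools_dir="."):
    """Return the command lines for onnxsim -> onnx2ncnn -> ncnnoptimize.

    stem      -- model base name, e.g. "yolact" for yolact.onnx
    tools_dir -- directory holding the onnx2ncnn and ncnnoptimize binaries
                 of the ncnn build under test
    """
    onnx_in = stem + ".onnx"
    onnx_sim = stem + "-sim.onnx"  # hypothetical name for the simplified model
    param, weights = stem + ".param", stem + ".bin"
    return [
        ["python3", "-m", "onnxsim", onnx_in, onnx_sim],
        [tools_dir + "/onnx2ncnn", onnx_sim, param, weights],
        [tools_dir + "/ncnnoptimize", param, weights,
         stem + "-opt.param", stem + "-opt.bin", "0"],
    ]
```

Calling conversion_commands("yolact", "/path/to/old-ncnn/tools") and conversion_commands("yolact", "/path/to/new-ncnn/tools") (both paths are placeholders) gives two command sequences that can be run one after the other, for instance with subprocess.run, to see which build crashes on the same model.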

huyunlei commented 3 years ago

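@FeiMiBa You are a really kind person. The ncnn version does indeed make a difference, thank you so much. Seeing that fault was really discouraging, and I would not know how to fix it myself, so I feel very lucky to have found your answer today. Thank you very much.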