dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

How do I run forward computing with C++? #59

Open FightStone opened 5 years ago

FightStone commented 5 years ago

Thanks for the open-source code and outstanding contributions. I would like to ask how to deploy YOLACT in actual production with C++. I want to convert a .pth model file into a .pt model file. Could you give me some reasonable suggestions? And what is the role of 'use_jit' in the configuration file? Thank you so much!

dbolya commented 5 years ago

Do you mean through PyTorch 1.0's TorchScript? Doing so would essentially require a total rewrite, as the Python I use is too complex to easily swap out for TorchScript. You could probably interface with Python using the PyTorch C++ bindings, but the model would still run in Python (i.e., slower than in native C++).

And what do you mean by converting a .pth file into a .pt file? If I understand this correctly, .pt is just another name for .pth.

I believe I removed use_jit, but basically I had already wrapped the simple parts of the model (the backbone and FPN) in TorchScript. use_jit just specified whether or not to use those TorchScript versions. I added it because TorchScript does not play nicely with multiple GPUs, but now I've made it automatically disable JIT when it detects multiple GPUs.
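For reference, a sketch of what that auto-disable looks like (illustrative; check yolact.py for the actual logic):

    import torch

    # TorchScript does not play nicely with multiple GPUs, so fall back to a
    # plain nn.Module wrapper when more than one device is visible.
    use_jit = torch.cuda.device_count() <= 1
    ScriptModuleWrapper = torch.jit.ScriptModule if use_jit else torch.nn.Module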

FightStone commented 5 years ago

I want to convert a Python-generated model file (.pth) into a model file (.pt) that C++ can call (https://pytorch.org/tutorials/advanced/cpp_export.html#). I don't know which method to use. Could you give me some reasonable suggestions? Thanks for your prompt reply!

dbolya commented 5 years ago

Ah, I see. Like I said, porting all of YOLACT to TorchScript would require rewriting almost everything. You might have more luck with tracing, but I think it requires functions to take and return only tensors, and I definitely don't do that (so it would also require some rewriting, but much less than the full TorchScript route). You can try that section of the tutorial, but there will be errors you'll need to fix (at least I think so).
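For illustration, a minimal sketch of the kind of tensor-only wrapper tracing wants; the wrapper class and the output names here are assumptions, not code from this repo:

    import torch

    class TraceableYolact(torch.nn.Module):
        """Wrap the net so forward() takes and returns only tensors."""
        def __init__(self, net):
            super().__init__()
            self.net = net

        def forward(self, x):
            outs = self.net(x)  # assumed to be a dict of raw output tensors
            return outs['loc'], outs['conf'], outs['mask'], outs['proto']

    # traced = torch.jit.trace(TraceableYolact(net).eval(), torch.randn(1, 3, 550, 550))
    # traced.save('yolact_trace.pt')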

Wilbur529 commented 5 years ago

@FightStone Maybe you can try this workflow: Pytorch->ONNX->NCNN. I have successfully done it, and tested the C++ inference code on my ARM device :)

ausk commented 5 years ago

@Wilbur529
When I try to convert yolact from pytorch to onnx, it complains:

Traceback (most recent call last):
  File "D:\Projects\Python\yolcat_seg\test.py", line 238, in <module>
    torch.onnx.export(model, batch, "yolact.onnx", verbose=True)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\__init__.py", line 27, in export
    return utils.export(*args, **kwargs)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 104, in export
    operator_export_type=operator_export_type)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 281, in _export
    example_outputs, propagate)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 224, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 192, in _trace_and_get_graph_from_model
    trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 197, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace)(*args, **kwargs)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 253, in forward
    out_vars, _ = _flatten(out)
RuntimeError: Only tuples, lists and Variables supported as JIT inputs, but got dict

Can you please show your detailed steps to make it work?

Wilbur529 commented 5 years ago

@ausk Hi, there are some key points to note:

  1. Turn off JIT;
  2. Return only the raw output tensors from the network (instead of the post-processed results of the prediction head); a minimal export sketch follows this list;
  3. Rewrite some code to fix the parameters of some operators, e.g. in the protonet and FPN;
  4. Decode the output yourself (post-processing, NMS, and so on).
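For illustration, a sketch of points 1-2, assuming forward() has already been modified to return a tuple of raw tensors (the weights path is illustrative):

    import torch
    from yolact import Yolact  # assumes the JIT wrappers have been removed

    net = Yolact()
    net.load_weights('weights/yolact_base_54_800000.pth')  # illustrative path
    net.eval()

    # Export only works once forward() returns plain tensors, not a dict
    dummy = torch.randn(1, 3, 550, 550)
    torch.onnx.export(net, dummy, 'yolact.onnx', verbose=True)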
ausk commented 5 years ago

@Wilbur529 Hi, thank you for your advice and patience. I followed your advice to export it into ONNX format.

  1. Turn off JIT: set the PYTORCH_JIT environment variable (see the sketch after this list), remove the @torch.jit.script decorators, and delete JITModule and jit_backbone in yolact.py.
  2. Return a list of tensors from PredictionModule::forward and Yolact::forward.
  3. Delete the decoding operations in Yolact::forward.
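For reference, a minimal sketch of step 1; PYTORCH_JIT has to be set before torch is imported:

    import os
    os.environ['PYTORCH_JIT'] = '0'  # must happen before `import torch`
    import torch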

The input shape is:

torch.Size([1, 3, 550, 550])

The output shapes look OK:

torch.Size([1, 19248, 4])
torch.Size([1, 19248, 81])
torch.Size([1, 19248, 32])
torch.Size([1, 138, 138, 32])

Then when calling torch.onnx.export(model, batch, "yolact.onnx", verbose=True), it complains:

  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 211, in _model_to_graph
    assert example_outputs is not None, "example_outputs must be provided when exporting a ScriptModule"
AssertionError: example_outputs must be provided when exporting a ScriptModule

It seems like the module is running in JIT or script mode, but I'm sure the environment has PYTORCH_JIT=0 set. I just have no idea how to fix it. Have you encountered such an issue?

Wilbur529 commented 5 years ago

@ausk I turned off JIT by modifying the config file: https://github.com/dbolya/yolact/blob/13b49d749b734b098a292c8c5226017b344ccc67/data/config.py#L566 Just add that line to your config dictionary and change it to True.

ausk commented 5 years ago

@Wilbur529 I forgot to say that I had already modified that line. I'm still trying to debug it. Thank you for your time. :)

sdimantsd commented 5 years ago

Hi @ausk, did you succeed in converting the weights to ONNX?

abhigoku10 commented 5 years ago

@Wilbur529 @ausk Hello, were you able to successfully convert the YOLACT model to ONNX? I am facing the same issues during conversion. Would you be able to share the converted model and elaborate on the conversion steps? That would be very helpful.

Wilbur529 commented 5 years ago

@abhigoku10 Since the converted model may not help you, maybe you could share a description of the problem you ran into.

FightStone commented 5 years ago

@Wilbur529 Hi, first of all, your success has given me the confidence to explore. :) :)

Secondly, I modified the code in yolact.py to return the raw tensors:

    return (pred_outs['loc'], pred_outs['conf'], pred_outs['mask'], pred_outs['proto'])

instead of:

    return self.detect(pred_outs)

Third, I generated a forward model via tracing:

    traced_script_module = torch.jit.trace(net, batch)
    traced_script_module.save("xxx/model.pt")

Fourth, I successfully generated a forward model named model.pt.

But when I run inference like this:

    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 550, 550}).to(at::kCUDA));
    at::Tensor output = module->forward(inputs).toTensor();

There was an error:

    terminate called after throwing an instance of 'c10::Error'
    what(): isTensor() INTERNAL ASSERT FAILED at /home/sn19038157/libtorch/include/ATen/core/ivalue_inl.h:119, please report a bug to PyTorch. (toTensor at /home/sn19038157/libtorch/include/ATen/core/ivalue_inl.h:119)

What should I do? I would be grateful for any help. :( :(

Wilbur529 commented 5 years ago

@FightStone I think you are trying the Pytorch->Caffe2->C++ workflow, which I haven't tried. Maybe you can try updating PyTorch to the latest version. But I still suggest trying Pytorch->ONNX->NCNN, because NCNN is a high-performance NN inference framework.

FightStone commented 5 years ago

@Wilbur529 Can you share your forward-pass code? Or point to some classic C++ forward-pass examples? There is not a lot of relevant information out there.

abhigoku10 commented 5 years ago

> @ausk Hi, there are some key points to note: 1. Turn off JIT; 2. Return only the raw output tensors from the network; 3. Fix the parameters of some operators; 4. Decode the output yourself.

@Wilbur529 Can you share the modified parts of the code? Step 1 (turning off JIT) is done, but the other steps are a bit confusing. Could you elaborate on the changes to be made?

Wilbur529 commented 5 years ago

@FightStone For pre-processing, it's just a simple normalization. And for post-processing, you could refer to the implementation in NCNN (https://github.com/Tencent/ncnn/blob/master/src/layer/detectionoutput.cpp) :)
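For illustration, a pre-processing sketch; the MEANS/STD values mirror the defaults in yolact's data/config.py, but double-check the channel order your backbone expects:

    import cv2  # assumption: OpenCV is available
    import numpy as np

    MEANS = np.array([103.94, 116.78, 123.68], dtype=np.float32)
    STD = np.array([57.38, 57.12, 58.40], dtype=np.float32)

    def preprocess(img_bgr):
        """Resize to 550x550, normalize, and return a (1, 3, 550, 550) blob."""
        img = cv2.resize(img_bgr, (550, 550)).astype(np.float32)
        img = (img - MEANS) / STD
        return img.transpose(2, 0, 1)[None]  # HWC -> NCHW, add batch dim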

Wilbur529 commented 5 years ago

@abhigoku10 Sorry, I can't show you that part of the code because of company rules. About the second point: you can return a list of these outputs. The third point: since the ONNX framework only supports fixed net parameters, you need to change some variables into constants. The last one: refer to YOLACT's post-processing Python code (or another implementation) and translate it to C/C++. May success wait upon your efforts :)
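For reference, the box decode to translate is the standard SSD-style decode (cf. layers/box_utils.py in this repo); the variances (0.1, 0.2) are the usual defaults, so verify them against your config:

    import torch

    def decode(loc, priors, variances=(0.1, 0.2)):
        """loc: (N, 4) regression output; priors: (N, 4) as (cx, cy, w, h)."""
        boxes = torch.cat((
            priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
            priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), dim=1)
        boxes[:, :2] -= boxes[:, 2:] / 2   # (cx, cy) -> (x1, y1)
        boxes[:, 2:] += boxes[:, :2]       # (w, h) -> (x2, y2)
        return boxes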

abhigoku10 commented 5 years ago

@Wilbur529 I shall try to do this. How much time do you think the whole conversion process will take? After converting, was there any change in the FPS?

@FightStone Please share the code you have modified.

Wilbur529 commented 5 years ago

@abhigoku10 The FPS depends on the hardware environment, so only an experiment will tell you the answer :)

abhigoku10 commented 5 years ago

@Wilbur529 Yup, rightly said. So for your hardware environment, what FPS were you achieving?

Wilbur529 commented 5 years ago

@abhigoku10 I run YOLACT (ResNet-101) on my MacBook Pro with a 550x550 input size, costing around 1.5 s per frame.

abhigoku10 commented 5 years ago

@Wilbur529 Thanks for the response. I am interested to know the FPS for the MobileNet architecture.

ashank-art commented 5 years ago

> @Wilbur529 When I try to convert yolact from pytorch to onnx, it complains:
>
>     RuntimeError: Only tuples, lists and Variables supported as JIT inputs, but got dict
>
> Can you please show your detailed steps to make it work?

Did you make the YOLACT code compatible to run on CPU, and then get this error while trying to convert the PyTorch model to ONNX format?

Wilbur529 commented 5 years ago

@ashank-art Please follow my second key point: replace the output with a list of tensors that have no post-processing.

ridasalam commented 5 years ago

Has anyone tried Pytorch to Tensorflow conversion for Yolact?

abhigoku10 commented 5 years ago

@ridasalam Nope, I have not tried TensorFlow; I am currently trying ONNX. Let me know if you have tried TF. #74

sdimantsd commented 4 years ago

Sup? Anything new with the PyTorch-to-TensorFlow conversion for YOLACT?

vatsalkansara2 commented 4 years ago

Hi @FightStone, I am getting the same INTERNAL ASSERT FAILED error.

Any updates?

dzyjjpy commented 4 years ago

@Wilbur529 I changed the model output from dict to list, and now I can get a converted ONNX file, though I am not sure whether the ONNX file is right or not. From the comments you mentioned above, I am confused about how to do the third point, as I am hitting related errors. When I convert ONNX to MNN, it shows:

    type=slice, failed, may be some node is not const

Checking the PyTorch-to-ONNX trace, the slice ops are nodes 873 and 874, which correspond to:

    scores, idx2 = scores.sort(0, descending=True)
    idx2 = idx2[:cfg.max_num_detections]
    scores = scores[:cfg.max_num_detections]

(Before this line, the proposal count k is not a fixed number; the input is a random 550x550x3 variable.) Do you have some advice? Thank you. I don't know how to modify this part.
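A common workaround (an assumption here, not verified against MNN) is to replace the dynamic sort-and-slice with a fixed-size top-k, so the exported slice length becomes a constant:

    import torch

    max_num_detections = 100      # illustrative; use cfg.max_num_detections
    scores = torch.rand(19248)    # illustrative proposal scores

    # torch.topk returns results sorted in descending order by default; this
    # assumes scores has at least max_num_detections entries.
    scores, idx2 = torch.topk(scores, k=max_num_detections, dim=0)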

dzyjjpy commented 4 years ago

> @Wilbur529 Hi, thank you for your advice and patience. I followed your advice to export it into ONNX format. [...] Then when calling torch.onnx.export(model, batch, "yolact.onnx", verbose=True), it complains:
>
>     AssertionError: example_outputs must be provided when exporting a ScriptModule
>
> It seems like the module is running in JIT or script mode, but I'm sure the environment has PYTORCH_JIT=0 set. Have you encountered such an issue?

Hi, have you solved this issue? I met it before; you can try changing the PyTorch version. I was able to convert with PyTorch 1.2 without hitting this issue.

Besides, can you get a converted ONNX model with no problems right now? I can convert the PyTorch model to ONNX, but it fails when executing the onnx check_model step. And the ONNX file cannot be converted to an MNN file because the slice op fails; it says: slice must be a const.
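For reference, the check step in question uses the standard onnx API (the filename is illustrative):

    import onnx

    model = onnx.load('yolact.onnx')
    onnx.checker.check_model(model)  # raises if the graph is malformed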

Looking forward to your reply.

dzyjjpy commented 4 years ago

> @abhigoku10 Sorry, I can't show you that part of the code because of company rules. About the second point: you can return a list of these outputs. The third point: since the ONNX framework only supports fixed net parameters, you need to change some variables into constants. The last one: refer to YOLACT's post-processing Python code (or another implementation) and translate it to C/C++.

@Wilbur529 Have you compared the mAP between ONNX and PyTorch? According to my test results on a local COCO val set, the mask mAP decreases by more than 4.3 points, from almost 30.0 to 25.4.

xinheblue commented 4 years ago

> @ausk Hi, there are some key points to note: 1. Turn off JIT; 2. Return only the raw output tensors from the network; 3. Fix the parameters of some operators; 4. Decode the output yourself.

I converted to ONNX and then converted to NCNN, and it complains:

    Shape not supported yet!
    Gather not supported yet!
      axis=0
    Cast not supported yet!
      to=1
    Cast not supported yet!
      to=1
    Shape not supported yet!
    Gather not supported yet!
      axis=0
    Cast not supported yet!
      to=1
    Cast not supported yet!
      to=1
    Unsupported unsqueeze axes !
    Unsupported unsqueeze axes !
    Cast not supported yet!
      to=1
    Shape not supported yet!
    Unsupported reduction axes !
    Cast not supported yet!
      to=1

How did you solve the problem? Thanks!

nihui commented 4 years ago

I have put a yolact example in the ncnn project: https://github.com/Tencent/ncnn/commit/043a8f1ac1293f78f5f6bafbd8710c0f7cf9dade

Besides, you can learn how to do yolact inference with the ncnn library here (in Chinese): https://zhuanlan.zhihu.com/p/128974102

dzyjjpy commented 4 years ago

> I have put a yolact example in the ncnn project: Tencent/ncnn@043a8f1
>
> Besides, you can learn how to do yolact inference with the ncnn library here (in Chinese): https://zhuanlan.zhihu.com/p/128974102

Hi @nihui, thanks for your reply. I ran a compressed YOLACT model on a cellphone via MNN a month ago; I will compare the time costs. BTW, ncnn is a great piece of work.

abhigoku10 commented 4 years ago

@dzyjjpy @nihui Can you both please share the process and code to run on a mobile device? Is it Android or iOS?

leung-yaya commented 4 years ago

@Wilbur529 Hi, how big is your ONNX model (ResNet-101)?

dinhphuong98 commented 4 years ago

> @FightStone Maybe you can try this workflow: Pytorch->ONNX->NCNN. I have successfully done it, and tested the C++ inference code on my ARM device :)

Can you publish your code in detail and explain how to install it? Thanks. Or a tutorial: how can I install it on my device?

abhigoku10 commented 4 years ago

@FightStone It would be good if you could share the conversion process, thanks. Were you also successful in converting YOLACT++?

piaoliuping123 commented 4 years ago

Thanks, experts. I hope we can discuss and learn together. QQ: 975966355

stereomatchingkiss commented 2 years ago

> I have put a yolact example in the ncnn project: Tencent/ncnn@043a8f1
>
> Besides, you can learn how to do yolact inference with the ncnn library here (in Chinese): https://zhuanlan.zhihu.com/p/128974102

Following these two posts, I successfully converted the model to ONNX: https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_YOLACT.html and https://github.com/Ma-Dan/yolact/tree/onnx

The problem is that after converting to ONNX, the speed drops considerably (30 ms vs 125 ms); the main issue is that ONNX has to move data from the GPU back to the CPU. I tried handling NMS with torchvision's NMS, but it ran even slower (over 1000 ms), and I don't know what went wrong. I saw an article on Zhihu (did you write it, by any chance?) saying that "in ONNX the post-processing gets converted into a big pile of glue ops, which is very fragmented and runs inefficiently in the framework". Is it ONNX, or PyTorch, or both, that handles the NMS part of the conversion poorly?
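For reference, the torchvision NMS call in question (standard torchvision API; the inputs and threshold here are illustrative):

    import torch
    import torchvision

    # Illustrative inputs; in practice these come from the decoded network output.
    boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0]])
    scores = torch.tensor([0.9, 0.8])
    keep = torchvision.ops.nms(boxes, scores, iou_threshold=0.5)  # indices kept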

danielhuang2020 commented 2 years ago

@stereomatchingkiss Could you share your PyTorch version? I tried converting the model to ONNX but failed with the error "Could not convert to integer: 3221225477. Path 'exitCode'." I followed https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_YOLACT.html and https://github.com/Ma-Dan/yolact/tree/onnx

stereomatchingkiss commented 2 years ago

> @stereomatchingkiss Could you share your PyTorch version? I tried converting the model to ONNX but failed with the error "Could not convert to integer: 3221225477. Path 'exitCode'."

I converted it with the LTS version, 1.8.2; the latest version's ONNX export has some bugs.