grimoire / mmdetection-to-tensorrt

convert mmdetection model to tensorrt, support fp16, int8, batch input, dynamic shape etc.
Apache License 2.0

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV) #3

Closed cefengxu closed 4 years ago

cefengxu commented 4 years ago

When running the following code in PyCharm:

import torch
from mmdet2trt import mmdet2trt

cfg_path = 'faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
weight_path = 'faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
save_path = 'helloworld.trt'
opt_shape_param = [
    [
        [1, 3, 320, 320],      # min shape
        [1, 3, 800, 1344],     # optimize shape
        [1, 3, 1344, 1344],    # max shape
    ]
]
max_workspace_size = 1 << 30    # some modules need a large workspace; increase this on OOM
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)

However, Python is forced to quit and the only output is: Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

Debugging from 'import mmdet2trt' in the Python Console, I found that the crash happens at import matplotlib.pyplot as plt in ./mmdetection/mmdet/apis/inference.py

Curiously, when I run the demo from mmdetection (mmdetection/demo/image_demo.py), it runs normally. So at least mmdetection itself seems to work.

So, what is the problem in this case?
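Exit code 139 is the shell convention 128 + 11, where 11 is the signal number of SIGSEGV. One way to confirm which import actually crashes, without taking down the debugging session, is to run each import in a child interpreter and inspect its exit status. The following is a minimal sketch; the helper names `try_import` and `describe` are made up for illustration, and the module list is only an assumption based on the imports mentioned above.

```python
import signal
import subprocess
import sys


def try_import(module_name: str) -> int:
    """Import a module in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
    )
    return result.returncode


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal as the negated signal number
        return f"killed by {signal.Signals(-returncode).name}"
    return f"failed with exit code {returncode}"


# Hypothetical candidates: the imports suspected in this thread.
for name in ["mmdet", "matplotlib.pyplot"]:
    print(name, "->", describe(try_import(name)))
```

A module that is simply not installed shows up as a normal non-zero exit code, while a segfault shows up as a signal, so the two cases are easy to tell apart.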

grimoire commented 4 years ago

Hi, I googled this error: https://github.com/matplotlib/matplotlib/issues/9294 https://www.xspdf.com/help/51312698.html

Most of these reports are caused by a Qt/matplotlib conflict.

Can you run the code in a console (without PyCharm or any IDE) and see if it works?

If the error still exists, please provide a detailed error log and your environment (OS, PyTorch version, CUDA version, etc.). I will try to look into it.
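A segfault normally leaves no Python traceback, which makes a detailed log hard to provide. The standard-library `faulthandler` module can help: once enabled, it dumps the Python stack of every thread when the process receives a fatal signal such as SIGSEGV. A minimal sketch, assuming the dump should go to a file (the file name `crash_traceback.log` is arbitrary):

```python
import faulthandler

# Keep this file object open for the life of the process: faulthandler
# writes to its file descriptor directly when a fatal signal arrives.
crash_log = open("crash_traceback.log", "w")
faulthandler.enable(file=crash_log, all_threads=True)

# The code that crashes goes after this point, for example:
# from mmdet2trt import mmdet2trt
```

The same handler can be switched on without editing the script by running it as `python -X faulthandler script.py`, in which case the traceback goes to stderr.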

cefengxu commented 4 years ago

Yes, I ran the code in the Console and in a Terminal, but I still get the same error. There is no detailed error, just 'Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)'.

Development environment

On the other hand, I will try some different approaches:

  1. Commenting out the matplotlib-related code in mmdetection, although that seems complicated.
  2. Rebuilding your project in a clean Docker environment.

grimoire commented 4 years ago

OK, I will debug in a Docker container with the environment you provided. Please give me some time.

cefengxu commented 4 years ago

Sorry, I think the error may be caused by the matplotlib backend. When the backend is changed to 'Agg', the code runs normally.
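For anyone hitting the same crash, here is a minimal sketch of selecting the non-interactive Agg backend. The thread does not say how the backend was changed, so this is only one plausible way. `MPLBACKEND` and `matplotlib.use` are matplotlib's own mechanisms; the try/except is there only so the snippet also runs where matplotlib is not installed. The backend has to be chosen before anything imports `matplotlib.pyplot`, so these lines belong at the very top of the script.

```python
import os

# Choose the backend through the environment first, so that any later
# import of matplotlib (including the one inside mmdet) picks it up.
os.environ["MPLBACKEND"] = "Agg"

try:
    import matplotlib

    matplotlib.use("Agg")            # explicit equivalent of MPLBACKEND
    import matplotlib.pyplot as plt  # safe now: no Qt/GUI toolkit is loaded

    backend = matplotlib.get_backend()
except ImportError:
    backend = None  # matplotlib is not installed in this environment

print("matplotlib backend:", backend)

# Only import mmdet / mmdet2trt after this point.
```

Agg renders to memory or to files only, so interactive windows such as `plt.show()` will not work with it. That is usually acceptable for a model conversion script.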

However, another error is now reported. I will open a new issue to describe it.

Thanks man~