clovaai / CRAFT-pytorch

Official implementation of Character Region Awareness for Text Detection (CRAFT)
MIT License
3.07k stars, 869 forks

Export to ONNX #4

Closed hiepph closed 4 years ago

hiepph commented 5 years ago

I'm trying to export from pth to ONNX format:

import torch
from torch.autograd import Variable
import cv2
import imgproc
from craft import CRAFT

# load net
net = CRAFT()     # initialize
net = net.cuda()
net = torch.nn.DataParallel(net)


# load data
image = imgproc.loadImage('./misc/test.jpg')

# resize
img_resized, target_ratio, size_heatmap = imgproc.resize_aspect_ratio(image, 1280, interpolation=cv2.INTER_LINEAR, mag_ratio=1.5)
ratio_h = ratio_w = 1 / target_ratio

# preprocessing
x = imgproc.normalizeMeanVariance(img_resized)
x = torch.from_numpy(x).permute(2, 0, 1)    # [h, w, c] to [c, h, w]
x = Variable(x.unsqueeze(0))                # [c, h, w] to [b, c, h, w]
x = x.cuda()

# trace export

But then encountered this error:

RuntimeError: tuple appears in op that does not forward tuples (VisitNode at /opt/conda/conda-bld/pytorch_1556653114079/work/torch/csrc/jit/passes/lower_tuples.cpp:117)

Following issues pytorch/pytorch#5315 and pytorch/pytorch#13397, it turned out that the nn.DataParallel wrapper doesn't support trace export to ONNX.

Is there a workaround for this?

YoungminBaek commented 5 years ago

I am not familiar with ONNX, and have no exact answer to your issue. However, I updated the code to solve the issue regarding loading the model file. Could you try to export to ONNX file using the newly added copyStateDict function?
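
For context, checkpoints saved from a model wrapped in nn.DataParallel carry a "module." prefix on every parameter name, and a helper like copyStateDict presumably removes that prefix so the weights load into a bare CRAFT() instance. A minimal sketch of the idea with plain dictionaries; the function name strip_module_prefix and the sample keys are illustrative and not taken from the repository:

```python
from collections import OrderedDict

def strip_module_prefix(state_dict):
    """Return a copy of state_dict with a leading 'module.' removed from each key."""
    prefix = "module."
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned

# Illustrative keys only; a real checkpoint maps parameter names to tensors.
wrapped = OrderedDict([("module.conv.weight", 1), ("module.conv.bias", 2)])
bare = strip_module_prefix(wrapped)
print(list(bare.keys()))
```

The cleaned dictionary could then be passed to load_state_dict on an unwrapped model, and that unwrapped model handed to torch.onnx.export.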

hiepph commented 5 years ago

Thanks @YoungminBaek for the quick response. The problem with nn.DataParallel disappeared after your fix, but a new error appears:

RuntimeError: Failed to export an ONNX attribute, since it's not constant, please try to make things (e.g., kernel size) static if possible

It seems to be related to some operations in the CRAFT model (after the VGG16 basenet) that are not compatible with ONNX (?)

ganliqiang commented 5 years ago

Hi @hiepph, have you converted the pth to ONNX successfully? Does the above code work well?

ghost commented 5 years ago

@hiepph did you gain any inference speed with ONNX?

hiepph commented 5 years ago

@ganliqiang Nope, I'm still stuck with the same error ;(

RuntimeError: Failed to export an ONNX attribute, since it's not constant, please try to make things (e.g., kernel size) static if possible

hiepph commented 5 years ago

@deepseek Yep, many serving frameworks support optimized ONNX graphs (e.g. onnxruntime). I ran some experiments (with other models) and in practice they were faster than the raw pth checkpoint.

ajinkya933 commented 5 years ago

@YoungminBaek @ClovaAIAdmin

I followed @hiepph's script above to export the model to ONNX format. However, I got a new error:

Traceback (most recent call last):
  File "", line 34, in <module>
  File "/home/ubuntu/anaconda3/envs/pytorch36/lib/python3.6/site-packages/torch/onnx/", line 132, in export
    strip_doc_string, dynamic_axes)
  File "/home/ubuntu/anaconda3/envs/pytorch36/lib/python3.6/site-packages/torch/onnx/", line 64, in export
    example_outputs=example_outputs, strip_doc_string=strip_doc_string, dynamic_axes=dynamic_axes)
  File "/home/ubuntu/anaconda3/envs/pytorch36/lib/python3.6/site-packages/torch/onnx/", line 311, in _export
    raise ValueError('torch.nn.DataParallel is not supported by ONNX '
ValueError: torch.nn.DataParallel is not supported by ONNX exporter, please use 'attribute' module to unwrap model from torch.nn.DataParallel. Try torch.onnx.export(model.module, ...)

piernikowyludek commented 5 years ago

@hiepph Hi, if you still haven't managed to convert the model to ONNX you may find this thread helpful: I converted the model successfully to .onnx now. It is very much an onnx library problem.

I do get stuck at the next step though - converting the .onnx to .pb file ;D So if anyone crosses that bridge, I will appreciate your help!

ajinkya933 commented 4 years ago

@piernikowyludek one of the problems I have is that this model takes data serially and outputs it serially (in a queue). Does converting it to ONNX make parallel execution possible?

ajinkya933 commented 4 years ago

I have exported this graph to ONNX and I've added the details on how to do it in my fork here:

Hope it helps.

If anyone knows how to run inference with it, please tell me.

hiepph commented 4 years ago

Thanks @piernikowyludek and @ajinkya933 for providing the solution. I managed to export successfully.

Here is a sample script to run inference with onnxruntime:

import torch
import cv2
import onnxruntime as rt

import craft_utils
import imgproc

sess = rt.InferenceSession("onnx/craft.onnx")
input_name = sess.get_inputs()[0].name

img = cv2.imread('./test.jpg')
img_resized, target_ratio, size_heatmap = imgproc.resize_aspect_ratio(img, 1280, interpolation=cv2.INTER_LINEAR, mag_ratio=1.5)
ratio_h = ratio_w = 1 / target_ratio
x = imgproc.normalizeMeanVariance(img_resized)
x = torch.from_numpy(x).permute(2, 0, 1)    # [h, w, c] to [c, h, w]
x = x.unsqueeze(0)                # [c, h, w] to [b, c, h, w]

y, _ = sess.run(None, {input_name: x.numpy()})

# make score and link map
score_text = y[0, :, :, 0]
score_link = y[0, :, :, 1]

# Post-processing
boxes = craft_utils.getDetBoxes(score_text, score_link, 0.5, 0.4, 0.4)
boxes = craft_utils.adjustResultCoordinates(boxes, ratio_w, ratio_h)

ApurvaDani commented 4 years ago

So, I exported to ONNX successfully, but when I try to run inference with the model, it fails if the image size is different from the one I provided during the export. How are we handling dynamic input sizes, since CRAFT can work with images of various sizes?

bharatsubedi commented 3 years ago

@ApurvaDani did you solve the problem of dynamic input size? Could you provide the solution?

SaeedArisha commented 3 years ago

Running into the same issue, any help?

bharatsubedi commented 3 years ago

@SaeedArisha check my code, it works for me.

import torch
from torch.autograd import Variable
import cv2
import imgproc
import os
from craft import CRAFT
from collections import OrderedDict

os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"

def copyStateDict(state_dict):
    if list(state_dict.keys())[0].startswith("module"):
        start_idx = 1
    else:
        start_idx = 0
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = ".".join(k.split(".")[start_idx:])
        new_state_dict[name] = v
    return new_state_dict

net = CRAFT()
net.load_state_dict(copyStateDict(torch.load('best_weights/14_best_20210622.pth')))
net = net.cuda()

input_batch = 1
input_channel = 3
input_h = 768 #448
input_w = 768 #448
output_batch = input_batch
output_h = input_h / 2
output_w = input_w / 2
inputc = torch.randn(input_batch, input_channel, \
                     input_h, input_w, device='cuda')

outputc = net(inputc.cuda())
output_names = ["output1","output2"]
input_names = ["inputc"]
dynamic_axes = {'inputc': { 2: "inputc_h", 3: 'inputc_w'},'output1': { 2: "output1_h", 3: 'output1_w'} , 'output2': { 2: "output2_h", 3: 'output2_w'}}

torch.onnx.export(
    net,
    inputc,
    os.path.join('best_weights','14_best.onnx'),
    export_params=True,        # store the trained parameter weights inside the model file
    opset_version=10,          # the ONNX version to export the model to
    do_constant_folding=True,  # whether to execute constant folding for optimization
    verbose=True,
    input_names=['inputc'],
    output_names=['output1','output2'],
    dynamic_axes=dynamic_axes,
)

net = torch.nn.DataParallel(net)

net.eval()
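
A note on the dynamic_axes mapping in the script above: the inference scripts in this thread index the first output as y[0, :, :, 0], which suggests a channels-last layout [batch, h, w, 2], so its variable axes may be 1 and 2 rather than 2 and 3. The sketch below builds such a mapping. The layout of the second output, the half-resolution factor (taken from output_h = input_h / 2 above), and both helper names are assumptions that the thread does not confirm:

```python
def craft_dynamic_axes(input_name="inputc", heatmap_name="output1", feature_name="output2"):
    """Mark only the height/width axes of each tensor as variable."""
    return {
        input_name: {2: "in_h", 3: "in_w"},        # [b, c, h, w]
        heatmap_name: {1: "map_h", 2: "map_w"},    # assumed [b, h/2, w/2, 2]
        feature_name: {2: "feat_h", 3: "feat_w"},  # assumed [b, c, h/2, w/2]
    }

def expected_heatmap_shape(input_h, input_w, batch=1):
    """Heatmap shape for a given input, assuming half resolution and 2 score channels."""
    return (batch, input_h // 2, input_w // 2, 2)

axes = craft_dynamic_axes()
print(axes["output1"])
print(expected_heatmap_shape(768, 768))
```

The returned dictionary has the shape torch.onnx.export expects for its dynamic_axes argument; comparing expected_heatmap_shape against the array returned by onnxruntime is one way to check that the assumed layout holds.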

SaeedArisha commented 3 years ago

@bharatsubedi Thank you, I'm trying this. Did the inference time improve through this?

zoldaten commented 2 years ago

Hi! I've exported CRAFT to ONNX with this code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import onnx

# import craft functions
from craft_text_detector import load_craftnet_model

# load models
craft_net = load_craftnet_model(cuda=False)

dummy_input = torch.randn(1, 3, 480, 640)
torch_out = torch.onnx.export(craft_net, dummy_input, 'model.onnx', verbose=True, opset_version=11)

onnx_model = onnx.load('model.onnx')

* 480, 640 because I had errors with the inputs.

Then I use model.onnx with this code:

from datetime import datetime
start = datetime.now()

from craft_text_detector import export_detected_regions
import torch
import cv2
import onnxruntime as rt

import craft_utils
import imgproc

sess = rt.InferenceSession("model.onnx")
input_name = sess.get_inputs()[0].name

output_dir = 'outputs/'

img = cv2.imread('frame1.jpg')
#print(img.shape) (1944, 2592, 3)
img_resized, target_ratio, size_heatmap = imgproc.resize_aspect_ratio(img, 640, interpolation=cv2.INTER_LINEAR, mag_ratio=1.5)

ratio_h = ratio_w = 1 / target_ratio
x = imgproc.normalizeMeanVariance(img_resized)
x = torch.from_numpy(x).permute(2, 0, 1)    # [h, w, c] to [c, h, w]
x = x.unsqueeze(0)                # [c, h, w] to [b, c, h, w]

y, _ = sess.run(None, {input_name: x.numpy()})

# make score and link map
score_text = y[0, :, :, 0]
score_link = y[0, :, :, 1]

boxes = craft_utils.getDetBoxes(score_text, score_link, 0.7, 0.4, 0.4)
boxes = craft_utils.adjustResultCoordinates(boxes, ratio_w, ratio_h)

print(datetime.now() - start)

But it crashed at the line: boxes = craft_utils.adjustResultCoordinates(boxes, ratio_w, ratio_h)

Traceback (most recent call last):
  File "/home/pi/CRAFT-pytorch/", line 34, in <module>
    boxes = craft_utils.adjustResultCoordinates(boxes, ratio_w, ratio_h)
  File "/home/pi/CRAFT-pytorch/", line 242, in adjustResultCoordinates
    polys[k] *= (ratio_w * ratio_net, ratio_h * ratio_net)
ValueError: operands could not be broadcast together with shapes (110,) (2,) (110,) 

I think something went wrong in the conversion to ONNX, but I didn't get any error. Could you help, please?
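
One possible cause, assuming craft_utils.getDetBoxes returns a (boxes, polys) pair as the later script in this thread unpacks it: assigning the whole pair to boxes hands adjustResultCoordinates a 2-element tuple of length-N lists instead of N boxes, which would be consistent with the (110,) versus (2,) shapes in the error. A pure-Python sketch with stand-in functions; get_det_boxes_stub and scale_boxes are hypothetical and only mimic the shapes involved:

```python
def get_det_boxes_stub(n):
    """Stand-in for a detector that returns a (boxes, polys) pair."""
    boxes = [[(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)] for _ in range(n)]
    polys = [None] * n
    return boxes, polys

def scale_boxes(boxes, ratio_w, ratio_h, ratio_net=2):
    """Map heatmap coordinates back to the original image.

    ratio_net=2 assumes the heatmap is half the network input resolution.
    """
    sx, sy = ratio_w * ratio_net, ratio_h * ratio_net
    return [[(x * sx, y * sy) for x, y in box] for box in boxes]

result = get_det_boxes_stub(110)
print(len(result), len(result[0]))  # length of the pair vs. number of boxes inside it

boxes, polys = result               # unpack before rescaling
scaled = scale_boxes(boxes, ratio_w=1.5, ratio_h=1.5)
print(len(scaled), scaled[0][2])
```

If this is the cause, the ONNX conversion itself may be fine and only the post-processing call needs the unpacking.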

kishcs commented 2 years ago

@ajinkya933 The fork linked above does not show the steps to convert the pth model to an ONNX model. Can you please provide some steps or a guide for the conversion? That would be a great help.

zoldaten commented 2 years ago

Actually, I used the code from this repo; you only need to correct dummy_input = torch.randn(1, 3, 480, 640), where 480 and 640 are the height and width of the input image. They can be whatever you want, because the image is downsized at inference. These parameters affect the overall speed of CRAFT, but if the image size is too small the algorithm won't work. For a Raspberry Pi 4B, 384x512 is optimal (in my opinion).

bharatsubedi's code also works. There you set:

input_h = 512 #448
input_w = 384 #448

Finally, at inference you should use a bigger image size. In this case, use 1000:

img_resized, target_ratio, size_heatmap = imgproc.resize_aspect_ratio(img, 1000, \
                                    interpolation=cv2.INTER_LINEAR, mag_ratio=1.5)

The full inference code looks like this (remember that I mix packages from the repo above for easy export):

from datetime import datetime
start = datetime.now()

from craft_text_detector import export_detected_regions
import torch
import cv2
import onnxruntime as rt

import craft_utils
import imgproc

#refine_net = load_refinenet_model(cuda=False)

#sess = rt.InferenceSession("model_640_480.onnx") #resize to 320x240 - bad results
sess = rt.InferenceSession("model_512_384.onnx") #1000x960 resize to 512x384
input_name = sess.get_inputs()[0].name

output_dir = 'outputs/'

img = cv2.imread('frame1.jpg')
#print(img.shape) (1944, 2592, 3)
img_resized, target_ratio, size_heatmap = imgproc.resize_aspect_ratio(img, 1000, \
                                    interpolation=cv2.INTER_LINEAR, mag_ratio=1.5)

ratio_h = ratio_w = 1 / target_ratio
x = imgproc.normalizeMeanVariance(img_resized)
x = torch.from_numpy(x).permute(2, 0, 1)    # [h, w, c] to [c, h, w]
x = x.unsqueeze(0)                # [c, h, w] to [b, c, h, w]

y, feature = sess.run(None, {input_name: x.numpy()})

# make score and link map
score_text = y[0, :, :, 0]
score_link = y[0, :, :, 1]

# refine link
#with torch.no_grad():
#    y_refiner = refine_net(y, feature)
#score_link = y_refiner[0,:,:,0].cpu().data.numpy()

boxes, polys = craft_utils.getDetBoxes(score_text, score_link, 0.7, 0.4, 0.4, True)

#boxes = craft_utils.getDetBoxes(score_text, score_link, 0.7, 0.4, 0.4)
polys = craft_utils.adjustResultCoordinates(polys, ratio_w, ratio_h)
boxes = craft_utils.adjustResultCoordinates(boxes, ratio_w, ratio_h)

print(datetime.now() - start)


import file_utils
file_utils.saveResult('outputs/', img[:,:,::-1], boxes, dirname=output_dir)

* I couldn't get polys to work in the code (it gives an error), so I use boxes.
** The next step is working out how to hook RefineNet into the code.

And yes, the ONNX model runs faster than the original CRAFT model (time for model load and inference on a Raspberry Pi 4B): onnx: craft as is:
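
To reproduce a comparison like this, one could time model load and inference separately for each backend. A minimal sketch using only the standard library; load_fn and infer_fn are placeholders for whatever loads the model and runs one forward pass, not functions from CRAFT or onnxruntime:

```python
import time

def time_call(fn, *args, repeats=1):
    """Return (last_result, average_seconds) over `repeats` calls of fn(*args)."""
    result = None
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed / repeats

def benchmark(load_fn, infer_fn, sample, repeats=5):
    """Time loading once and inference averaged over several runs."""
    model, load_s = time_call(load_fn)
    _, infer_s = time_call(infer_fn, model, sample, repeats=repeats)
    return {"load_s": load_s, "infer_s": infer_s}

# Dummy stand-ins so the sketch runs on its own.
stats = benchmark(lambda: "model", lambda m, x: x * 2, sample=21)
print(stats)
```

Averaging over several inference runs smooths out one-off warm-up costs, which can otherwise dominate a single measurement on a small device.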

NeoDhirendra commented 1 year ago

Has anyone tried running inference with OpenCV's dnn readNet?