Media-Smart / vedastr

A scene text recognition toolbox based on PyTorch
Apache License 2.0

Failed to export an ONNX attribute 'onnx::Gather', since it's not constant, please try to make things (e.g., kernel size) static if possible #87

Open choozhenbo opened 2 years ago

choozhenbo commented 2 years ago

When I try to convert the resnet-bilstm-ctc model to ONNX format, I get the error "RuntimeError: Unsupported: ONNX export of operator adaptive pooling, since output_size is not constant". The complete output:

python tools/torch2onnx.py configs/resnet_bilstm_ctc.py /home/tham/Desktop/convert/text-recognition-resnet-bilstm-ctc/vedastr/ckpt/best_acc.pth /home/tham/Desktop/convert/text-recognition-resnet-bilstm-ctc/resnet_bilstm_ctc.onnx

2022-06-29 08:56:11,069 - INFO - Use GPU 0
2022-06-29 08:56:11,069 - INFO - Set cudnn deterministic False
2022-06-29 08:56:11,069 - INFO - Set cudnn benchmark True
2022-06-29 08:56:11,069 - INFO - Set seed 1111
2022-06-29 08:56:11,069 - INFO - Build model
2022-06-29 08:56:11,275 - INFO - GResNet init weights
2022-06-29 08:56:11,470 - INFO - CTCHead init weights
2022-06-29 08:56:13,259 - INFO - Load checkpoint from /home/tham/Desktop/convert/text-recognition-resnet-bilstm-ctc/vedastr/ckpt/best_acc.pth
Convert to Onnx with constant input shape 3,32,100 and opset version 11
Traceback (most recent call last):
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/symbolic_opset9.py", line 968, in symbolic_fn
    output_size = _parse_arg(output_size, "is")
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 84, in _parse_arg
    "', since it's not constant, please try to make "
RuntimeError: Failed to export an ONNX attribute 'onnx::Gather', since it's not constant, please try to make things (e.g., kernel size) static if possible

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/torch2onnx.py", line 92, in <module>
    main()
  File "tools/torch2onnx.py", line 83, in main
    do_constant_folding=args.do_constant_folding,
  File "/home/tham/vedastr/tools/volksdep/converters/torch2onnx.py", line 62, in torch2onnx
    dynamic_axes=dynamic_axes)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/__init__.py", line 320, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/utils.py", line 111, in export
    custom_opsets=custom_opsets, use_external_data_format=use_external_data_format)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/utils.py", line 729, in _export
    dynamic_axes=dynamic_axes)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/utils.py", line 501, in _model_to_graph
    module=module)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/utils.py", line 216, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/__init__.py", line 373, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/utils.py", line 1032, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/symbolic_opset9.py", line 970, in symbolic_fn
    return sym_help._onnx_unsupported("adaptive pooling, since output_size is not constant.")
  File "/home/tham/anaconda3/envs/vedastr/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py", line 263, in _onnx_unsupported
    "Please feel free to request support or submit a pull request on PyTorch GitHub.".format(op_name))
RuntimeError: Unsupported: ONNX export of operator adaptive pooling, since output_size is not constant.. Please feel free to request support or submit a pull request on PyTorch GitHub
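
If it helps narrow things down, here is a minimal standalone reduction of what I believe triggers the failure (my own assumption, not vedastr code): any module containing AdaptiveAvgPool2d with a None in its output_size seems to hit the same check, because the None makes the pooled width depend on the input shape.

# Minimal reduction (assumption about the root cause, not vedastr code):
# AdaptiveAvgPool2d((1, None)) keeps the input width, so during tracing the
# output_size becomes a Gather over the input shape instead of a constant,
# and the ONNX symbolic function refuses to export it.
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((1, None))  # same op as input_pool / head pool in the config below
dummy = torch.randn(1, 512, 2, 26)      # any 4D tensor should reproduce it

# I expect this to raise the same "Failed to export an ONNX attribute
# 'onnx::Gather', since it's not constant ..." RuntimeError.
torch.onnx.export(pool, dummy, 'adaptive_pool.onnx', opset_version=11)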

This is my config file for the model:

# 1. deploy
size = (32, 100)
mean, std = 0.5, 0.5

sensitive = False
character = '0123456789abcdefghijklmnopqrstuvwxyz'
batch_max_length = 25

F = 20
hidden_dim = 256
norm_cfg = dict(type='BN')
num_class = len(character) + 1
num_steps = batch_max_length + 1

deploy = dict(
    transform=[
        dict(type='Sensitive', sensitive=sensitive, need_character=character),
        dict(type='ToGray'),
        dict(type='Resize', size=size),
        dict(type='Normalize', mean=mean, std=std),
        dict(type='ToTensor'),
    ],
    converter=dict(
        type='CTCConverter',
        character=character,
        batch_max_length=batch_max_length,
    ),
    model=dict(
        type='GModel',
        need_text=False,
        body=dict(
            type='GBody',
            pipelines=[
                dict(
                    type='FeatureExtractorComponent',
                    from_layer='input',
                    to_layer='cnn_feat',
                    arch=dict(
                        encoder=dict(
                            backbone=dict(
                                type='GResNet',
                                layers=[
                                    ('conv', dict(type='ConvModule', in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=1, norm_cfg=norm_cfg)),
                                    ('conv', dict(type='ConvModule', in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1, norm_cfg=norm_cfg)),
                                    ('pool', dict(type='MaxPool2d', kernel_size=2, stride=2, padding=0)),
                                    ('block', dict(block_name='BasicBlock', planes=128, blocks=1, stride=1)),
                                    ('conv', dict(type='ConvModule', in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1, norm_cfg=norm_cfg)),
                                    ('pool', dict(type='MaxPool2d', kernel_size=2, stride=2, padding=0)),
                                    ('block', dict(block_name='BasicBlock', planes=256, blocks=2, stride=1)),
                                    ('conv', dict(type='ConvModule', in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1, norm_cfg=norm_cfg)),
                                    ('pool', dict(type='MaxPool2d', kernel_size=2, stride=(2, 1), padding=(0, 1))),
                                    ('block', dict(block_name='BasicBlock', planes=512, blocks=5, stride=1)),
                                    ('conv', dict(type='ConvModule', in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1, norm_cfg=norm_cfg)),
                                    ('block', dict(block_name='BasicBlock', planes=512, blocks=3, stride=1)),
                                    ('conv', dict(type='ConvModule', in_channels=512, out_channels=512, kernel_size=2, stride=(2, 1), padding=(0, 1), norm_cfg=norm_cfg)),
                                    ('conv', dict(type='ConvModule', in_channels=512, out_channels=512, kernel_size=2, stride=1, padding=0, norm_cfg=norm_cfg)),
                                ],
                            ),
                        ),
                        collect=dict(type='CollectBlock', from_layer='c4'),
                    ),
                ),
                dict(
                    type='SequenceEncoderComponent',
                    from_layer='cnn_feat',
                    to_layer='rnn_feat',
                    arch=dict(
                        type='RNN',
                        input_pool=dict(type='AdaptiveAvgPool2d', output_size=(1, None)),
                        layers=[
                            ('rnn', dict(type='LSTM', input_size=512, hidden_size=256, bidirectional=True, batch_first=True)),
                            ('fc', dict(type='Linear', in_features=512, out_features=256)),
                            ('rnn', dict(type='LSTM', input_size=256, hidden_size=256, bidirectional=True, batch_first=True)),
                            ('fc', dict(type='Linear', in_features=512, out_features=256)),
                        ],
                    ),
                ),
            ],
        ),
        head=dict(
            type='CTCHead',
            from_layer='rnn_feat',
            num_class=num_class,
            in_channels=256,
            pool=dict(
                type='AdaptiveAvgPool2d',
                output_size=(1, None),
            ),
        ),
    ),
)
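
The deploy section above is where the two AdaptiveAvgPool2d layers with output_size=(1, None) live (the RNN input_pool and the CTCHead pool). Since the deploy input shape is fixed to 3,32,100 anyway, I assume they could be given a constant width before export. Below is only a sketch of that idea; W is my guess at the fixed cnn_feat width for a 32x100 input and should be checked with a single forward pass, it is not something taken from vedastr.

# Sketch of the workaround I have in mind (my assumption, not a vedastr API):
# once output_size is a constant tuple, the same pooling exports cleanly.
import torch
import torch.nn as nn

W = 26  # guess at the fixed cnn_feat width for a 1x32x100 input; verify first
static_pool = nn.AdaptiveAvgPool2d((1, W))   # constant output_size instead of (1, None)
dummy = torch.randn(1, 512, 2, W)
torch.onnx.export(static_pool, dummy, 'static_pool.onnx', opset_version=11)
print('export with constant output_size succeeded')

If that is right, the config change would just be replacing None with the constant in both output_size entries, but I am not sure whether anything else relies on the dynamic width.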

###############################################################################
# 2. common
common = dict(
    seed=1111,
    logger=dict(
        handlers=(
            dict(type='StreamHandler', level='INFO'),
            dict(type='FileHandler', level='INFO'),
        ),
    ),
    cudnn_deterministic=False,
    cudnn_benchmark=True,
    metric=dict(type='Accuracy'),
)
###############################################################################
dataset_params = dict(
    batch_max_length=batch_max_length,
    data_filter=True,
    character=character,
)

test_dataset_params = dict(
    batch_max_length=batch_max_length,
    data_filter=False,
    character=character,
)

data_root = '/home/tham/vedastr/data/data_lmdb_release/'

###############################################################################
# 3. test
batch_size = 64

test_root = data_root + 'evaluation/'
test_folder_names = ['CAR_PLATE']
test_folder_names = ['IC03_867']
test_folder_names = ['CUTE80', 'IC03_867', 'IC13_1015', 'IC15_2077',
                     'IIIT5k_3000', 'SVT', 'SVTP', 'CAR_PLATE']

test_dataset = [dict(type='LmdbDataset', root=test_root + f_name, **test_dataset_params)
                for f_name in test_folder_names]

test = dict(
    data=dict(
        dataloader=dict(
            type='DataLoader',
            batch_size=batch_size,
            num_workers=4,
            shuffle=False,
        ),
        dataset=test_dataset,
        transform=deploy['transform'],
    ),
    postprocess_cfg=dict(
        sensitive=sensitive,
        character=character,
    ),
)
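
For reference, each entry of test_dataset above expands to a plain dict; the first folder, for example, becomes the following (illustration only, values taken from the variables above):

# Expansion of one test_dataset element (illustration only)
example_test_dataset_entry = dict(
    type='LmdbDataset',
    root='/home/tham/vedastr/data/data_lmdb_release/evaluation/CUTE80',
    batch_max_length=25,
    data_filter=False,
    character='0123456789abcdefghijklmnopqrstuvwxyz',
)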

###############################################################################
# 4. train
# work directory
root_workdir = 'workdir'

# train data
train_root = data_root + 'training/'

# MJ dataset
train_root_mj = train_root + 'MJ/'
mj_folder_names = ['/MJ_test', 'MJ_valid', 'MJ_train']
mj_folder_names = ['MJ_train']

# ST dataset
train_root_st = train_root + 'ST/'

train_dataset_mj = [dict(type='LmdbDataset', root=train_root_mj + folder_name)
                    for folder_name in mj_folder_names]
train_dataset_st = [dict(type='LmdbDataset', root=train_root_st)]

# CAR_PLATE dataset
train_root_car_plate = train_root + 'CAR_PLATE/'
train_dataset_car_plate = [dict(type='LmdbDataset', root=train_root_car_plate)]

# valid
valid_root = data_root + 'validation/'
valid_root = valid_root + 'CAR_PLATE/'
valid_dataset = dict(type='LmdbDataset', root=valid_root, **test_dataset_params)
valid_folder_names = ['CAR_PLATE', 'ORIGINAL']

# train transforms
train_transforms = [
    dict(type='Sensitive', sensitive=sensitive, need_character=character),
    dict(type='ToGray'),
    dict(type='Resize', size=size),
    dict(type='Normalize', mean=mean, std=std),
    dict(type='ToTensor'),
]

max_iterations = 300000
milestones = [100000, 200000]

train = dict(
    data=dict(
        train=dict(
            dataloader=dict(
                type='DataLoader',
                batch_size=batch_size,
                num_workers=4,
            ),
            sampler=dict(
                type='BalanceSampler',
                batch_size=batch_size,
                shuffle=True,
                oversample=True,
            ),
            dataset=dict(
                type='ConcatDatasets',
                datasets=[
                    dict(
                        type='ConcatDatasets',
                        datasets=train_dataset_mj,
                    ),
                    dict(
                        type='ConcatDatasets',
                        datasets=train_dataset_st,
                    ),
                    dict(
                        type='ConcatDatasets',
                        datasets=train_dataset_car_plate,
                    ),
                ],
                batch_ratio=[0.475, 0.475, 0.05],
                # batch_ratio=[1.0],
                **dataset_params,
            ),
            transform=train_transforms,
        ),
        val=dict(
            dataloader=dict(
                type='DataLoader',
                batch_size=batch_size,
                num_workers=4,
                shuffle=False,
            ),
            dataset=valid_dataset,
            transform=deploy['transform'],
        ),
    ),
    optimizer=dict(type='Adadelta', lr=1.0, rho=0.95, eps=1e-8),
    criterion=dict(type='CTCLoss', zero_infinity=True),
    lr_scheduler=dict(type='StepLR',
                      iter_based=True,
                      milestones=milestones,
                      ),
    max_iterations=max_iterations,
    log_interval=10,
    trainval_ratio=2000,
    snapshot_interval=20000,
    save_best=True,
    resume=None,
)
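
For context, my understanding is that the optimizer and lr_scheduler entries correspond roughly to the following plain-PyTorch setup (my interpretation only, with iteration-based stepping; vedastr builds these objects from the config itself):

# Rough plain-PyTorch reading of the optimizer / lr_scheduler entries above
# (interpretation only; the names below are placeholders, not vedastr code).
import torch

model = torch.nn.Linear(10, 10)  # stand-in module just to have parameters

optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0, rho=0.95, eps=1e-8)
# StepLR with milestones and iter_based=True reads to me like MultiStepLR
# stepped once per training iteration rather than once per epoch.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100000, 200000])

for iteration in range(3):  # stand-in for the 300000-iteration training loop
    optimizer.step()
    scheduler.step()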

choozhenbo commented 2 years ago

@hxcai Is my configuration correct?