PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

KIE: SER task for invoice key information extraction falls short of expectations #10127

Closed kerry-weic closed 4 months ago

kerry-weic commented 1 year ago

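The expected results of the official SER training are shown here: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/kie/README_ch.md#3-%E6%95%88%E6%9E%9C%E6%BC%94%E7%A4%BA Dataset: https://aistudio.baidu.com/aistudio/datasetdetail/125158 After downloading it, I annotated the images for the key information extraction model with PPOCRLabel, using 52 images as training data and 13 as validation data. The versions of the Paddle-related packages are as follows: image

Training annotation data (one sample):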

b1.jpg [{"transcription":"广东增","label":"OTHER","points":[[1638,387],[2004,371],[2009,482],[1643,498]],"id":4095,"linking":[]},{"transcription":"4400154130","label":"OTHER","points":[[769,447],[1425,466],[1423,559],[766,540]],"id":4096,"linking":[]},{"transcription":"12270242","label":"NO_VALUE","points":[[3092,442],[3629,442],[3629,534],[3092,534]],"id":4097,"linking":[]},{"transcription":"4400154130","label":"OTHER","points":[[3612,465],[3949,465],[3949,527],[3612,527]],"id":4098,"linking":[]},{"transcription":"12270242","label":"OTHER","points":[[3621,541],[3949,541],[3949,616],[3621,616]],"id":4099,"linking":[]},{"transcription":"开票日期:","label":"OTHER","points":[[3046,656],[3432,656],[3432,718],[3046,718]],"id":4100,"linking":[]},{"transcription":"2016年06月12日","label":"OTHER","points":[[3432,647],[3892,647],[3892,709],[3432,709]],"id":4101,"linking":[]},{"transcription":"中","label":"OTHER","points":[[26,660],[153,660],[153,727],[26,727]],"id":4102,"linking":[]},{"transcription":"名称","label":"NAME_KEY","points":[[517,820],[907,820],[907,874],[517,874]],"id":4103,"linking":[[4103,4104]]},{"transcription":"深圳市购机汇网络有限公司","label":"NAME_VALUE","points":[[1021,820],[1745,820],[1745,882],[1021,882]],"id":4104,"linking":[]},{"transcription":"6/3-028486</371/>>7137+","label":"OTHER","points":[[2599,816],[3831,806],[3831,882],[2600,891]],"id":4105,"linking":[]},{"transcription":"密","label":"OTHER","points":[[2477,838],[2516,838],[2516,882],[2477,882]],"id":4106,"linking":[]},{"transcription":"购","label":"OTHER","points":[[394,869],[438,869],[438,917],[394,917]],"id":4107,"linking":[]},{"transcription":"纳税人识别号:","label":"IDENTIFY_NUMBER_KEY","points":[[510,895],[939,905],[937,980],[508,970]],"id":4108,"linking":[[4108,4109]]},{"transcription":"<332/4845/-27148959768","label":"OTHER","points":[[2630,895],[3831,886],[3831,948],[2630,957]],"id":4110,"linking":[]},{"transcription":"440300083885931","label":"IDENTIFY_NUMBER_VALUE","points":[[1074,913],[1858,908],[1859,971],[1074,
975]],"id":4109,"linking":[]},{"transcription":"买","label":"OTHER","points":[[371,971],[423,948],[453,1014],[401,1038]],"id":4111,"linking":[]},{"transcription":"码","label":"OTHER","points":[[2477,962],[2525,962],[2525,1002],[2477,1002]],"id":4112,"linking":[]},{"transcription":"/>*0497-4/<377816+5+761/--5","label":"OTHER","points":[[2608,962],[3844,948],[3844,1024],[2609,1037]],"id":4113,"linking":[]},{"transcription":"地址、电话:","label":"ADDRESS_PHONE_KEY","points":[[505,988],[939,998],[937,1073],[503,1063]],"id":4114,"linking":[[4114,4115]]},{"transcription":"深圳市龙华新区民治街道民治大道酒科技大厦A12070755-23806606","label":"ADDRESS_PHONE_VALUE","points":[[1012,1006],[2362,1002],[2363,1064],[1013,1068]],"id":4115,"linking":[]},{"transcription":"税","label":"OTHER","points":[[259,1033],[320,1033],[320,1104],[259,1104]],"id":4116,"linking":[]},{"transcription":"127<8*32/4+45<4933///8>48","label":"OTHER","points":[[2603,1037],[3844,1023],[3844,1099],[2604,1113]],"id":4117,"linking":[]},{"transcription":"方","label":"OTHER","points":[[381,1064],[438,1064],[438,1130],[381,1130]],"id":4118,"linking":[]},{"transcription":"开户行及账号:","label":"BANK_ACCOUNT_KEY","points":[[505,1085],[934,1095],[933,1171],[503,1161]],"id":4119,"linking":[[4119,4120]]},{"transcription":"中国工商银行股份有限公司深圳园岭支行4000024709200172809","label":"BANK_ACCOUNT_VALUE","points":[[1012,1104],[2428,1095],[2429,1157],[1013,1166]],"id":4120,"linking":[]},{"transcription":"税率","label":"OTHER","points":[[3301,1174],[3450,1174],[3450,1259],[3301,1259]],"id":4121,"linking":[]},{"transcription":"第二","label":"OTHER","points":[[4020,1166],[4072,1166],[4072,1281],[4020,1281]],"id":4122,"linking":[]},{"transcription":"数量","label":"OTHER","points":[[2091,1188],[2271,1188],[2271,1268],[2091,1268]],"id":4123,"linking":[]},{"transcription":"单价","label":"PRICE_KEY","points":[[2459,1183],[2643,1183],[2643,1263],[2459,1263]],"id":4124,"linking":[[4124,4125]]},{"transcription":"金额","label":"MONEY_KEY","points":[[2884,1183],[3122,1183],[3122,1252],[2884
,1252]],"id":4126,"linking":[[4126,4127]]},{"transcription":"税额","label":"OTHER","points":[[3607,1179],[3866,1179],[3866,1254],[3607,1254]],"id":4128,"linking":[]},{"transcription":"货物或应税劳务、服务名称","label":"SERVER_KEY","points":[[456,1197],[1214,1197],[1214,1272],[456,1272]],"id":4129,"linking":[[4129,4132],[0,0],[0,0]]},{"transcription":"规格型号","label":"OTHER","points":[[1411,1201],[1679,1201],[1679,1263],[1411,1263]],"id":4133,"linking":[]},{"transcription":"单位","label":"OTHER","points":[[1823,1197],[1955,1197],[1955,1268],[1823,1268]],"id":4134,"linking":[]},{"transcription":"2987.18","label":"MONEY_VALUE","points":[[3068,1272],[3282,1267],[3284,1334],[3069,1339]],"id":4127,"linking":[]},{"transcription":"%21","label":"OTHER","points":[[3351,1270],[3457,1257],[3465,1327],[3360,1340]],"id":4135,"linking":[]},{"transcription":"507.82","label":"OTHER","points":[[3804,1269],[3991,1257],[3994,1324],[3807,1335]],"id":4136,"linking":[]},{"transcription":"[20","label":"OTHER","points":[[254,1281],[320,1281],[320,1392],[254,1392]],"id":4137,"linking":[]},{"transcription":"小米 红米3全网通版 时尚金色","label":"SERVER_VALUE","points":[[434,1290],[1262,1290],[1262,1352],[434,1352]],"id":4130,"linking":[]},{"transcription":"红米3","label":"OTHER","points":[[1346,1290],[1512,1290],[1512,1356],[1346,1356]],"id":4138,"linking":[]},{"transcription":"个","label":"OTHER","points":[[1885,1285],[1946,1285],[1946,1347],[1885,1347]],"id":4139,"linking":[]},{"transcription":"597.43589744","label":"PRICE_VALUE","points":[[2433,1281],[2740,1281],[2740,1343],[2433,1343]],"id":4125,"linking":[]},{"transcription":"移动联通电信4G手机 
双卡双待","label":"SERVER_VALUE","points":[[430,1361],[1201,1361],[1201,1423],[430,1423]],"id":4131,"linking":[]},{"transcription":"15]","label":"OTHER","points":[[250,1378],[320,1378],[320,1489],[250,1489]],"id":4140,"linking":[]},{"transcription":"-1776.07","label":"OTHER","points":[[3051,1414],[3283,1414],[3283,1489],[3051,1489]],"id":4141,"linking":[]},{"transcription":"17%","label":"OTHER","points":[[3360,1416],[3461,1403],[3470,1473],[3369,1487]],"id":4142,"linking":[]},{"transcription":"301.93","label":"OTHER","points":[[3809,1414],[4011,1414],[4011,1476],[3809,1476]],"id":4143,"linking":[]},{"transcription":"抵扣联","label":"OTHER","points":[[4008,1409],[4088,1403],[4105,1622],[4025,1628]],"id":4144,"linking":[]},{"transcription":"折扣(59.456%)","label":"SERVER_VALUE","points":[[425,1445],[855,1445],[855,1507],[425,1507]],"id":4132,"linking":[]},{"transcription":"600","label":"OTHER","points":[[250,1489],[316,1489],[316,1649],[250,1649]],"id":4145,"linking":[]},{"transcription":"购买方扣税凭证","label":"OTHER","points":[[4023,1620],[4102,1617],[4121,2090],[4042,2093]],"id":4146,"linking":[]},{"transcription":"海南","label":"OTHER","points":[[241,1706],[320,1706],[320,1875],[241,1875]],"id":4147,"linking":[]},{"transcription":"C","label":"OTHER","points":[[53,1720],[118,1720],[118,1782],[53,1782]],"id":4148,"linking":[]},{"transcription":"华","label":"OTHER","points":[[254,1857],[311,1857],[311,1937],[254,1937]],"id":4149,"linking":[]},{"transcription":"¥205.89","label":"OTHER","points":[[3703,1871],[4021,1856],[4025,1936],[3707,1951]],"id":4150,"linking":[]},{"transcription":"¥1211.11","label":"OTHER","points":[[2950,1884],[3292,1884],[3292,1946],[2950,1946]],"id":4151,"linking":[]},{"transcription":"森","label":"OTHER","points":[[245,1915],[316,1915],[316,2008],[245,2008]],"id":4152,"linking":[]},{"transcription":"计","label":"OTHER","points":[[960,1910],[1039,1910],[1039,1985],[960,1985]],"id":4153,"linking":[]},{"transcription":"合","label":"OTHER","points":[[618,1928],[666,19
28],[666,1972],[618,1972]],"id":4154,"linking":[]},{"transcription":"C","label":"OTHER","points":[[48,1959],[118,1959],[118,2016],[48,2016]],"id":4155,"linking":[]},{"transcription":"实","label":"OTHER","points":[[250,1999],[311,1999],[311,2074],[250,2074]],"id":4156,"linking":[]},{"transcription":"¥1417.00","label":"TOTAL_PRICE_VALUE","points":[[3318,1999],[3733,1989],[3735,2065],[3319,2075]],"id":4157,"linking":[]},{"transcription":"壹仟肆佰壹拾柒圆整","label":"OTHER","points":[[1411,2017],[2121,2007],[2122,2083],[1412,2092]],"id":4158,"linking":[]},{"transcription":"(小写)","label":"OTHER","points":[[3029,2012],[3244,2012],[3244,2079],[3029,2079]],"id":4159,"linking":[]},{"transcription":"价税合计(大写)","label":"TOTAL_PRICE_KEY","points":[[544,2039],[1078,2039],[1078,2114],[544,2114]],"id":4160,"linking":[[4160,4157]]},{"transcription":"卡","label":"OTHER","points":[[250,2083],[307,2083],[307,2163],[250,2163]],"id":4161,"linking":[]},{"transcription":"公","label":"OTHER","points":[[250,2149],[307,2149],[307,2234],[250,2234]],"id":4162,"linking":[]},{"transcription":"dd42981320128(00001,1956","label":"OTHER","points":[[2617,2150],[3340,2145],[3340,2207],[2617,2212]],"id":4163,"linking":[]},{"transcription":"广州晶东贸易有限公司","label":"SELLER_NAME_VALUE","points":[[990,2168],[1625,2149],[1627,2224],[992,2243]],"id":4164,"linking":[]},{"transcription":"名称","label":"SELLER_NAME_KEY","points":[[487,2180],[881,2180],[881,2248],[487,2248]],"id":4165,"linking":[[4165,4164]]},{"transcription":"销","label":"OTHER","points":[[373,2211],[425,2211],[425,2282],[373,2282]],"id":4166,"linking":[]},{"transcription":"91440101664041243T","label":"OTHER","points":[[1074,2252],[2051,2238],[2052,2313],[1075,2327]],"id":4167,"linking":[]},{"transcription":"纳税人识别号:","label":"OTHER","points":[[477,2265],[915,2255],[917,2331],[479,2341]],"id":4168,"linking":[]},{"transcription":"售","label":"OTHER","points":[[364,2313],[421,2313],[421,2393],[364,2393]],"id":4169,"linking":[]},{"transcription":"地址电话:","label":"OTHER"
,"points":[[478,2344],[916,2344],[916,2420],[478,2420]],"id":4170,"linking":[]},{"transcription":"广州市黄埔区九龙镇九龙工业园凤凰三横路99号 66215500","label":"OTHER","points":[[990,2340],[2336,2326],[2337,2402],[991,2416]],"id":4171,"linking":[]},{"transcription":"注","label":"OTHER","points":[[2503,2389],[2542,2389],[2542,2433],[2503,2433]],"id":4172,"linking":[]},{"transcription":"方","label":"OTHER","points":[[359,2433],[412,2433],[412,2495],[359,2495]],"id":4173,"linking":[]},{"transcription":"开户行及账号:","label":"OTHER","points":[[478,2446],[907,2446],[907,2508],[478,2508]],"id":4174,"linking":[]},{"transcription":"工行北京路支行3602000919200384952","label":"OTHER","points":[[990,2438],[2112,2419],[2113,2495],[991,2513]],"id":4175,"linking":[]},{"transcription":"开票人:陈秋兼","label":"OTHER","points":[[2238,2523],[2780,2494],[2785,2587],[2243,2616]],"id":4176,"linking":[]},{"transcription":"复核:张雪","label":"OTHER","points":[[1427,2537],[1815,2511],[1821,2600],[1433,2625]],"id":4177,"linking":[]},{"transcription":"发票专用章","label":"OTHER","points":[[3372,2529],[3730,2439],[3758,2550],[3399,2641]],"id":4178,"linking":[]},{"transcription":"收款人:王梅","label":"OTHER","points":[[371,2553],[817,2534],[821,2614],[375,2634]],"id":4179,"linking":[]},{"transcription":"万","label":"OTHER","points":[[3292,2548],[3358,2548],[3358,2593],[3292,2593]],"id":4180,"linking":[]},{"transcription":"(5)","label":"OTHER","points":[[3531,2603],[3655,2591],[3662,2662],[3538,2674]],"id":4181,"linking":[]},{"transcription":"No","label":"NO_KEY","points":[[2911,426],[3037,426],[3037,552],[2911,552]],"id":4182,"linking":[[4182,4097]]}]

The training configuration file is as follows:

Global:
  use_gpu: True
  epoch_num: &epoch_num 200
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/ser_vi_layoutxlm_my_tax2_zh
  save_epoch_step: 2000
  # evaluation is run every 19 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: ppstructure/docs/kie/input/zh_val_42.jpg
  d2s_train_image_shape: [3, 224, 224]
  # if you want to predict using the groundtruth ocr info,
  # you can use the following config
  # infer_img: train_data/XFUND/zh_val/val.json
  # infer_mode: False

  save_res_path: ./output/ser/my_tax2_zh/res
  kie_rec_model_dir: 
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForSer
    pretrained: True
    checkpoints:
    # one of base or vi
    mode: vi
    num_classes: &num_classes 41

Loss:
  name: VQASerTokenLayoutLMLoss
  num_classes: *num_classes
  key: "backbone_out"

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Linear
    learning_rate: 0.00005
    epochs: *epoch_num
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQASerTokenLayoutLMPostProcess
  class_path: &class_path train_data/my_tax2/class_list.txt

Metric:
  name: VQASerTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/my_tax2/train
    label_file_list: 
      - train_data/my_tax2/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: &use_textline_bbox_info True
          # one of [None, "tb-yx"]
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 8
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/my_tax2/val
    label_file_list:
      - train_data/my_tax2/val.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 8
    num_workers: 4
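
As a sanity check on this configuration, the number of optimizer steps can be worked out from the dataset size stated earlier (52 training images, 13 validation images), `batch_size_per_card: 8`, `drop_last: False` and `epoch_num: 200`. The sketch below is plain arithmetic; `steps_per_epoch` is a hypothetical helper, and it assumes single-GPU training with exactly one sample per image.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches one pass over the dataset produces."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


train_steps = steps_per_epoch(52, 8)    # batches per training epoch
total_steps = train_steps * 200         # over epoch_num = 200
eval_batches = steps_per_epoch(13, 8)   # batches per evaluation pass

print(train_steps, total_steps, eval_batches)
```

Under these assumptions the totals are 7 steps per epoch, 1400 steps overall and 2 evaluation batches, which agrees with the `global_step: 1400` at epoch 200 and the `2/2` evaluation progress reported in the logs of this post.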

Training class dictionary:

OTHER
NAME_KEY
TOTAL_PRICE_KEY
NO_KEY
ADDRESS_PHONE_VALUE
SERVER_VALUE
IDENTIFY_NUMBER_KEY
ADDRESS_PHONE_KEY
NO_VALUE
IDENTIFY_NUMBER_VALUE
BANK_ACCOUNT_KEY
BANK_ACCOUNT_VALUE
MONEY_KEY
SERVER_KEY
PRICE_KEY
SELLER_NAME_KEY
SELLER_NAME_VALUE
MONEY_VALUE
TOTAL_PRICE_VALUE
PRICE_VALUE
NAME_VALUE
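
The dictionary above has 21 entries while the config sets `num_classes` to 41. The two numbers are consistent if the labels are expanded with BIO tagging: `OTHER` becomes a single `O` tag and every remaining class gets a `B-` and an `I-` tag. The sketch below illustrates that arithmetic; `ser_num_classes` is a hypothetical helper, and the BIO expansion is stated here as an assumption about how the labels are encoded rather than as a quotation of PaddleOCR code.

```python
def ser_num_classes(class_names):
    """Token-level class count under BIO tagging: one O tag plus B-/I- per entity."""
    # Assumption: OTHER-like names collapse into the single "O" tag.
    background = {"OTHER", "OTHERS", "IGNORE"}
    entity_classes = [
        name for name in class_names if name.strip().upper() not in background
    ]
    return 2 * len(entity_classes) + 1


class_list = [
    "OTHER", "NAME_KEY", "TOTAL_PRICE_KEY", "NO_KEY", "ADDRESS_PHONE_VALUE",
    "SERVER_VALUE", "IDENTIFY_NUMBER_KEY", "ADDRESS_PHONE_KEY", "NO_VALUE",
    "IDENTIFY_NUMBER_VALUE", "BANK_ACCOUNT_KEY", "BANK_ACCOUNT_VALUE",
    "MONEY_KEY", "SERVER_KEY", "PRICE_KEY", "SELLER_NAME_KEY",
    "SELLER_NAME_VALUE", "MONEY_VALUE", "TOTAL_PRICE_VALUE", "PRICE_VALUE",
    "NAME_VALUE",
]

num_classes = ser_num_classes(class_list)
print(len(class_list), num_classes)
```

If classes are added to or removed from `class_list.txt`, `num_classes` in the config would need to change accordingly.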

The commands I ran are as follows:

1. # SER single-GPU training

python3 tools/train.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_my_tax2_zh.yml

Training log excerpt:

[2023/06/08 20:46:06] ppocr INFO: epoch: [197/200], global_step: 1379, lr: 0.000001, loss: 0.001159, avg_reader_cost: 3.47802 s, avg_batch_cost: 3.59953 s, avg_samples: 5.2, ips: 1.44463 samples/s, eta: 0:01:46
[2023/06/08 20:46:37] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
[2023/06/08 20:46:40] ppocr INFO: epoch: [198/200], global_step: 1380, lr: 0.000001, loss: 0.001173, avg_reader_cost: 3.37963 s, avg_batch_cost: 3.39954 s, avg_samples: 0.8, ips: 0.23533 samples/s, eta: 0:01:41
[2023/06/08 20:46:44] ppocr INFO: epoch: [198/200], global_step: 1386, lr: 0.000001, loss: 0.001252, avg_reader_cost: 0.17243 s, avg_batch_cost: 0.27371 s, avg_samples: 4.4, ips: 16.07556 samples/s, eta: 0:01:10
[2023/06/08 20:47:14] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
eval model:: 100%|██████████| 2/2 [00:03<00:00, 1.84s/it]
[2023/06/08 20:47:21] ppocr INFO: cur metric, precision: 0.9781021897810219, recall: 0.9852941176470589, hmean: 0.9816849816849818, fps: 63.16115572761192
[2023/06/08 20:47:51] ppocr INFO: save best model is to ./output/ser_vi_layoutxlm_my_tax2_zh/best_accuracy
[2023/06/08 20:47:51] ppocr INFO: best metric, hmean: 0.9816849816849818, precision: 0.9781021897810219, recall: 0.9852941176470589, fps: 63.16115572761192, best_epoch: 199
[2023/06/08 20:47:52] ppocr INFO: epoch: [199/200], global_step: 1390, lr: 0.000001, loss: 0.001181, avg_reader_cost: 3.35348 s, avg_batch_cost: 3.42846 s, avg_samples: 3.2, ips: 0.93336 samples/s, eta: 0:00:50
[2023/06/08 20:47:52] ppocr INFO: epoch: [199/200], global_step: 1393, lr: 0.000001, loss: 0.001165, avg_reader_cost: 0.00004 s, avg_batch_cost: 0.04585 s, avg_samples: 2.0, ips: 43.62101 samples/s, eta: 0:00:35
[2023/06/08 20:48:22] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
[2023/06/08 20:48:29] ppocr INFO: epoch: [200/200], global_step: 1400, lr: 0.000001, loss: 0.001171, avg_reader_cost: 3.52568 s, avg_batch_cost: 3.64680 s, avg_samples: 5.2, ips: 1.42591 samples/s, eta: 0:00:00
[2023/06/08 20:48:59] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
[2023/06/08 20:48:59] ppocr INFO: best metric, hmean: 0.9816849816849818, precision: 0.9781021897810219, recall: 0.9852941176470589, fps: 63.16115572761192, best_epoch: 199
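
The `hmean` in this log is the harmonic mean of precision and recall, i.e. the F1 score. The sketch below recomputes it from the logged precision and recall; `hmean` here is a hypothetical helper written for illustration, not the PaddleOCR metric implementation.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


logged_precision = 0.9781021897810219
logged_recall = 0.9852941176470589
logged_hmean = 0.9816849816849818

value = hmean(logged_precision, logged_recall)
# Compare with a tolerance rather than exact equality: floating-point rounding
# may differ in the last digits.
print(value, abs(value - logged_hmean) < 1e-9)
```

The recomputed value agrees with the logged `hmean` to within floating-point rounding.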

2. # GPU evaluation; Architecture.Backbone.checkpoints is the weights to evaluate

python3 tools/eval.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_my_tax2_zh.yml -o Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_my_tax2_zh/best_accuracy

The log is as follows:

W0609 10:08:48.044606 33907 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.7
W0609 10:08:48.393465 33907 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
[2023/06/09 10:09:13] ppocr INFO: resume from ./output/ser_vi_layoutxlm_my_tax2_zh/best_accuracy
[2023/06/09 10:09:13] ppocr INFO: metric in ckpt
[2023/06/09 10:09:13] ppocr INFO: hmean:0.9816849816849818
[2023/06/09 10:09:13] ppocr INFO: precision:0.9781021897810219
[2023/06/09 10:09:13] ppocr INFO: recall:0.9852941176470589
[2023/06/09 10:09:13] ppocr INFO: fps:63.16115572761192
[2023/06/09 10:09:13] ppocr INFO: best_epoch:199
[2023/06/09 10:09:13] ppocr INFO: start_epoch:200
eval model:: 100%|██████████| 2/2 [00:11<00:00, 5.68s/it]
[2023/06/09 10:09:24] ppocr INFO: metric eval
[2023/06/09 10:09:24] ppocr INFO: precision:0.9781021897810219
[2023/06/09 10:09:24] ppocr INFO: recall:0.9852941176470589
[2023/06/09 10:09:24] ppocr INFO: hmean:0.9816849816849818
[2023/06/09 10:09:24] ppocr INFO: fps:1.6228751124385037

3. # Testing the information extraction results

python3 tools/infer_kie_token_ser.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_tax_zh.yml -o Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_tax_zh/best_accuracy Global.infer_img=./train_data/my_tax2/train

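One of the results is shown below: image

Five semantic entities were not recognized correctly (the areas outlined in red):

1. The invoice number key and its value were not split into two separate semantic entities (they are annotated separately in the labeled data).
2. The buyer's name was not recognized correctly.
3. The amount was not recognized.
4. The total including tax (in figures) was recognized incorrectly: the result in the image above is "(小写) ¥3495.00", but it should be "¥3495.00".
5. The seller's name was not recognized.

How should I train the model to achieve the expected results?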

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

papersuper commented 1 year ago

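The expected results of the official SER training are shown here: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/kie/README_ch.md#3-%E6%95%88%E6%9E%9C%E6%BC%94%E7%A4%BA Dataset: https://aistudio.baidu.com/aistudio/datasetdetail/125158 After downloading it, I annotated the images for the key information extraction model with PPOCRLabel, using 52 images as training data and 13 as validation data. The versions of the Paddle-related packages are as follows: image

Training annotation data (one sample):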

b1.jpg [{"transcription":"广东增","label":"OTHER","points":[[1638,387],[2004,371],[2009,482],[1643,498]],"id":4095,"linking":[]},{"transcription":"4400154130","label":"OTHER","points":[[769,447],[1425,466],[1423,559],[766,540]],"id":4096,"linking":[]},{"transcription":"12270242","label":"NO_VALUE","points":[[3092,442],[3629,442],[3629,534],[3092,534]],"id":4097,"linking":[]},{"transcription":"4400154130","label":"OTHER","points":[[3612,465],[3949,465],[3949,527],[3612,527]],"id":4098,"linking":[]},{"transcription":"12270242","label":"OTHER","points":[[3621,541],[3949,541],[3949,616],[3621,616]],"id":4099,"linking":[]},{"transcription":"开票日期:","label":"OTHER","points":[[3046,656],[3432,656],[3432,718],[3046,718]],"id":4100,"linking":[]},{"transcription":"2016年06月12日","label":"OTHER","points":[[3432,647],[3892,647],[3892,709],[3432,709]],"id":4101,"linking":[]},{"transcription":"中","label":"OTHER","points":[[26,660],[153,660],[153,727],[26,727]],"id":4102,"linking":[]},{"transcription":"名称","label":"NAME_KEY","points":[[517,820],[907,820],[907,874],[517,874]],"id":4103,"linking":[[4103,4104]]},{"transcription":"深圳市购机汇网络有限公司","label":"NAME_VALUE","points":[[1021,820],[1745,820],[1745,882],[1021,882]],"id":4104,"linking":[]},{"transcription":"6/3-02848_6</_371/>>7137+","label":"OTHER","points":[[2599,816],[3831,806],[3831,882],[2600,891]],"id":4105,"linking":[]},{"transcription":"密","label":"OTHER","points":[[2477,838],[2516,838],[2516,882],[2477,882]],"id":4106,"linking":[]},{"transcription":"购","label":"OTHER","points":[[394,869],[438,869],[438,917],[394,917]],"id":4107,"linking":[]},{"transcription":"纳税人识别号:","label":"IDENTIFY_NUMBERKEY","points":[[510,895],[939,905],[937,980],[508,970]],"id":4108,"linking":[[4108,4109]]},{"transcription":"<332/4845/-2714_8959768","label":"OTHER","points":[[2630,895],[3831,886],[3831,948],[2630,957]],"id":4110,"linking":[]},{"transcription":"440300083885931","label":"IDENTIFY_NUMBER_VALUE","points":[[1074,913],[1858,908],[1859,971],[107
4,975]],"id":4109,"linking":[]},{"transcription":"买","label":"OTHER","points":[[371,971],[423,948],[453,1014],[401,1038]],"id":4111,"linking":[]},{"transcription":"码","label":"OTHER","points":[[2477,962],[2525,962],[2525,1002],[2477,1002]],"id":4112,"linking":[]},{"transcription":"/>*0497-4/<377816+5+761/--5","label":"OTHER","points":[[2608,962],[3844,948],[3844,1024],[2609,1037]],"id":4113,"linking":[]},{"transcription":"地址、电话:","label":"ADDRESS_PHONE_KEY","points":[[505,988],[939,998],[937,1073],[503,1063]],"id":4114,"linking":[[4114,4115]]},{"transcription":"深圳市龙华新区民治街道民治大道酒科技大厦A12070755-23806606","label":"ADDRESS_PHONE_VALUE","points":[[1012,1006],[2362,1002],[2363,1064],[1013,1068]],"id":4115,"linking":[]},{"transcription":"税","label":"OTHER","points":[[259,1033],[320,1033],[320,1104],[259,1104]],"id":4116,"linking":[]},{"transcription":"127<8*32/4+45<4933///8>48","label":"OTHER","points":[[2603,1037],[3844,1023],[3844,1099],[2604,1113]],"id":4117,"linking":[]},{"transcription":"方","label":"OTHER","points":[[381,1064],[438,1064],[438,1130],[381,1130]],"id":4118,"linking":[]},{"transcription":"开户行及账号:","label":"BANK_ACCOUNT_KEY","points":[[505,1085],[934,1095],[933,1171],[503,1161]],"id":4119,"linking":[[4119,4120]]},{"transcription":"中国工商银行股份有限公司深圳园岭支行4000024709200172809","label":"BANK_ACCOUNT_VALUE","points":[[1012,1104],[2428,1095],[2429,1157],[1013,1166]],"id":4120,"linking":[]},{"transcription":"税率","label":"OTHER","points":[[3301,1174],[3450,1174],[3450,1259],[3301,1259]],"id":4121,"linking":[]},{"transcription":"第二","label":"OTHER","points":[[4020,1166],[4072,1166],[4072,1281],[4020,1281]],"id":4122,"linking":[]},{"transcription":"数量","label":"OTHER","points":[[2091,1188],[2271,1188],[2271,1268],[2091,1268]],"id":4123,"linking":[]},{"transcription":"单价","label":"PRICE_KEY","points":[[2459,1183],[2643,1183],[2643,1263],[2459,1263]],"id":4124,"linking":[[4124,4125]]},{"transcription":"金额","label":"MONEY_KEY","points":[[2884,1183],[3122,1183],[3122,1252],[28
84,1252]],"id":4126,"linking":[[4126,4127]]},{"transcription":"税额","label":"OTHER","points":[[3607,1179],[3866,1179],[3866,1254],[3607,1254]],"id":4128,"linking":[]},{"transcription":"货物或应税劳务、服务名称","label":"SERVER_KEY","points":[[456,1197],[1214,1197],[1214,1272],[456,1272]],"id":4129,"linking":[[4129,4132],[0,0],[0,0]]},{"transcription":"规格型号","label":"OTHER","points":[[1411,1201],[1679,1201],[1679,1263],[1411,1263]],"id":4133,"linking":[]},{"transcription":"单位","label":"OTHER","points":[[1823,1197],[1955,1197],[1955,1268],[1823,1268]],"id":4134,"linking":[]},{"transcription":"2987.18","label":"MONEY_VALUE","points":[[3068,1272],[3282,1267],[3284,1334],[3069,1339]],"id":4127,"linking":[]},{"transcription":"%21","label":"OTHER","points":[[3351,1270],[3457,1257],[3465,1327],[3360,1340]],"id":4135,"linking":[]},{"transcription":"507.82","label":"OTHER","points":[[3804,1269],[3991,1257],[3994,1324],[3807,1335]],"id":4136,"linking":[]},{"transcription":"[20","label":"OTHER","points":[[254,1281],[320,1281],[320,1392],[254,1392]],"id":4137,"linking":[]},{"transcription":"小米 红米3全网通版 时尚金色","label":"SERVER_VALUE","points":[[434,1290],[1262,1290],[1262,1352],[434,1352]],"id":4130,"linking":[]},{"transcription":"红米3","label":"OTHER","points":[[1346,1290],[1512,1290],[1512,1356],[1346,1356]],"id":4138,"linking":[]},{"transcription":"个","label":"OTHER","points":[[1885,1285],[1946,1285],[1946,1347],[1885,1347]],"id":4139,"linking":[]},{"transcription":"597.43589744","label":"PRICE_VALUE","points":[[2433,1281],[2740,1281],[2740,1343],[2433,1343]],"id":4125,"linking":[]},{"transcription":"移动联通电信4G手机 
双卡双待","label":"SERVER_VALUE","points":[[430,1361],[1201,1361],[1201,1423],[430,1423]],"id":4131,"linking":[]},{"transcription":"15]","label":"OTHER","points":[[250,1378],[320,1378],[320,1489],[250,1489]],"id":4140,"linking":[]},{"transcription":"-1776.07","label":"OTHER","points":[[3051,1414],[3283,1414],[3283,1489],[3051,1489]],"id":4141,"linking":[]},{"transcription":"17%","label":"OTHER","points":[[3360,1416],[3461,1403],[3470,1473],[3369,1487]],"id":4142,"linking":[]},{"transcription":"301.93","label":"OTHER","points":[[3809,1414],[4011,1414],[4011,1476],[3809,1476]],"id":4143,"linking":[]},{"transcription":"抵扣联","label":"OTHER","points":[[4008,1409],[4088,1403],[4105,1622],[4025,1628]],"id":4144,"linking":[]},{"transcription":"折扣(59.456%)","label":"SERVER_VALUE","points":[[425,1445],[855,1445],[855,1507],[425,1507]],"id":4132,"linking":[]},{"transcription":"600","label":"OTHER","points":[[250,1489],[316,1489],[316,1649],[250,1649]],"id":4145,"linking":[]},{"transcription":"购买方扣税凭证","label":"OTHER","points":[[4023,1620],[4102,1617],[4121,2090],[4042,2093]],"id":4146,"linking":[]},{"transcription":"海南","label":"OTHER","points":[[241,1706],[320,1706],[320,1875],[241,1875]],"id":4147,"linking":[]},{"transcription":"C","label":"OTHER","points":[[53,1720],[118,1720],[118,1782],[53,1782]],"id":4148,"linking":[]},{"transcription":"华","label":"OTHER","points":[[254,1857],[311,1857],[311,1937],[254,1937]],"id":4149,"linking":[]},{"transcription":"¥205.89","label":"OTHER","points":[[3703,1871],[4021,1856],[4025,1936],[3707,1951]],"id":4150,"linking":[]},{"transcription":"¥1211.11","label":"OTHER","points":[[2950,1884],[3292,1884],[3292,1946],[2950,1946]],"id":4151,"linking":[]},{"transcription":"森","label":"OTHER","points":[[245,1915],[316,1915],[316,2008],[245,2008]],"id":4152,"linking":[]},{"transcription":"计","label":"OTHER","points":[[960,1910],[1039,1910],[1039,1985],[960,1985]],"id":4153,"linking":[]},{"transcription":"合","label":"OTHER","points":[[618,1928],[666,19
28],[666,1972],[618,1972]],"id":4154,"linking":[]},{"transcription":"C","label":"OTHER","points":[[48,1959],[118,1959],[118,2016],[48,2016]],"id":4155,"linking":[]},{"transcription":"实","label":"OTHER","points":[[250,1999],[311,1999],[311,2074],[250,2074]],"id":4156,"linking":[]},{"transcription":"¥1417.00","label":"TOTAL_PRICE_VALUE","points":[[3318,1999],[3733,1989],[3735,2065],[3319,2075]],"id":4157,"linking":[]},{"transcription":"壹仟肆佰壹拾柒圆整","label":"OTHER","points":[[1411,2017],[2121,2007],[2122,2083],[1412,2092]],"id":4158,"linking":[]},{"transcription":"(小写)","label":"OTHER","points":[[3029,2012],[3244,2012],[3244,2079],[3029,2079]],"id":4159,"linking":[]},{"transcription":"价税合计(大写)","label":"TOTAL_PRICE_KEY","points":[[544,2039],[1078,2039],[1078,2114],[544,2114]],"id":4160,"linking":[[4160,4157]]},{"transcription":"卡","label":"OTHER","points":[[250,2083],[307,2083],[307,2163],[250,2163]],"id":4161,"linking":[]},{"transcription":"公","label":"OTHER","points":[[250,2149],[307,2149],[307,2234],[250,2234]],"id":4162,"linking":[]},{"transcription":"dd42981320128(00001,1956","label":"OTHER","points":[[2617,2150],[3340,2145],[3340,2207],[2617,2212]],"id":4163,"linking":[]},{"transcription":"广州晶东贸易有限公司","label":"SELLER_NAME_VALUE","points":[[990,2168],[1625,2149],[1627,2224],[992,2243]],"id":4164,"linking":[]},{"transcription":"名称","label":"SELLER_NAME_KEY","points":[[487,2180],[881,2180],[881,2248],[487,2248]],"id":4165,"linking":[[4165,4164]]},{"transcription":"销","label":"OTHER","points":[[373,2211],[425,2211],[425,2282],[373,2282]],"id":4166,"linking":[]},{"transcription":"91440101664041243T","label":"OTHER","points":[[1074,2252],[2051,2238],[2052,2313],[1075,2327]],"id":4167,"linking":[]},{"transcription":"纳税人识别号:","label":"OTHER","points":[[477,2265],[915,2255],[917,2331],[479,2341]],"id":4168,"linking":[]},{"transcription":"售","label":"OTHER","points":[[364,2313],[421,2313],[421,2393],[364,2393]],"id":4169,"linking":[]},{"transcription":"地址电话:","label":"OTHER"
,"points":[[478,2344],[916,2344],[916,2420],[478,2420]],"id":4170,"linking":[]},{"transcription":"广州市黄埔区九龙镇九龙工业园凤凰三横路99号 66215500","label":"OTHER","points":[[990,2340],[2336,2326],[2337,2402],[991,2416]],"id":4171,"linking":[]},{"transcription":"注","label":"OTHER","points":[[2503,2389],[2542,2389],[2542,2433],[2503,2433]],"id":4172,"linking":[]},{"transcription":"方","label":"OTHER","points":[[359,2433],[412,2433],[412,2495],[359,2495]],"id":4173,"linking":[]},{"transcription":"开户行及账号:","label":"OTHER","points":[[478,2446],[907,2446],[907,2508],[478,2508]],"id":4174,"linking":[]},{"transcription":"工行北京路支行3602000919200384952","label":"OTHER","points":[[990,2438],[2112,2419],[2113,2495],[991,2513]],"id":4175,"linking":[]},{"transcription":"开票人:陈秋兼","label":"OTHER","points":[[2238,2523],[2780,2494],[2785,2587],[2243,2616]],"id":4176,"linking":[]},{"transcription":"复核:张雪","label":"OTHER","points":[[1427,2537],[1815,2511],[1821,2600],[1433,2625]],"id":4177,"linking":[]},{"transcription":"发票专用章","label":"OTHER","points":[[3372,2529],[3730,2439],[3758,2550],[3399,2641]],"id":4178,"linking":[]},{"transcription":"收款人:王梅","label":"OTHER","points":[[371,2553],[817,2534],[821,2614],[375,2634]],"id":4179,"linking":[]},{"transcription":"万","label":"OTHER","points":[[3292,2548],[3358,2548],[3358,2593],[3292,2593]],"id":4180,"linking":[]},{"transcription":"(5)","label":"OTHER","points":[[3531,2603],[3655,2591],[3662,2662],[3538,2674]],"id":4181,"linking":[]},{"transcription":"No","label":"NO_KEY","points":[[2911,426],[3037,426],[3037,552],[2911,552]],"id":4182,"linking":[[4182,4097]]}]

The training configuration file is as follows:

Global:
  use_gpu: True
  epoch_num: &epoch_num 200
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/ser_vi_layoutxlm_my_tax2_zh
  save_epoch_step: 2000
  # evaluation is run every 19 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: ppstructure/docs/kie/input/zh_val_42.jpg
  d2s_train_image_shape: [3, 224, 224]
  # if you want to predict using the groundtruth ocr info,
  # you can use the following config
  # infer_img: train_data/XFUND/zh_val/val.json
  # infer_mode: False

  save_res_path: ./output/ser/my_tax2_zh/res
  kie_rec_model_dir: 
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForSer
    pretrained: True
    checkpoints:
    # one of base or vi
    mode: vi
    num_classes: &num_classes 41

Loss:
  name: VQASerTokenLayoutLMLoss
  num_classes: *num_classes
  key: "backbone_out"

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Linear
    learning_rate: 0.00005
    epochs: *epoch_num
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQASerTokenLayoutLMPostProcess
  class_path: &class_path train_data/my_tax2/class_list.txt

Metric:
  name: VQASerTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/my_tax2/train
    label_file_list: 
      - train_data/my_tax2/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: &use_textline_bbox_info True
          # one of [None, "tb-yx"]
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 8
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/my_tax2/val
    label_file_list:
      - train_data/my_tax2/val.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 8
    num_workers: 4
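
As a sanity check on the schedule, the numbers in this config can be worked through by hand. The sketch below assumes the 52 training images mentioned in the issue description, single-card training, and that `drop_last: False` keeps the final partial batch. It is plain arithmetic, not PaddleOCR code.

```python
import math

num_train_images = 52      # from the issue description
batch_size_per_card = 8    # Train.loader.batch_size_per_card
epoch_num = 200            # Global.epoch_num
eval_interval = 19         # second value of Global.eval_batch_step

# drop_last is False, so the last partial batch still counts as a step.
steps_per_epoch = math.ceil(num_train_images / batch_size_per_card)
total_steps = steps_per_epoch * epoch_num
# Rough count of evaluations, assuming one every eval_interval steps.
approx_evals = total_steps // eval_interval

print(steps_per_epoch, total_steps, approx_evals)
```

Under these assumptions the total comes to 1400 steps, which agrees with the final `global_step: 1400` in the training log.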

Training dictionary data:

OTHER
NAME_KEY
TOTAL_PRICE_KEY
NO_KEY
ADDRESS_PHONE_VALUE
SERVER_VALUE
IDENTIFY_NUMBER_KEY
ADDRESS_PHONE_KEY
NO_VALUE
IDENTIFY_NUMBER_VALUE
BANK_ACCOUNT_KEY
BANK_ACCOUNT_VALUE
MONEY_KEY
SERVER_KEY
PRICE_KEY
SELLER_NAME_KEY
SELLER_NAME_VALUE
MONEY_VALUE
TOTAL_PRICE_VALUE
PRICE_VALUE
NAME_VALUE
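
The `num_classes: 41` in the config can be cross-checked against this dictionary. This is a minimal sketch, assuming the SER head uses BIO tagging, so that every class except `OTHER` gets a `B-` and an `I-` tag and `OTHER` maps to a single `O` tag. The function name is hypothetical.

```python
def expected_num_classes(class_names):
    """Number of BIO tags for a class list that includes OTHER (assumed scheme)."""
    entity_classes = [c for c in class_names if c.upper() != "OTHER"]
    # Two tags (B-, I-) per entity class, plus one O tag for OTHER.
    return 2 * len(entity_classes) + 1


class_list = """
OTHER
NAME_KEY
TOTAL_PRICE_KEY
NO_KEY
ADDRESS_PHONE_VALUE
SERVER_VALUE
IDENTIFY_NUMBER_KEY
ADDRESS_PHONE_KEY
NO_VALUE
IDENTIFY_NUMBER_VALUE
BANK_ACCOUNT_KEY
BANK_ACCOUNT_VALUE
MONEY_KEY
SERVER_KEY
PRICE_KEY
SELLER_NAME_KEY
SELLER_NAME_VALUE
MONEY_VALUE
TOTAL_PRICE_VALUE
PRICE_VALUE
NAME_VALUE
""".split()

print(len(class_list), expected_num_classes(class_list))
```

With the 21 entries listed here this gives 41, so the class count itself does not look like the source of the problem.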

The training commands are as follows:

1. SER single-card training

python3 tools/train.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_my_tax2_zh.yml

Training log excerpt:

[2023/06/08 20:46:06] ppocr INFO: epoch: [197/200], global_step: 1379, lr: 0.000001, loss: 0.001159, avg_reader_cost: 3.47802 s, avg_batch_cost: 3.59953 s, avg_samples: 5.2, ips: 1.44463 samples/s, eta: 0:01:46
[2023/06/08 20:46:37] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
[2023/06/08 20:46:40] ppocr INFO: epoch: [198/200], global_step: 1380, lr: 0.000001, loss: 0.001173, avg_reader_cost: 3.37963 s, avg_batch_cost: 3.39954 s, avg_samples: 0.8, ips: 0.23533 samples/s, eta: 0:01:41
[2023/06/08 20:46:44] ppocr INFO: epoch: [198/200], global_step: 1386, lr: 0.000001, loss: 0.001252, avg_reader_cost: 0.17243 s, avg_batch_cost: 0.27371 s, avg_samples: 4.4, ips: 16.07556 samples/s, eta: 0:01:10
[2023/06/08 20:47:14] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
eval model:: 100%|██████████| 2/2 [00:03<00:00, 1.84s/it]
[2023/06/08 20:47:21] ppocr INFO: cur metric, precision: 0.9781021897810219, recall: 0.9852941176470589, hmean: 0.9816849816849818, fps: 63.16115572761192
[2023/06/08 20:47:51] ppocr INFO: save best model is to ./output/ser_vi_layoutxlm_my_tax2_zh/best_accuracy
[2023/06/08 20:47:51] ppocr INFO: best metric, hmean: 0.9816849816849818, precision: 0.9781021897810219, recall: 0.9852941176470589, fps: 63.16115572761192, best_epoch: 199
[2023/06/08 20:47:52] ppocr INFO: epoch: [199/200], global_step: 1390, lr: 0.000001, loss: 0.001181, avg_reader_cost: 3.35348 s, avg_batch_cost: 3.42846 s, avg_samples: 3.2, ips: 0.93336 samples/s, eta: 0:00:50
[2023/06/08 20:47:52] ppocr INFO: epoch: [199/200], global_step: 1393, lr: 0.000001, loss: 0.001165, avg_reader_cost: 0.00004 s, avg_batch_cost: 0.04585 s, avg_samples: 2.0, ips: 43.62101 samples/s, eta: 0:00:35
[2023/06/08 20:48:22] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
[2023/06/08 20:48:29] ppocr INFO: epoch: [200/200], global_step: 1400, lr: 0.000001, loss: 0.001171, avg_reader_cost: 3.52568 s, avg_batch_cost: 3.64680 s, avg_samples: 5.2, ips: 1.42591 samples/s, eta: 0:00:00
[2023/06/08 20:48:59] ppocr INFO: save model in ./output/ser_vi_layoutxlm_my_tax2_zh/latest
[2023/06/08 20:48:59] ppocr INFO: best metric, hmean: 0.9816849816849818, precision: 0.9781021897810219, recall: 0.9852941176470589, fps: 63.16115572761192, best_epoch: 199
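
The reported hmean is consistent with the harmonic mean of precision and recall, which can be checked directly from the logged values. This is a quick sketch, and the variable names are mine:

```python
# Values copied from the "best metric" line of the training log.
precision = 0.9781021897810219
recall = 0.9852941176470589
logged_hmean = 0.9816849816849818

# Harmonic mean of precision and recall (the F1 score).
hmean = 2 * precision * recall / (precision + recall)
print(abs(hmean - logged_hmean) < 1e-9)
```

This only confirms the arithmetic. The score itself is measured on the 13-image validation split.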

2. GPU evaluation; Architecture.Backbone.checkpoints is the checkpoint to evaluate

 python3 tools/eval.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_my_tax2_zh.yml -o Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_my_tax2_zh/best_accuracy

The log is as follows:

W0609 10:08:48.044606 33907 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.7
W0609 10:08:48.393465 33907 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
[2023/06/09 10:09:13] ppocr INFO: resume from ./output/ser_vi_layoutxlm_my_tax2_zh/best_accuracy
[2023/06/09 10:09:13] ppocr INFO: metric in ckpt
[2023/06/09 10:09:13] ppocr INFO: hmean:0.9816849816849818
[2023/06/09 10:09:13] ppocr INFO: precision:0.9781021897810219
[2023/06/09 10:09:13] ppocr INFO: recall:0.9852941176470589
[2023/06/09 10:09:13] ppocr INFO: fps:63.16115572761192
[2023/06/09 10:09:13] ppocr INFO: best_epoch:199
[2023/06/09 10:09:13] ppocr INFO: start_epoch:200
eval model:: 100%|██████████| 2/2 [00:11<00:00, 5.68s/it]
[2023/06/09 10:09:24] ppocr INFO: metric eval
[2023/06/09 10:09:24] ppocr INFO: precision:0.9781021897810219
[2023/06/09 10:09:24] ppocr INFO: recall:0.9852941176470589
[2023/06/09 10:09:24] ppocr INFO: hmean:0.9816849816849818
[2023/06/09 10:09:24] ppocr INFO: fps:1.6228751124385037

3. Test the information extraction results

python3 tools/infer_kie_token_ser.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_tax_zh.yml -o Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_tax_zh/best_accuracy Global.infer_img=./train_data/my_tax2/train

One of the results is shown in the figure below: [image]

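Five semantic entities were not recognized correctly (the regions boxed in red):

1. The invoice number key and its value were not split into two separate semantic entities (they are labeled separately in the annotations).
2. The buyer's name was not recognized correctly.
3. The amount was not detected.
4. The total price including tax (in figures) was recognized incorrectly: the result in the figure above is "(小写) ¥3495.00", but it should be "¥3495.00".
5. The seller's name was not detected.

How should I train the model to reach the expected results?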

Has this problem been solved?

kerry-weic commented 1 year ago

@papersuper I think this can be solved by adding more training data. I only used the invoice data for testing.

papersuper commented 1 year ago

OK, thank you very much.

xxllp commented 1 year ago

Is there a good approach for small-sample cases like this?

papersuper commented 1 year ago

You mean the number of samples is small, right? You could use some data augmentation, or collect more data.
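
As one illustration of what label-level augmentation could look like, here is a rough sketch that shifts each annotated box by a small random offset to produce extra annotation variants. The function name, the image size, and the shift range are all assumptions of mine, and whether this kind of jitter actually helps on this dataset is untested.

```python
import copy
import random


def jitter_boxes(entries, max_shift, width, height, seed=None):
    """Return a deep copy of `entries` with each box shifted by a small random offset.

    All four points of a box move together, so its shape is preserved
    unless clipping to the image bounds applies.
    """
    rng = random.Random(seed)
    out = copy.deepcopy(entries)
    for entry in out:
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        entry["points"] = [
            [min(max(x + dx, 0), width - 1), min(max(y + dy, 0), height - 1)]
            for x, y in entry["points"]
        ]
    return out


# One entry copied from the annotation in this issue; the image size is a guess.
sample = [{
    "transcription": "¥1417.00",
    "label": "TOTAL_PRICE_VALUE",
    "points": [[3318, 1999], [3733, 1989], [3735, 2065], [3319, 2075]],
    "id": 4157,
    "linking": [],
}]
augmented = jitter_boxes(sample, max_shift=5, width=4000, height=3000, seed=0)
print(augmented[0]["points"])
```

Labels, ids, and links are left untouched, so the augmented entries can be written back in the same format as the original annotation.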

UserWangZz commented 4 months ago

This issue has not been updated for a long time, so it is being closed for now. It can be reopened if needed.