Defined the training data following the example, but still get root ERROR: No Images in train dataset

julian7862 commented 3 years ago

[2021/08/27 15:46:18] root INFO: Architecture :
[2021/08/27 15:46:18] root INFO:     Backbone :
[2021/08/27 15:46:18] root INFO:         model_name : large
[2021/08/27 15:46:18] root INFO:         name : MobileNetV3
[2021/08/27 15:46:18] root INFO:         scale : 0.5
[2021/08/27 15:46:18] root INFO:     Head :
[2021/08/27 15:46:18] root INFO:         fc_decay : 0.0004
[2021/08/27 15:46:18] root INFO:         name : CTCHead
[2021/08/27 15:46:18] root INFO:     Neck :
[2021/08/27 15:46:18] root INFO:         encoder_type : rnn
[2021/08/27 15:46:18] root INFO:         hidden_size : 96
[2021/08/27 15:46:18] root INFO:         name : SequenceEncoder
[2021/08/27 15:46:18] root INFO:     Transform :
[2021/08/27 15:46:18] root INFO:         loc_lr : 0.1
[2021/08/27 15:46:18] root INFO:         model_name : small
[2021/08/27 15:46:18] root INFO:         name : TPS
[2021/08/27 15:46:18] root INFO:         num_fiducial : 20
[2021/08/27 15:46:18] root INFO:     algorithm : STARNet
[2021/08/27 15:46:18] root INFO:     model_type : rec
[2021/08/27 15:46:18] root INFO: Eval :
[2021/08/27 15:46:18] root INFO:     dataset :
[2021/08/27 15:46:18] root INFO:         data_dir : ./train_data/
[2021/08/27 15:46:18] root INFO:         label_file_list : ['./train_data/train_sample.txt']
[2021/08/27 15:46:18] root INFO:         name : SimpleDataSet
[2021/08/27 15:46:18] root INFO:         transforms :
[2021/08/27 15:46:18] root INFO:             DecodeImage :
[2021/08/27 15:46:18] root INFO:                 channel_first : False
[2021/08/27 15:46:18] root INFO:                 img_mode : BGR
[2021/08/27 15:46:18] root INFO:             CTCLabelEncode : None
[2021/08/27 15:46:18] root INFO:             RecResizeImg :
[2021/08/27 15:46:18] root INFO:                 image_shape : [3, 32, 320]
[2021/08/27 15:46:18] root INFO:             KeepKeys :
[2021/08/27 15:46:18] root INFO:                 keep_keys : ['image', 'label', 'length']
[2021/08/27 15:46:18] root INFO:     loader :
[2021/08/27 15:46:18] root INFO:         batch_size_per_card : 256
[2021/08/27 15:46:18] root INFO:         drop_last : False
[2021/08/27 15:46:18] root INFO:         num_workers : 4
[2021/08/27 15:46:18] root INFO:         shuffle : True
[2021/08/27 15:46:18] root INFO: Global :
[2021/08/27 15:46:18] root INFO:     cal_metric_during_train : True
[2021/08/27 15:46:18] root INFO:     character_dict_path : ppocr/utils/dict/chinese_cht_dict.txt
[2021/08/27 15:46:18] root INFO:     character_type : ch
[2021/08/27 15:46:18] root INFO:     checkpoints : None
[2021/08/27 15:46:18] root INFO:     debug : False
[2021/08/27 15:46:18] root INFO:     distributed : False
[2021/08/27 15:46:18] root INFO:     epoch_num : 72
[2021/08/27 15:46:18] root INFO:     eval_batch_step : [0, 2000]
[2021/08/27 15:46:18] root INFO:     infer_img : doc/imgs_words_en/word_10.png
[2021/08/27 15:46:18] root INFO:     infer_mode : False
[2021/08/27 15:46:18] root INFO:     log_smooth_window : 20
[2021/08/27 15:46:18] root INFO:     max_text_length : 25
[2021/08/27 15:46:18] root INFO:     pretrained_model : None
[2021/08/27 15:46:18] root INFO:     print_batch_step : 10
[2021/08/27 15:46:18] root INFO:     save_epoch_step : 3
[2021/08/27 15:46:18] root INFO:     save_inference_dir : None
[2021/08/27 15:46:18] root INFO:     save_model_dir : ./output/rec/mv3_tps_bilstm_ctc/
[2021/08/27 15:46:18] root INFO:     save_res_path : ./output/rec/predicts_mv3_tps_bilstm_ctc.txt
[2021/08/27 15:46:18] root INFO:     use_gpu : True
[2021/08/27 15:46:18] root INFO:     use_space_char : False
[2021/08/27 15:46:18] root INFO:     use_visualdl : False
[2021/08/27 15:46:18] root INFO: Loss :
[2021/08/27 15:46:18] root INFO:     name : CTCLoss
[2021/08/27 15:46:18] root INFO: Metric :
[2021/08/27 15:46:18] root INFO:     main_indicator : acc
[2021/08/27 15:46:18] root INFO:     name : RecMetric
[2021/08/27 15:46:18] root INFO: Optimizer :
[2021/08/27 15:46:18] root INFO:     beta1 : 0.9
[2021/08/27 15:46:18] root INFO:     beta2 : 0.999
[2021/08/27 15:46:18] root INFO:     lr :
[2021/08/27 15:46:18] root INFO:         learning_rate : 0.0005
[2021/08/27 15:46:18] root INFO:     name : Adam
[2021/08/27 15:46:18] root INFO:     regularizer :
[2021/08/27 15:46:18] root INFO:         factor : 0
[2021/08/27 15:46:18] root INFO:         name : L2
[2021/08/27 15:46:18] root INFO: PostProcess :
[2021/08/27 15:46:18] root INFO:     name : CTCLabelDecode
[2021/08/27 15:46:18] root INFO: Train :
[2021/08/27 15:46:18] root INFO:     dataset :
[2021/08/27 15:46:18] root INFO:         data_dir : ./train_data/
[2021/08/27 15:46:18] root INFO:         label_file_list : ['./train_data/train_sample.txt']
[2021/08/27 15:46:18] root INFO:         name : SimpleDataSet
[2021/08/27 15:46:18] root INFO:         transforms :
[2021/08/27 15:46:18] root INFO:             DecodeImage :
[2021/08/27 15:46:18] root INFO:                 channel_first : False
[2021/08/27 15:46:18] root INFO:                 img_mode : BGR
[2021/08/27 15:46:18] root INFO:             CTCLabelEncode : None
[2021/08/27 15:46:18] root INFO:             RecResizeImg :
[2021/08/27 15:46:18] root INFO:                 image_shape : [3, 32, 320]
[2021/08/27 15:46:18] root INFO:             KeepKeys :
[2021/08/27 15:46:18] root INFO:                 keep_keys : ['image', 'label', 'length']
[2021/08/27 15:46:18] root INFO:     loader :
[2021/08/27 15:46:18] root INFO:         batch_size_per_card : 256
[2021/08/27 15:46:18] root INFO:         drop_last : True
[2021/08/27 15:46:18] root INFO:         num_workers : 8
[2021/08/27 15:46:18] root INFO:         shuffle : True
[2021/08/27 15:46:18] root INFO: train with paddle 2.1.2 and device CUDAPlace(3)
[2021/08/27 15:46:18] root INFO: Initialize indexs of datasets:['./train_data/train_sample.txt']
root ERROR: No Images in train dataset, please ensure

1. The images num in the train label_file_list should be larger than or equal with batch size.
2. The annotation file and path in the configuration file are provided normally.

Since I first wanted to test whether training runs at all, I placed only one image and one label file, laid out as follows:

paddleOCR
-----train_data
----------train.png
----------train_sample.txt

The yml parameters are:

name: SimpleDataSet
data_dir: ./train_data/
label_file_list: ["./train_data/train_sample.txt"]

train_sample.txt contains: train.png \t train
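
One thing worth ruling out with a label line like the one quoted above is whether the separator is a real tab character or the two literal characters `\` and `t`, and whether the spaces around it end up in the path. The sketch below only splits a line on a real tab; `parse_label_line` is a hypothetical helper written for this thread, not a PaddleOCR function, and it assumes the tab-separated `image_path<TAB>label` layout used in the recognition examples.

```python
from typing import Optional, Tuple


def parse_label_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one label-file line into (image_path, label).

    Returns None when the line does not contain exactly one real tab.
    Hypothetical helper for illustration; not part of PaddleOCR.
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 2:
        return None
    image_path, label = parts
    # Return the raw fields so stray spaces around the tab stay visible.
    return image_path, label


# A real tab separates the two fields cleanly.
print(parse_label_line("train.png\ttrain"))
# A literal backslash followed by "t" is not a tab, so nothing is parsed.
print(parse_label_line("train.png \\t train"))
# A real tab with spaces around it leaves the spaces inside both fields.
print(parse_label_line("train.png \t train"))
```

If the second or third case matches the actual file, rewriting the line with a single real tab and no surrounding spaces is a cheap thing to try.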

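I have been searching for a long time and still cannot find the mistake. I have tried many variations based on the examples. Is there a version that is currently confirmed to train?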

littletomatodonkey commented 3 years ago

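See the first point of the error message: the amount of data is smaller than the batch size, so there is not enough data to form even one batch.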
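
To make the arithmetic concrete: the Train loader in the log above uses `batch_size_per_card : 256` with `drop_last : True`, while the dataset holds a single image. The sketch below shows how many batches that yields under the usual meaning of `drop_last`; `num_batches` is an illustrative helper, not a PaddleOCR or Paddle API.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Number of batches one pass over the data yields (illustrative only)."""
    if drop_last:
        # An incomplete final batch is discarded.
        return num_samples // batch_size
    # An incomplete final batch is kept.
    return math.ceil(num_samples / batch_size)


print(num_batches(1, 256, drop_last=True))   # one image, batch size 256
print(num_batches(1, 256, drop_last=False))  # same data, partial batch kept
print(num_batches(1, 1, drop_last=True))     # batch size lowered to 1
```

With one image and the settings from the log, the first call gives zero batches, which is consistent with point 1 of the error. Lowering `batch_size_per_card` or adding images would be the first things to try.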

paddle-bot-old[bot] commented 2 years ago

Since you haven't replied for more than 3 months, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.

Rishav-hub commented 2 years ago

Facing the same issue.

Data folder structure -:

YAML FILE -:

Global:
  use_gpu: true
  epoch_num: 5
  log_smooth_window: 2
  print_batch_step: 2
  save_model_dir: ./output/sast_r50_vd_ic15/
  save_epoch_step: 2
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [4000, 5000]
  cal_metric_during_train: False
  pretrained_model: /content/PaddleOCR/pretrain_models/ResNet50_vd_ssld_pretrained.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img:
  save_res_path: ./output/sast_r50_vd_ic15/predicts_sast.txt

Architecture:
  model_type: det
  algorithm: SAST
  Transform:
  Backbone:
    name: ResNet_SAST
    layers: 50
  Neck:
    name: SASTFPN
    with_cab: True
  Head:
    name: SASTHead

Loss:
  name: SASTLoss

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
  #  name: Cosine
    learning_rate: 0.001
  #  warmup_epoch: 0
  regularizer:
    name: 'L2'
    factor: 0

PostProcess:
  name: SASTPostProcess
  score_thresh: 0.5
  sample_pts_num: 2
  nms_thresh: 0.2
  expand_scale: 1.0
  shrink_ratio_of_width: 0.3

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /content/PaddleOCR/train_data/train
    label_file_list: ["./train_data/train_label.txt"]
    ratio_list: [0.1]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - SASTProcessTrain:
          image_shape: [512, 512]
          min_crop_side_ratio: 0.3
          min_crop_size: 24
          min_text_size: 4
          max_text_size: 512
      - KeepKeys:
          keep_keys: ['image', 'score_map', 'border_map', 'training_mask', 'tvo_map', 'tco_map'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 1
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /content/PaddleOCR/train_data/test
    label_file_list:
      - ./train_data/test_label.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
          resize_long: 1536
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 2
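
For a config like the one above, both points of the error message can be checked before launching training. The sketch below is a rough pre-flight check under two loudly stated assumptions that this thread does not confirm: that `ratio_list` keeps `round(n * ratio)` of the label lines, and that each image path in the label file is resolved relative to `data_dir`. `kept_after_ratio` and `missing_images` are hypothetical helpers, not PaddleOCR functions.

```python
import os
from typing import List


def kept_after_ratio(num_lines: int, ratio: float) -> int:
    """Lines left after subsampling.

    ASSUMPTION: the loader keeps round(num_lines * ratio) lines.
    """
    return round(num_lines * ratio)


def missing_images(data_dir: str, label_lines: List[str]) -> List[str]:
    """Return the resolved image paths that do not exist on disk.

    ASSUMPTION: the first tab-separated field is an image path
    relative to data_dir.
    """
    missing = []
    for line in label_lines:
        if not line.strip():
            continue  # skip blank lines
        image_path = line.rstrip("\n").split("\t")[0]
        full_path = os.path.join(data_dir, image_path)
        if not os.path.exists(full_path):
            missing.append(full_path)
    return missing


# With ratio 0.1, a very small label file can shrink to zero samples.
print(kept_after_ratio(4, 0.1))
print(kept_after_ratio(100, 0.1))
```

If the first number is zero for the actual label file, setting `ratio_list: [1.0]` is worth trying; if `missing_images` returns paths, the combination of the absolute `data_dir` and the relative paths in the label file deserves a second look.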

guoxiaoyue111111 commented 1 year ago

Hello, has your problem been solved?

vtmjapandev commented 10 months ago

Hello, has your problem been solved?

PaddlePaddle / PaddleOCR #3835