stefanklut / laypa

Layout analysis to find layout elements in documents (similar to P2PaLA)
MIT License

Training both regions and baselines + different region types from eScriptorium #30

Closed: fattynoparents closed this issue 6 months ago

fattynoparents commented 7 months ago

I'm back to trying to train a segmentation model based on the one that already exists. As I wrote in my previous issue, this is the config I use: https://surfdrive.surf.nl/files/index.php/s/YA8HJuukIUKznSP?path=%2Flaypa%2Fgeneral%2Fbaseline#editor and I load MODEL.WEIGHTS there. I have a couple of questions.

  1. Will both baselines and regions be trained when using this config?
  2. I use eScriptorium to edit the segmentation in the resulting files. It lets you assign various region types (Commentary, Illustration, Main, Title, etc.). However, when I train my model on the files I got from eScriptorium and then use that model to segment a batch of files, the region types are not preserved: every region gets the default type Text. Is it possible to add an option to keep the different region types?

Thanks a lot in advance.

stefanklut commented 7 months ago
  1. The model you are currently fine-tuning only detects baselines. To train a model that detects regions you need a different config, for example configs/segmentation/region/region_dataset_globalise.yaml. To use your own classes, change PREPROCESS.REGION.REGIONS and MODEL.SEM_SEG_HEAD.NUM_CLASSES. This is also how you can introduce your own region types.
  2. Assuming you are using the na-pipeline.sh script from Loghi, you need to set REGIONLAYPA=1 and fill in LAYPAREGIONMODEL and LAYPAREGIONMODELWEIGHTS. This will overwrite the type Text. Keep in mind, however, that this requires a good region model. The square regions (I presume) you see now are just bounding boxes around grouped text lines. The predicted regions will more than likely look like blobs and, depending on the quality of the model, will contain some errors.
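As a rough illustration of how the two settings in point 1 relate: the region config quoted later in this thread lists seven region names next to NUM_CLASSES: 8, which suggests one extra class for the background. The sketch below assumes that relationship (it is inferred from that config, not confirmed by the maintainers), ignores MERGE_REGIONS, and uses a plain dict in place of the real config object. The helper names `expected_num_classes` and `check_region_config` are hypothetical.

```python
def expected_num_classes(regions):
    """Assumed rule: one class per region name plus one background class."""
    return len(regions) + 1


def check_region_config(cfg):
    """Return a list of human-readable problems; an empty list means consistent."""
    regions = cfg["PREPROCESS"]["REGION"]["REGIONS"]
    num_classes = cfg["MODEL"]["SEM_SEG_HEAD"]["NUM_CLASSES"]
    problems = []
    if len(set(regions)) != len(regions):
        problems.append("duplicate names in PREPROCESS.REGION.REGIONS")
    expected = expected_num_classes(regions)
    if num_classes != expected:
        problems.append(f"NUM_CLASSES is {num_classes}, expected {expected}")
    return problems


# Values copied from the region config posted later in this thread.
cfg = {
    "MODEL": {"SEM_SEG_HEAD": {"NUM_CLASSES": 8}},
    "PREPROCESS": {
        "REGION": {
            "REGIONS": [
                "marginalia", "page-number", "Text", "Title",
                "author", "Commentary", "Main",
            ]
        }
    },
}

print(check_region_config(cfg))  # prints []
```

Running a check like this before training can catch the case where a region name was added to the list but NUM_CLASSES was left unchanged.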

Let me know how it goes or if I can help with anything

fattynoparents commented 7 months ago

Thanks a lot for your reply! The baseline model I currently use is rather good and it seems to respond well to fine-tuning. Do you know of any similar region models? Also, do you think there's any benefit in using a region model at all? In our project there's a lot of information in the marginalia, but maybe one could also assign types to baselines instead of regions?

stefanklut commented 7 months ago

Hmm a different label for textlines in the marginalia (or other type of text lines) would be interesting. I'll have to look into that. However, there is currently no such option.

As far as region models go, I don't think there is a good pretrained general model available, mostly because the labels given to text regions are not universal. So far I have just been retraining from scratch, but I can see a world in which you replace the head (by changing PREPROCESS.REGION.REGIONS and MODEL.SEM_SEG_HEAD.NUM_CLASSES) and go from there. However, I don't think it will be as easy or as well suited to fine-tuning as the baselines. Still, if you are interested, this is one model I trained (https://surfdrive.surf.nl/files/index.php/s/YA8HJuukIUKznSP?path=%2Flaypa%2Frepublic%2Fregion)

Let me know if the model is available

fattynoparents commented 7 months ago

Thanks for the suggestions! I will take a look at this model of yours; it's available.

fattynoparents commented 7 months ago

Hi again, I'm trying to run the loghi inference script and make Laypa recognize regions as well. I use your regions model from above, and I modified the na-pipeline.sh script as follows:

When I run inference on images with this updated script I don't get any regions in the resulting files. I get pseudo-regions, as if I had run it without region detection. Basically I use the same code as for baseline detection, but I must be doing something wrong. The last part of the code above deals with extracting baselines; I haven't found a minion for extracting regions. I would be really grateful if you could point me in the right direction. Thanks!

stefanklut commented 7 months ago

It appears you have an old/wrong version of the na-pipeline script

-as_single_region true

Is incorrect, this should not be there. This is removed when running with REGIONLAYPA=1. I recommend getting the newer version of this script (which has been renamed to inference-pipeline.sh)

fattynoparents commented 7 months ago

I have updated the pipeline, but found another issue: there seems to be no 2.0.0 tag for Laypa, which Docker is trying to pull from the hub (https://hub.docker.com/r/loghi/docker.laypa/tags). The same thing happens for the rest of the images.

rvankoert commented 7 months ago

That's my bad. We're in the middle of the next release (v2) and some things aren't working out as planned. We are working on it.

rvankoert commented 7 months ago

Feel free to try again. It should work again

fattynoparents commented 7 months ago

Thanks, I am now running the updated program. I had a different path to the regions model, so I had to add LAYPADIR="$(dirname "${LAYPAREGIONMODEL}")" inside the region detection block, otherwise I got this error:

Laypa baseline detection done
Running Laypa region detection
Traceback (most recent call last):
  File "/src/laypa/run.py", line 400, in <module>
    main(args)
  File "/src/laypa/run.py", line 374, in main
    cfg = setup_cfg(args)
          ^^^^^^^^^^^^^^^
  File "/src/laypa/core/setup.py", line 88, in setup_cfg
    cfg.merge_from_file(args.config)
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/config/config.py", line 45, in merge_from_file
    assert PathManager.isfile(cfg_filename), f"Config file '{cfg_filename}' does not exist!"
AssertionError: Config file '/home/tube/laypa/republic/region/config.yaml' does not exist!
Laypa region detection done
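The AssertionError above means the config path could not be resolved where Laypa was running. A small pre-flight check along the lines of the following sketch can catch a wrong path before the pipeline starts. It only looks at the host file system, so it cannot tell whether the directory is also visible inside the Docker container. `model_dir` and `missing_model_files` are hypothetical helper names, and the weights file name is made up for illustration.

```python
import os


def model_dir(config_path):
    """Directory holding the config, like dirname in the LAYPADIR line above."""
    return os.path.dirname(config_path)


def missing_model_files(config_path, weights_path):
    """Return the given paths that are not existing files on this machine."""
    return [p for p in (config_path, weights_path) if not os.path.isfile(p)]


# Config path taken from the traceback above; the weights name is hypothetical.
config = "/home/tube/laypa/republic/region/config.yaml"
weights = os.path.join(model_dir(config), "model_best.pth")

print(model_dir(config))
for path in missing_model_files(config, weights):
    print(f"not found on the host: {path}")
```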

Anyway, now the region detection part seems to run fine. However, I still cannot see different types of regions in the resulting XML files; all regions are of type Text. I didn't make any changes to the region model that you sent a link to, but I can see in the config that it does have various region types:

PREPROCESS:
  BASELINE:
    LINE_WIDTH: 5
  DISABLE_CHECK: false
  OVERWRITE: false
  REGION:
    MERGE_REGIONS:
    - resolution:Resumption,resumption,Insertion,insertion
    REGIONS:
    - marginalia
    - page-number
    - resolution
    - date
    - index
    - attendance
    - Resumption
    - resumption
    - Insertion
    - insertion

Am I missing something here that needs to be changed?
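For readers trying to work out which region types this config can actually produce, here is a minimal sketch. It assumes each MERGE_REGIONS entry has the form `target:source1,source2,...` and that the listed sources are folded into the target class. That reading is an assumption based on the entry above, not on Laypa's code, and both function names are hypothetical.

```python
def parse_merge_regions(merge_specs):
    """Map each source name to its target for specs like 'target:src1,src2'."""
    mapping = {}
    for spec in merge_specs:
        target, _, sources = spec.partition(":")
        for source in sources.split(","):
            if source.strip():
                mapping[source.strip()] = target.strip()
    return mapping


def effective_classes(regions, merge_specs):
    """Region names that remain after merging, in first-seen order."""
    mapping = parse_merge_regions(merge_specs)
    classes = []
    for name in regions:
        merged = mapping.get(name, name)
        if merged not in classes:
            classes.append(merged)
    return classes


# Values copied from the config excerpt above.
regions = [
    "marginalia", "page-number", "resolution", "date", "index",
    "attendance", "Resumption", "resumption", "Insertion", "insertion",
]
merge = ["resolution:Resumption,resumption,Insertion,insertion"]

print(effective_classes(regions, merge))
```

Under that assumption the ten listed names collapse to six distinct region types, with the four resumption and insertion variants counted as resolution.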

rvankoert commented 7 months ago

In the inference pipeline set recalculatereadingorder to 0

fattynoparents commented 6 months ago

So I'm now trying to train a region model from scratch using this config: https://github.com/stefanklut/laypa/tree/main/configs/segmentation/region/region_dataset_globalise.yaml

Here's the full file:

MODEL:
  META_ARCHITECTURE: "SemanticSegmentor"
  MODE: region
  SEM_SEG_HEAD:
    NUM_CLASSES: 8
  BACKBONE:
    NAME: "build_resnet_fpn_backbone"
    FREEZE_AT: 0
  RESUME: False
  PIXEL_MEAN: [123.675, 116.280, 103.530]
  PIXEL_STD: [58.395, 57.120, 57.375]
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  FPN:
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[32], [64], [128], [256], [512]]  # One size for each in feature map
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]  # Three aspect ratios (same for all in feature maps)
  RPN:
    IN_FEATURES: ["p2", "p3", "p4", "p5", "p6"]
    PRE_NMS_TOPK_TRAIN: 2000  # Per FPN level
    PRE_NMS_TOPK_TEST: 1000  # Per FPN level
    # Detectron1 uses 2000 proposals per-batch,
    # (See "modeling/rpn/rpn_outputs.py" for details of this legacy issue)
    # which is approximately 1000 proposals per-image since the default batch size for FPN is 2.
    POST_NMS_TOPK_TRAIN: 1000
    POST_NMS_TOPK_TEST: 1000
  ROI_HEADS:
    NUM_CLASSES: 7
    NAME: "StandardROIHeads"
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
  ROI_BOX_HEAD:
    NAME: "FastRCNNConvFCHead"
    NUM_FC: 2
    POOLER_RESOLUTION: 7
  ROI_MASK_HEAD:
    NAME: "MaskRCNNConvUpsampleHead"
    NUM_CONV: 4
    POOLER_RESOLUTION: 14
  WEIGHTS:
DATASETS:
  TRAIN: ("train",)
  TEST: ("val",)
DATALOADER:
  NUM_WORKERS: 16
  FILTER_EMPTY_ANNOTATIONS: False
PREPROCESS:
  OVERWRITE: False
  DISABLE_CHECK: False

  RESIZE:
    RESIZE_MODE: "shortest_edge"
    RESIZE_SAMPLING: "choice"
    MIN_SIZE: [1024]
    MAX_SIZE: 2048
    SCALING: 0.5
  REGION:
    REGIONS:
      [
        "marginalia",
        "page-number",
        "Text",
        "Title",
        "author",
        "Commentary",
        "Main"
      ]
    MERGE_REGIONS: []
    REGION_TYPE: []
  BASELINE:
    LINE_WIDTH: 5
SOLVER:
  IMS_PER_BATCH: 8
  CHECKPOINT_PERIOD: 25000
  BASE_LR: 0.0002
  GAMMA: 0.1
  STEPS: () #(80000, 120000, 160000)
  MAX_ITER: 250000

INPUT:
  RESIZE_MODE: "shortest_edge"
  MIN_SIZE_TRAIN_SAMPLING: choice
  MIN_SIZE_TRAIN: (1024,)
  MAX_SIZE_TRAIN: 2048
  SCALING_TRAIN: 1.0

  # FIXME Have the Min size adjustable
  MIN_SIZE_TEST: 1024
  MAX_SIZE_TEST: 2048
  SCALING_TEST: -1.

  FORMAT: RGB

  GRAYSCALE:
    PROBABILITY: 0.

  BRIGHTNESS:
    PROBABILITY: 0.
    MIN_INTENSITY: 0.5
    MAX_INTENSITY: 1.5

  CONTRAST:
    PROBABILITY: 0.
    MIN_INTENSITY: 0.5
    MAX_INTENSITY: 1.5

  SATURATION:
    PROBABILITY: 0.
    MIN_INTENSITY: 0.5
    MAX_INTENSITY: 1.5

  GAUSSIAN_FILTER:
    PROBABILITY: 0.
    MIN_SIGMA: 0.5
    MAX_SIGMA: 1.5

  HORIZONTAL_FLIP:
    PROBABILITY: 0.

  VERTICAL_FLIP:
    PROBABILITY: 0.

  ELASTIC_DEFORMATION:
    PROBABILITY: 0.5
    ALPHA: 0.1
    SIGMA: 0.01

  AFFINE:
    PROBABILITY: 1.

    TRANSLATION:
      PROBABILITY: 0.5
      STANDARD_DEVIATION: 0.02

    ROTATION:
      PROBABILITY: 0.5
      KAPPA: 30.

    SHEAR:
      PROBABILITY: 0.5
      KAPPA: 20.

    SCALE:
      PROBABILITY: 0.5
      STANDARD_DEVIATION: 0.12
TEST:
  EVAL_PERIOD: 10000
  WEIGHTS:

TRAIN:
  WEIGHTS:

OUTPUT_DIR: ./output/region

SEED: 42

NAME: globalise

VERSION: 2

When running the training script I get this error:

Traceback (most recent call last):
  File "/src/laypa/main.py", line 140, in <module>
    main(args)
  File "/src/laypa/main.py", line 128, in main
    launch(
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/engine/launch.py", line 84, in launch
    main_func(*args)
  File "/src/laypa/main.py", line 107, in setup_training
    preprocess_datasets(cfg, args.train, args.val, tmp_dir)
  File "/src/laypa/core/preprocess.py", line 40, in preprocess_datasets
    process = Preprocess(cfg)
              ^^^^^^^^^^^^^^^
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/config/config.py", line 189, in wrapped
    explicit_args = _get_args_from_config(from_config_func, *args, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/config/config.py", line 245, in _get_args_from_config
    ret = from_config_func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/laypa/datasets/preprocess.py", line 143, in from_config
    "augmentations": build_augmentation(cfg, "preprocess"),
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/laypa/datasets/augmentations.py", line 1039, in build_augmentation
    sample_style = cfg.PREPROCESS.RESIZE.MIN_SIZE_SAMPLING
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/yacs/config.py", line 141, in __getattr__
    raise AttributeError(name)
AttributeError: MIN_SIZE_SAMPLING

I couldn't find any config in your GitHub project where this attribute is set. What could be the problem here? Thanks a lot in advance!

EDIT: Same happens when I'm trying to fine-tune a baseline model. This seems to be a version 2.0.0 bug.

stefanklut commented 6 months ago

Thank you for the excellent bug hunting 😄

This was a naming convention issue. The correct name is PREPROCESS.RESIZE.RESIZE_SAMPLING, but I must have used the other convention (MIN_SIZE_SAMPLING) in the code by mistake. This is indeed a problem introduced in 2.0.0, since that is where we switched the preprocessing steps to use augmentations instead of a separate resize function.
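To make the mismatch concrete, here is a small sketch of a lookup that accepts either key name. It is not the fix that went into Laypa. `SimpleNamespace` stands in for the real config node, `get_resize_sampling` is a hypothetical helper, and the default of "choice" is simply the value used in the config posted above.

```python
from types import SimpleNamespace


def get_resize_sampling(resize_cfg, default="choice"):
    """Read the sampling style under either key name, preferring RESIZE_SAMPLING."""
    for key in ("RESIZE_SAMPLING", "MIN_SIZE_SAMPLING"):
        value = getattr(resize_cfg, key, None)
        if value is not None:
            return value
    return default


# Stand-in for cfg.PREPROCESS.RESIZE as written in the posted config.
resize_cfg = SimpleNamespace(
    RESIZE_MODE="shortest_edge",
    RESIZE_SAMPLING="choice",
    MIN_SIZE=[1024],
    MAX_SIZE=2048,
)

print(get_resize_sampling(resize_cfg))  # prints choice
```

Reading `resize_cfg.MIN_SIZE_SAMPLING` directly on this object raises AttributeError, which is the failure shown in the traceback.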

I think it is too late to add this fix to 2.0.1, but it will be included in the next release

The issue was not present in the newer models, which scale by a percentage of the image size instead of using a fixed resize. That is why I overlooked it in the older models.
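The difference between the two resize styles mentioned here can be sketched as follows. The shortest-edge computation is modelled on detectron2's `ResizeShortestEdge`; Laypa's own augmentation code may differ in detail, and both function names are hypothetical.

```python
def shortest_edge_shape(height, width, edge_length, max_size):
    """Fixed resize: the short side becomes edge_length, the long side is capped."""
    scale = edge_length / min(height, width)
    new_h, new_w = height * scale, width * scale
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)


def scaled_shape(height, width, scaling):
    """Percentage resize: both sides are multiplied by the same factor."""
    return int(height * scaling + 0.5), int(width * scaling + 0.5)


# A 3000x2000 page with MIN_SIZE 1024, MAX_SIZE 2048 and SCALING 0.5,
# the values from the config posted above.
print(shortest_edge_shape(3000, 2000, 1024, 2048))  # prints (1536, 1024)
print(scaled_shape(3000, 2000, 0.5))                # prints (1500, 1000)
```

Only the fixed-resize path needs an edge length and a sampling style, which fits the observation that configs using percentage scaling never reached the misnamed key.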

fattynoparents commented 6 months ago

> Thank you for the excellent bug hunting 😄

Haha, thank you for the speedy replies :) I will use an older version for now; I still have a lot to learn about the whole process!

fattynoparents commented 6 months ago

> I think it is too late to add this fix to 2.0.1, but it will be included in the next release

You might have already fixed this, but just FYI: in version 2.0.1 another error appears, both when training on baselines and on regions:

[04/16 07:36:57 laypa.datasets.preprocess]: Could not find output dir (/tmp/tmp05kjuq5r/train), creating one at specified location
Preprocessing:   0%|                                         | 0/171 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/opt/conda/envs/laypa/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/src/laypa/datasets/preprocess.py", line 557, in process_single_file
    image_shape = self.augmentations[0].get_output_shape(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/laypa/datasets/augmentations.py", line 261, in get_output_shape
    raise ValueError("Edge length is not set")
ValueError: Edge length is not set
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/src/laypa/main.py", line 140, in <module>
    main(args)
  File "/src/laypa/main.py", line 128, in main
    launch(
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/engine/launch.py", line 84, in launch
    main_func(*args)
  File "/src/laypa/main.py", line 107, in setup_training
    preprocess_datasets(cfg, args.train, args.val, tmp_dir)
  File "/src/laypa/core/preprocess.py", line 51, in preprocess_datasets
    process.run()
  File "/src/laypa/datasets/preprocess.py", line 629, in run
    results = list(
              ^^^^^
  File "/opt/conda/envs/laypa/lib/python3.12/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/opt/conda/envs/laypa/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
ValueError: Edge length is not set

stefanklut commented 6 months ago

Already fixed, but thanks for reporting 👍