Closed fattynoparents closed 6 months ago
Thought I'd better create a separate issue for this question. When finetuning a baseline model, is there any possibility to configure the config file so that the model puts a break in a line before a specified symbol? Thing is, the material we train the model on is very variable, and sometimes there should be a break in the line even though the space between two words is rather small.
A couple of examples to illustrate what I mean.
Here Laypa would separate the words correctly: And here it would draw a continuous line, though there should be a break:
Edit: Or is it so that the baseline model is complex enough that, after it has seen a certain symbol (in this case 8:0) in enough of the training material with a break in the line at that place, it will learn to put a break the next time it sees this symbol?
You pose a very good question.
Let me start off by saying that there is currently no method of manually adding such constraints to a model. If you have any suggestions of what this might look like, I am curious to hear them. Adding it after a specific symbol would mean you also have the transcription and not just baselines?
The EDIT part of your question is very specific, so I have a hard time answering it. This is one of the main problems we are currently facing (baselines incorrectly connecting or not connecting), the other being rotated text. The model might be able to learn that there needs to be a gap, but that might be really specific to your data. If you are training on mostly tabular data it might be easier (not saying it is definitely possible), but if there is also normal text data mixed in I suspect it will be harder (also not saying impossible), since most regular text lines are connected. As a start you could have a look using the eval.py script to see what the ground truth currently looks like when converted to pixels. This can maybe help give an idea of whether or not the baselines are even separated in the ground truth (this will of course determine if it is even possible to learn that there should be a separation).
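For reference, here is a minimal sketch of that check done by hand, in case it helps to see what "converted to pixels" means. This is not eval.py's actual code; the file name and line thickness are made up for the example.

import xml.etree.ElementTree as ET
import cv2
import numpy as np

# Namespace may differ depending on the PageXML version of your files
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

tree = ET.parse("page/example.xml")  # hypothetical PageXML file
page = tree.find(".//pc:Page", NS)
height, width = int(page.get("imageHeight")), int(page.get("imageWidth"))
mask = np.zeros((height, width), dtype=np.uint8)

# Rasterize every Baseline element so you can eyeball the separation
for baseline in tree.findall(".//pc:Baseline", NS):
    pts = np.array(
        [tuple(map(int, p.split(","))) for p in baseline.get("points").split()],
        dtype=np.int32,
    )
    cv2.polylines(mask, [pts], isClosed=False, color=255, thickness=5)

cv2.imwrite("baselines.png", mask)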
Let me know what you find :)
Adding it after a specific symbol would mean you also have the transcription and not just baselines?
Yes, now that I think about my initial question it seems rather stupid :) I should have known better after using the Loghi system for a few months, but for some reason I thought that the baseline model knows something about the contents. However, the transcription happens after the segmentation, so when the baseline model segments an image into parts it has no idea what text there is on the page. I therefore think the EDIT part of my question is irrelevant as well.
We have a project to transcribe a huge amount of material from a library catalog. In most cases there will be an author name at the top, some additional info below, and then one or several titles below that, plus some info in the margins; here is one of the simplest examples: So the data is mostly tabular in terms of page layout. The problem is that, as I wrote in the initial post, we need to distinguish certain data inside the title part; one example would be to put a break before 8:0. As far as I understand, if 8:0 always appeared in the same part of the page, it would have been possible to train the base model to pick up the pattern, but since it can appear in various parts of a page, it looks like this is impossible?
I will also check the eval.py script and see what I get with my data, thanks for the tips!
By the way, I have noticed an inconsistency in the tutorial about the eval.py parameters:
And when I try to run it as follows:
docker run $DOCKERGPUPARAMS --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
-m 32000m --shm-size 10240m -v $LAYPADIR:$LAYPADIR \
-v $TRAINDIR:$TRAINDIR -v $WEIGHTSDIR:$WEIGHTSDIR $DOCKERLAYPA \
python eval.py \
-c $LAYPAMODEL \
-i $TRAINDIR
I get the following error:
Traceback (most recent call last):
File "/src/laypa/eval.py", line 298, in <module>
main(args)
File "/src/laypa/eval.py", line 104, in main
image_paths = get_file_paths(args.input, supported_image_formats, cfg.PREPROCESS.DISABLE_CHECK)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/src/laypa/utils/input_utils.py", line 90, in get_file_paths
raise TypeError("Cannot run when the input path is None")
TypeError: Cannot run when the input path is None
Oh, you are right about the docs being outdated. However, the --input option should be working. The error indicates that args.input is not set, so maybe check if $TRAINDIR is correct?
Yeah, it's correct; the same directory works fine for the main.py script. The input should be a folder with images and the page folder with corresponding XML files, right?
Yes, that is the correct type of input. I'll try and see if I can reproduce the error when running via Docker somewhere this afternoon. I have not checked whether using Docker could break the eval code.
Ok, I have investigated this further and it appears the error was due to comments in the script; when I removed them, the error was gone. However, I then got this:
Traceback (most recent call last):
File "/src/laypa/eval.py", line 298, in <module>
main(args)
File "/src/laypa/eval.py", line 106, in main
predictor = Predictor(cfg=cfg)
^^^^^^^^^^^^^^^^^^
File "/src/laypa/run.py", line 91, in __init__
raise FileNotFoundError("Cannot do inference without weights. Specify a checkpoint file to --opts TEST.WEIGHTS")
FileNotFoundError: Cannot do inference without weights. Specify a checkpoint file to --opts TEST.WEIGHTS
So it seems this parameter should also be mentioned as mandatory.
Finally, when I add the path to the weights I get this error:
Traceback (most recent call last):
File "/src/laypa/eval.py", line 298, in <module>
main(args)
File "/src/laypa/eval.py", line 211, in main
fig_manager.window.showMaximized()
^^^^^^^^^^^^^^^^^^
AttributeError: 'FigureManagerBase' object has no attribute 'window'
Is this issue with the comments something we should be looking into, or is it just on your machine?
I think your final issue is due to running inside Docker, where the matplotlib visualization cannot open a window to display in. I've updated the README with instructions on how to use eval with the --save option. Could you please run:
docker run $DOCKERGPUPARAMS --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
-m 32000m --shm-size 10240m -v $LAYPADIR:$LAYPADIR \
-v $TRAINDIR:$TRAINDIR -v $WEIGHTSDIR:$WEIGHTSDIR -v $OUTPUTDIR:$OUTPUTDIR $DOCKERLAYPA \
python eval.py \
-c $LAYPAMODEL \
-i $TRAINDIR \
-o $OUTPUTDIR \
--save gt
with $OUTPUTDIR specified. This should output the GT visualization to the output directory.
Is this issue with the comments something we should be looking into, or is it just on your machine?
I'm not sure, I had the following code that didn't work:
docker run $DOCKERGPUPARAMS --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
-m 32000m --shm-size 10240m -v $LAYPADIR:$LAYPADIR \
-v $TRAINDIR:$TRAINDIR -v $WEIGHTSDIR:$WEIGHTSDIR $DOCKERLAYPA \
python eval.py \
-c $LAYPAMODEL \
# first commented line
# second commented line
-i $TRAINDIR
When I removed the comments, the error was gone (presumably because a comment line after a backslash continuation ends the command, so the arguments after it never reach eval.py). The command with the --save option also worked, thanks.
This can maybe help give an idea of whether or not the baselines are even separated in the ground truth (this will of course determine if it is even possible to learn that there should be a separation).
After running eval.py, if I see baselines which I would like to train Laypa to draw, does it mean the model will be trained to recognize them? F.ex. in the example below I am talking about the lines below the { sign.
eval.py is purely for checking how the GT or prediction looks when drawing regions or baselines. The actual training is done using main.py. However, using eval.py you can have a look at whether or not the baselines are separated enough in these images, for example. If the baselines are not separate in the GT, then the model realistically has no chance of ever predicting them as separate. Depending on what the baselines look like, you may want to change the thickness of the baselines or use the square baseline option.
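Both of those are regular preprocessing settings. As a minimal illustration of overriding them (a sketch only, not Laypa's actual config definition: SQUARE_LINES is the real setting named in the next reply, while the LINE_WIDTH key name is an assumption for the thickness option):

from yacs.config import CfgNode as CN

# Toy config mirroring the structure discussed here
cfg = CN()
cfg.PREPROCESS = CN()
cfg.PREPROCESS.BASELINE = CN()
cfg.PREPROCESS.BASELINE.SQUARE_LINES = False
cfg.PREPROCESS.BASELINE.LINE_WIDTH = 10  # assumed key name

# Equivalent of passing `--opts PREPROCESS.BASELINE.SQUARE_LINES True`:
cfg.merge_from_list(["PREPROCESS.BASELINE.SQUARE_LINES", "True"])
print(cfg.PREPROCESS.BASELINE.SQUARE_LINES)  # True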
or use the square baseline option
Could you please tell me what this option is for and how it is configured? Thanks!
The setting is called PREPROCESS.BASELINE.SQUARE_LINES, and it cuts off the ends of baselines to make them end on a square line. It is mostly useful if your line width is very high and the drawn lines therefore extend past the actual baseline in the GT.
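To illustrate the geometric idea (a sketch under assumptions, not Laypa's actual drawing code): a thick line drawn with OpenCV gets rounded end caps that stick out past the endpoints, while a square-ended line can be drawn as a filled rectangle that stops exactly at the baseline's ends.

import cv2
import numpy as np

def draw_baseline(mask, p0, p1, width, square=False):
    """Draw segment p0-p1 into mask with the given thickness."""
    if not square:
        # cv2.line uses rounded end caps, so the drawn line extends about
        # width/2 pixels beyond both endpoints
        cv2.line(mask, tuple(map(int, p0)), tuple(map(int, p1)), 1, width)
        return
    # "Square" ends: build a rectangle around the segment so the drawn
    # line stops exactly at p0 and p1
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = p1 - p0
    n = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-8) * (width / 2)
    corners = np.rint([p0 + n, p1 + n, p1 - n, p0 - n]).astype(np.int32)
    cv2.fillConvexPoly(mask, corners, 1)

mask = np.zeros((100, 200), dtype=np.uint8)
draw_baseline(mask, (20, 50), (180, 50), width=9, square=True)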
whether or not the baselines are separated enough
How do I know if they are separated enough? All GT images have baselines that are separated in the places where they need to be separated, but sometimes the space between two lines is very small, and the prediction pictures then of course show them as one continuous line, like here f.ex.: gt pred
One more thing (sorry for so many questions :) - why would the following error arise when launching the Laypa training process? It only appears with PREPROCESS.BASELINE.SQUARE_LINES set to true.
Traceback (most recent call last):
File "/src/laypa/main.py", line 140, in <module>
main(args)
File "/src/laypa/main.py", line 128, in main
launch(
File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "/src/laypa/main.py", line 107, in setup_training
preprocess_datasets(cfg, args.train, args.val, tmp_dir)
File "/src/laypa/core/preprocess.py", line 99, in preprocess_datasets
process.run()
File "/src/laypa/datasets/preprocess.py", line 525, in run
results = list(
^^^^^
File "/opt/conda/envs/laypa/lib/python3.12/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/opt/conda/envs/laypa/lib/python3.12/multiprocessing/pool.py", line 873, in next
raise value
ValueError: /home/user/training-laypa/baseline/2024.04.15/val_input/page/k21.xml has no contours
If I set it to false, the script runs fine, but I get a lot more warnings saying *.xml contains overlapping baseline sem_seg.
Sorry, one more question - is there or will there be a possibility to set an early stopping patience value in the config?
"Separated enough" is hard to tell (I'm still trying to figure that out myself), but it seems that with non-square lines the lines would touch.
The error is due to the baselines being so small that there are no contours left when drawing them in. I have attempted a fix for this problem in 2.0.2 (should be released, or at least soon). As a "fix" I just draw a circle when the baseline is too short, instead of the square sliver that would otherwise be drawn.
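Roughly, that workaround could look like this (an assumed sketch, not the actual Laypa code):

import cv2
import numpy as np

def draw_with_fallback(mask, points, width):
    pts = np.asarray(points, dtype=np.int32)
    # Total polyline length; a (near) zero-length baseline would leave no
    # contour when drawn with square ends
    length = np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()
    if length < 1.0:
        # Fall back to a filled dot so a contour still exists
        cv2.circle(mask, tuple(map(int, pts[0])), max(width // 2, 1), 1, -1)
    else:
        cv2.polylines(mask, [pts], isClosed=False, color=1, thickness=width)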
I don't think detectron2 has early stopping built in, and I have not added it either. However, I would consider adding it. Especially in the form of a PR ;)
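For anyone picking this up as a PR, here is a rough sketch of what such a hook might look like on top of detectron2 (the metric key "val_loss" and the stop-by-exception mechanism are assumptions, not existing Laypa behaviour):

from detectron2.engine import HookBase

class EarlyStoppingHook(HookBase):
    """Stop training when a tracked metric has not improved for
    `patience` consecutive evaluations."""

    def __init__(self, metric="val_loss", patience=5):
        self.metric = metric
        self.patience = patience
        self.best = float("inf")
        self.bad_evals = 0
        self.last_seen_iter = -1

    def after_step(self):
        latest = self.trainer.storage.latest()  # name -> (value, iteration)
        if self.metric not in latest:
            return
        value, it = latest[self.metric]
        if it == self.last_seen_iter:
            return  # no new evaluation since the last check
        self.last_seen_iter = it
        if value < self.best:
            self.best = value
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        if self.bad_evals >= self.patience:
            # detectron2's training loop has no built-in stop signal, so a
            # hook would have to raise; the training script must catch this
            raise RuntimeError(f"Early stopping: no {self.metric} improvement")

# usage sketch: trainer.register_hooks([EarlyStoppingHook(patience=5)])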
Using version 2.0.2 I now get the following error, which was supposed to have been fixed:
Preprocessing: 0%| | 0/171 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/conda/envs/laypa/lib/python3.12/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/src/laypa/datasets/preprocess.py", line 557, in process_single_file
image_shape = self.augmentations[0].get_output_shape(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/src/laypa/datasets/augmentations.py", line 261, in get_output_shape
raise ValueError("Edge length is not set")
ValueError: Edge length is not set
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/src/laypa/main.py", line 140, in <module>
main(args)
File "/src/laypa/main.py", line 128, in main
launch(
File "/opt/conda/envs/laypa/lib/python3.12/site-packages/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "/src/laypa/main.py", line 107, in setup_training
preprocess_datasets(cfg, args.train, args.val, tmp_dir)
File "/src/laypa/core/preprocess.py", line 51, in preprocess_datasets
process.run()
File "/src/laypa/datasets/preprocess.py", line 629, in run
results = list(
^^^^^
File "/opt/conda/envs/laypa/lib/python3.12/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/opt/conda/envs/laypa/lib/python3.12/multiprocessing/pool.py", line 873, in next
raise value
ValueError: Edge length is not set
Could you please have a look at this? Has this maybe already been fixed in newer versions?
Thank you for reporting, I'll have a look to see if I can reproduce this error
Can confirm that is still an issue, I will roll out a fix today/tomorrow
Can confirm that is still an issue, I will roll out a fix today/tomorrow
Hi, any news on the fix? :)
Hi, a little more patience please. I broke something else, so training still doesn't work :sweat_smile: An updated Docker build is running; I'll let you know when it is up.
Haha ok, thanks for letting me know anyway :)
and 2.0.4 is released with Stefan's fix. Hope it works for you
and 2.0.4 is released with Stefan's fix. Hope it works for you
Thanks for the update! The training process seems to be working fine now. A small question - why would the following warning arise when processing certain pictures?
WARNING [04/30 06:18:34 laypa.page_xml.xml_converter]: File /home/user/training-laypa/baseline/2024.04.15/train_input/page/2049.xml contains overlapping baseline pano
Is it safe to ignore it if it does not arise too often?
If they aren't super frequent they can be ignored. They are there to show that the GT has overlapping lines, as those are near impossible for the model to distinguish. But sometimes that just is the correct ground truth. If it is happening for almost every file, then you should have a look at the line thickness during preprocessing or at the quality of the GT.
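As a rough illustration of how the warning condition arises (a sketch, not the actual converter code): two baselines closer together than the drawing thickness end up sharing pixels in the label image.

import cv2
import numpy as np

def baselines_overlap(lines, shape, thickness):
    """Return True if any two polylines share pixels when drawn
    `thickness` pixels thick into a mask of the given shape."""
    total = np.zeros(shape, dtype=np.uint8)
    for pts in lines:
        single = np.zeros(shape, dtype=np.uint8)
        cv2.polylines(single, [np.asarray(pts, np.int32)], False, 1, thickness)
        if np.any(total & single):
            return True
        total |= single
    return False

# Two baselines only 4 px apart overlap when drawn 10 px thick:
print(baselines_overlap(
    [[(10, 20), (190, 20)], [(10, 24), (190, 24)]], (100, 200), 10))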