Open johnlockejrr opened 2 months ago
Any update?
I am looking at part of it today.
Here is an image. I hadn't tried running image by image before; trying it now, it hangs at:
(train-2.0.1-py3.11) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yaltai kraken -I BCUF_Ms._L_2057_330.jpg --suffix ".xml" segment --yolo runs/detect/train2/weights/best.pt
WARNING ⚠️ Ultralytics settings reset to default values. This may be due to a possible problem with your settings or a recent ultralytics package update.
View settings with 'yolo settings' or at '/home/incognito/.config/Ultralytics/settings.yaml'
Update settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'.
Loading ANN /home/incognito/YALTAi/train-2.0.1-py3.11/lib/python3.11/site-packages/kraken/blla.mlmodel Segmenting BCUF_Ms._L_2057_330.jpg
Since the time I replied here, the segmentation is still hanging...
Can you run yaltai kraken --raise-on-error --verbose?
Sure! I'll get back.
Now it doesn't even run.
(train-2.0.1-py3.11) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yaltai kraken --raise-on-error --verbose -I BCUF_Ms._L_2057_330.jpg --suffix ".xml" segment --yolo runs/detect/train2/weights/best.pt
Can you send me your model somehow ?
Sure, I upload it now.
It does not hang for me with
yaltai kraken --alto -I ./372407844-ddb4597b-119e-46b5-88da-e6f8346c0be4.jpg -o ".xml" segment -y ./best.pt
Interesting, which YALTAi version? 2.x or 1.x? Under 1.x it works for me, under 2.x it hangs.
Under 1.x:
+ convert datasets
+ segment
- train
Under 2.x:
- convert datasets
- segment
+ train
Just tried on another system:
(yaltai-2.0.2-py3.11) incognito@DESKTOP-NHKR7QL:~/YALTAi$ yaltai kraken --alto -I ./BCUF_Ms._L_2057_330.jpg -o ".xml" segment -y ./best.pt
WARNING ⚠️ Ultralytics settings reset to default values. This may be due to a possible problem with your settings or a recent ultralytics package update.
View settings with 'yolo settings' or at '/home/incognito/.config/Ultralytics/settings.yaml'
Update settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'.
Loading ANN /home/incognito/YALTAi/yaltai-2.0.2-py3.11/lib/python3.11/site-packages/kraken/blla.mlmodel Segmenting ./BCUF_Ms._L_2057_330.jpg
Same, hangs.
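One way to tell a real hang from a slow run, and to keep whatever was printed before the stall, is to launch the command with a timeout. The sketch below uses only the standard library; `run_with_timeout` is a hypothetical helper name, not part of YALTAi or kraken.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, output).

    finished is False when the command was killed after timeout_s seconds;
    output is whatever it printed on stdout/stderr up to that point.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so log order is kept
            timeout=timeout_s,
        )
        return True, proc.stdout.decode(errors="replace")
    except subprocess.TimeoutExpired as exc:
        # exc.output holds the partial output (bytes) or None if nothing was read.
        partial = exc.output or b""
        if isinstance(partial, str):
            partial = partial.encode()
        return False, partial.decode(errors="replace")
```

For example, passing the command from the log above as a list (`["yaltai", "kraken", "--alto", "-I", "./BCUF_Ms._L_2057_330.jpg", "-o", ".xml", "segment", "-y", "./best.pt"]`) with a generous timeout would show the last line printed before the stall, which helps decide whether the hang happens before or after the YOLO prediction step.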
I feel like you probably have an issue with your YOLO settings, but I don't understand what it could be. Can you try recreating an environment from scratch, just in case?
I did create them from scratch on both systems, on Python 3.10 and 3.11. I do have YOLOv8 installed on these systems under different envs, but I think the yolo settings are kept outside the envs, in /home/$USER/.config/Ultralytics/settings.yaml
Could that be it?
Mine:
settings_version: 0.0.4
datasets_dir: /home/incognito/datasets
weights_dir: /home/incognito/YALTAi/weights
runs_dir: /home/incognito/YALTAi/runs
uuid: 7a070aad82e209ec117c0be351714f5bda1f17e3e424dd2877f7a44c612126c4
sync: true
api_key: ''
clearml: true
comet: true
dvc: true
hub: true
mlflow: true
neptune: true
raytune: true
tensorboard: true
wandb: true
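To check whether a shared settings file points at directories that no longer exist, a small script can read it and test each path. This is only a sketch: it assumes the file is a flat list of `key: value` lines like the one above, it parses them by hand instead of using a YAML library, and `read_flat_settings` and `check_dirs` are hypothetical helper names.

```python
from pathlib import Path

# Location reported by the Ultralytics warning in the logs above.
SETTINGS_PATH = Path.home() / ".config" / "Ultralytics" / "settings.yaml"


def read_flat_settings(text):
    """Parse flat 'key: value' lines into a dict (no nesting, no lists)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def check_dirs(settings):
    """Map every '*_dir' setting to (path, exists_on_disk)."""
    return {
        key: (value, Path(value).is_dir())
        for key, value in settings.items()
        if key.endswith("_dir")
    }
```

Calling `check_dirs(read_flat_settings(SETTINGS_PATH.read_text()))` from inside each environment would show whether both environments see the same `datasets_dir`, `weights_dir` and `runs_dir`, and whether those directories exist.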
I created a new environment and I don't run into the issue. It could be, I guess, the settings.yaml ? Can you do a pip freeze ? Are you using CUDA ?
Yes, CUDA, under WSL 2 Ubuntu 22.04.5
PyTorch version: 2.1.2+cu121 (torchvision 0.16.2+cu121)
OpenCV version: 4.10.0
OS: Ubuntu 22.04.5 LTS
Python version: 3.11.10
Is CUDA available (PyTorch): Yes
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4070
Nvidia driver version: 560.94
OHHH. WSL. The Kraken code base is not tested against WSL. It could be that kraken 5.X broke with your install. Can you try to use kraken as a segmenter within the same environment? I think it's `kraken [samestuff] segment -bl`?
If that does not work, can you try: pip install yaltai --extra-index-url https://download.pytorch.org/whl/cpu
in a new env, to not use CUDA and see what happens ?
Kraken 4.x and 5.x work flawlessly under WSL 2. I'll try both your suggestions right now.
Segmenting with kraken works in a nanosec.
EDIT:
(yaltai-2.0.2-py3.11) incognito@DESKTOP-NHKR7QL:~/YALTAi$ kraken -I BCUF_Ms._L_2057_330-bw.jpg -o ocr.txt segment
Segmenting ✓
(yaltai-2.0.2-py3.11) incognito@DESKTOP-NHKR7QL:~/YALTAi$
LAST EDIT:
On CPU works:
(yaltai-2.0.2-py3.11-cpu) incognito@DESKTOP-NHKR7QL:~/YALTAi$ time yaltai kraken --alto -I ./BCUF_Ms._L_2057_330.jpg -o ".xml" segment -y ./best.pt
Loading ANN /home/incognito/YALTAi/yaltai-2.0.2-py3.11-cpu/lib/python3.11/site-packages/kraken/blla.mlmodel Segmenting ./BCUF_Ms._L_2057_330.jpg
image 1/1 /home/incognito/YALTAi/BCUF_Ms._L_2057_330.jpg: 960x800 1 textregion, 25 textlines, 161.4ms
Speed: 3.7ms preprocess, 161.4ms inference, 0.6ms postprocess per image at shape (1, 3, 960, 800)
✓
real 0m30.502s
user 0m31.600s
sys 0m5.098s
Very strange, training does work on GPU, and all the other envs for other HTR/OCR projects work.
As far as I remember kraken for segmentation/recognition doesn't use GPU, am I wrong?
Nope, but from what I get from your output, you are not using kraken segmentation, are you? You are only using YOLO? (I see textregion/textlines.)
In your previous hanging example, it looks like the issue is on the side of ultralytics/yolo, because you do not even get the image 1/1 line.
pip freeze, please?
Also try --baseline, because it looks to me like you ran into bbox mode. (I am signing off, it's 9 PM here, long past office hours.)
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ pip freeze
aiohappyeyeballs==2.4.3
aiohttp==3.10.8
aiosignal==1.3.1
async-timeout==4.0.3
attrs==24.2.0
certifi==2024.8.30
charset-normalizer==3.3.2
click==8.1.7
contourpy==1.3.0
coremltools==6.3.0
cycler==0.12.1
fast-deskew==1.0
filelock==3.16.1
fonttools==4.54.1
frozenlist==1.4.1
fsspec==2024.9.0
idna==3.10
imageio==2.35.1
importlib_resources==6.4.5
Jinja2==3.1.4
joblib==1.4.2
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
kiwisolver==1.4.7
kraken==5.2.9
lazy_loader==0.4
lightning==2.2.5
lightning-utilities==0.11.7
lxml==5.3.0
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.2
mdurl==0.1.2
mean-average-precision==2021.4.26.0
mpmath==1.3.0
multidict==6.1.0
networkx==3.3
numpy==1.23.5
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.6.68
nvidia-nvtx-cu12==12.1.105
opencv-python==4.10.0.84
packaging==24.1
pandas==2.2.3
pillow==10.4.0
protobuf==3.20.3
psutil==6.0.0
py-cpuinfo==9.0.0
pyarrow==17.0.0
Pygments==2.18.0
pyparsing==3.1.4
python-bidi==0.4.2
python-dateutil==2.9.0.post0
pytorch-lightning==2.4.0
pytz==2024.2
PyWavelets==1.7.0
PyYAML==6.0.2
referencing==0.35.1
regex==2024.9.11
requests==2.32.3
rich==13.9.1
rpds-py==0.20.0
scikit-image==0.21.0
scikit-learn==1.2.2
scipy==1.10.1
seaborn==0.13.2
Shapely==1.8.5.post1
six==1.16.0
sympy==1.13.3
tabulate==0.8.10
thop==0.1.1.post2209072238
threadpoolctl==3.4.0
tifffile==2024.9.20
torch==2.1.2
torchmetrics==1.4.2
torchvision==0.16.2
tqdm==4.66.5
triton==2.1.0
typing_extensions==4.12.2
tzdata==2024.2
ultralytics==8.0.209
urllib3==2.2.3
YALTAi @ file:///home/incognito/YALTAi-2.0.2
yarl==1.13.1
How do I run YOLO in predict mode alone?
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ kraken --alto -I ./BCUF_Ms._L_2057_330-bw.jpg -o ".xml" segment --baseline
Loading ANN /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/kraken/blla.mlmodel ✓
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$
It definitely confirms that the issue lies either in the connection between YOLO and Kraken (i.e. my code) or in YOLO itself.
Try to run yolo predict source=BCUF_Ms._L_2057_330-bw.jpg model=best.pt
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yolo predict source=BCUF_Ms._L_2057_330-bw.jpg model=runs/detect/train6/weights/best.pt
Ultralytics YOLOv8.0.209 🚀 Python-3.10.12 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce RTX 3060, 12288MiB)
Model summary (fused): 168 layers, 3006038 parameters, 0 gradients, 8.1 GFLOPs
image 1/1 /home/incognito/YALTAi/BCUF_Ms._L_2057_330-bw.jpg: 960x800 (no detections), 74.0ms
Speed: 4.0ms preprocess, 74.0ms inference, 39.2ms postprocess per image at shape (1, 3, 960, 800)
Results saved to runs/detect/predict2
💡 Learn more at https://docs.ultralytics.com/modes/predict
By the way, I'm not sure if it's a bug or not: when I segment with YOLO only, it segments perfectly, exactly the regions I trained. When I segment with YALTAi using the YOLO model, it segments OK but adds more lines without region names. Any idea? I can give you an example if you want.
OHHHHHHH
image 1/1 /home/incognito/YALTAi/BCUF_Ms._L_2057_330-bw.jpg: 960x800 (no detections), 74.0ms
I am wondering if the issue is that your model does not detect anything in your picture and, as a result, breaks YALTAi.
Can you zip and send the image, to make sure it's not changed by any form of compression ?
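To test the "no detections" hypothesis without going through YALTAi, the same prediction can be run from Python and summarized per class. This is a minimal sketch, assuming the `ultralytics` package from the environment above; `summarize_detections` and `predict_and_summarize` are hypothetical helper names, and nothing here reflects how YALTAi itself calls YOLO.

```python
from collections import Counter


def summarize_detections(class_ids, names):
    """Count detections per class name; an empty Counter means no detections."""
    return Counter(names[int(class_id)] for class_id in class_ids)


def predict_and_summarize(image_path, weights_path, device="cpu"):
    """Run one YOLO prediction and return the per-class detection counts."""
    # Imported lazily so summarize_detections works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    result = model.predict(source=image_path, device=device, verbose=False)[0]
    return summarize_detections(result.boxes.cls.tolist(), result.names)
```

An empty result from `predict_and_summarize("BCUF_Ms._L_2057_330-bw.jpg", "runs/detect/train6/weights/best.pt")` would match the `(no detections)` line above, and running it once with `device="cpu"` and once with `device="cuda:0"` would show whether the stall depends on the device.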
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yolo predict source=3_page-0016.jpg model=runs/detect/train7/weights/best.pt
Ultralytics YOLOv8.0.209 🚀 Python-3.10.12 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce RTX 3060, 12288MiB)
Model summary (fused): 168 layers, 3006038 parameters, 0 gradients, 8.1 GFLOPs
image 1/1 /home/incognito/YALTAi/3_page-0016.jpg: 960x704 1 textzone, 9 textlines, 70.4ms
Speed: 4.1ms preprocess, 70.4ms inference, 101.3ms postprocess per image at shape (1, 3, 960, 704)
Results saved to runs/detect/predict4
💡 Learn more at https://docs.ultralytics.com/modes/predict
(yaltai-2.0.2-py3.11-cpu) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yaltai kraken --alto -I ./3_page-0016.jpg -o ".xml" segment -y runs/detect/train7/weights/best.pt
Loading ANN /home/incognito/YALTAi/yaltai-2.0.2-py3.11-cpu/lib/python3.11/site-packages/kraken/blla.mlmodel Segmenting ./3_page-0016.jpg
image 1/1 /home/incognito/YALTAi/3_page-0016.jpg: 960x704 1 textzone, 9 textlines, 166.4ms
Speed: 3.8ms preprocess, 166.4ms inference, 0.7ms postprocess per image at shape (1, 3, 960, 704)
[10/02/24 12:02:29] WARNING Polygonizer failed on line 0: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s) segmentation.py:783
✓
Wait so it's not hanging anymore ?
The YOLO model is ok, detects exactly what I need.
On CPU it is not hanging.
The added lines without class names could be from blla.mlmodel? Is that used?
The point of YALTAi is to run YOLO Region with BLLA from Kraken, so, yes :)
Oh... so it adds segmentation from blla also, that's not a problem with PAGE/ALTO, you can select what you need.
The issue we closed is not resolved, at least for me:
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yaltai convert alto-to-yolo teyman_alto/*.xml teyman_alto_test --shuffle .1
Using list of inputs.
Found 70 to convert.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/bin/yaltai:8 in <module> │
│ │
│ 5 from yaltai.cli.yaltai import yaltai_cli │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(yaltai_cli()) │
│ 9 │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/click/core.py:1157 in │
│ __call__ │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/click/core.py:1078 in │
│ main │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/click/core.py:1688 in │
│ invoke │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/click/core.py:1688 in │
│ invoke │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/click/core.py:1434 in │
│ invoke │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/click/core.py:783 in │
│ invoke │
│ │
│ /home/incognito/YALTAi/yaltai-2.0.2-py3.10/lib/python3.10/site-packages/yaltai/cli/yaltai.py:98 │
│ in alto_to_yolo │
│ │
│ 95 │ if val: │
│ 96 │ │ message(f"{len(val)} image for validation.", fg='green') │
│ 97 │ elif shuffle: │
│ ❱ 98 │ │ random.shuffle(input_paths) │
│ 99 │ │ val_idx = int(len(input_paths) * shuffle) │
│ 100 │ │ message(f"{val_idx+1}/{len(input_paths)} image for validation.", fg='green') │
│ 101 │
│ │
│ /usr/lib/python3.10/random.py:394 in shuffle │
│ │
│ 391 │ │ │ for i in reversed(range(1, len(x))): │
│ 392 │ │ │ │ # pick an element in x[:i+1] with which to exchange x[i] │
│ 393 │ │ │ │ j = randbelow(i + 1) │
│ ❱ 394 │ │ │ │ x[i], x[j] = x[j], x[i] │
│ 395 │ │ else: │
│ 396 │ │ │ _warn('The *random* parameter to shuffle() has been deprecated\n' │
│ 397 │ │ │ │ 'since Python 3.9 and will be removed in a subsequent ' │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: 'tuple' object does not support item assignment
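The traceback shows `random.shuffle` being handed a tuple, which cannot be shuffled in place. Copying the inputs into a list first avoids the error. The sketch below is a hypothetical stand-alone helper (`split_train_val` is not YALTAi's function), and it assumes the CLI collects its input paths as a tuple, as the error message suggests.

```python
import random


def split_train_val(input_paths, val_fraction, seed=None):
    """Shuffle the paths and split them into (train, val) lists.

    input_paths may be any sequence, including a tuple.
    """
    paths = list(input_paths)  # tuples are immutable, shuffle needs a list
    rng = random.Random(seed)  # local generator, optional seed for repeatability
    rng.shuffle(paths)
    val_count = int(len(paths) * val_fraction)
    return paths[val_count:], paths[:val_count]
```

With the 70 inputs and `--shuffle .1` from the command above, `split_train_val(paths, 0.1)` would put 7 files in the validation list and 63 in the training list.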
Could you test with the same image ? AKA BCUF_Ms._L_2057_330-bw.jpg
and send me this specific one ?
Because to me
yolo predict source=BCUF_Ms._L_2057_330-bw.jpg model=runs/detect/train6/weights/best.pt
Looks like you had no prediction whatsoever on this specific image
image 1/1 /home/incognito/YALTAi/BCUF_Ms._L_2057_330-bw.jpg: 960x800 (no detections), 74.0ms
And this could be somehow causing issues.
Other point, if you only need YOLO preds, I would use yolo predict and yaltai convert yolo-to-alto
Ok, wait
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yolo predict source=BCUF_Ms._L_2057_330-bw.jpg model=runs/detect/train3/weights/best.pt
Ultralytics YOLOv8.0.209 🚀 Python-3.10.12 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce RTX 3060, 12288MiB)
Model summary (fused): 218 layers, 25840918 parameters, 0 gradients, 78.7 GFLOPs
image 1/1 /home/incognito/YALTAi/BCUF_Ms._L_2057_330-bw.jpg: 960x800 1 textregion, 23 textlines, 81.9ms
Speed: 5.1ms preprocess, 81.9ms inference, 117.0ms postprocess per image at shape (1, 3, 960, 800)
Results saved to runs/detect/predict5
💡 Learn more at https://docs.ultralytics.com/modes/predict
Is this the same model you sent me? Because up there, you used different models (model=runs/detect/train7/weights/best.pt and model=runs/detect/train6/weights/best.pt). If you change the params at every test, I am not going to be able to pinpoint your issue.
Can't recall, let me send you this one too; I might have sent you one for another language. But nonetheless, it hangs no matter what model.
I don't have access to a CUDA machine and won't for a few weeks, I can't help you more.
The last thing I would suggest is trying to run
yaltai kraken --verbose --device cuda:0 --raise-on-error ...
but that's all I can say
No problem, I can live with the CPU until then :) You helped a lot! The only problem is the other issue.
A side question: with the YOLO model you get 4-point boxes. Isn't there a way to get multi-point polygons, like with kraken? I normally train my YOLO models with multi-point labels like this (label example):
0 0.5693548387096774 0.12229190421892816 0.5693548387096774 0.34207525655644244 0.8157258064516129 0.34207525655644244 0.8157258064516129 0.12229190421892816
1 0.5754032258064516 0.13397947548460662 0.5741935483870968 0.14310148232611175 0.8100806451612903 0.14310148232611175 0.8125 0.13483466362599772 0.8112903225806452 0.12485746864310149 0.5754032258064516 0.12314709236031927 0.5754032258064516 0.13397947548460662
1 0.5754032258064516 0.1507981755986317 0.5754032258064516 0.16077537058152794 0.8125 0.16077537058152794 0.8125 0.1507981755986317 0.8125 0.14139110604332952 0.5754032258064516 0.13996579247434435 0.5754032258064516 0.1507981755986317
1 0.5754032258064516 0.16562143671607754 0.5754032258064516 0.17559863169897377 0.8125 0.17474344355758267 0.8125 0.16562143671607754 0.8125 0.1582098061573546 0.5754032258064516 0.1564994298745724 0.5754032258064516 0.16562143671607754
1 0.5754032258064516 0.18244013683010263 0.5754032258064516 0.19241733181299886 0.8125 0.19241733181299886 0.8125 0.18244013683010263 0.8125 0.17559863169897377 0.5754032258064516 0.17559863169897377 0.5754032258064516 0.18244013683010263
1 0.5741935483870968 0.19811858608893956 0.572983870967742 0.2063854047890536 0.8088709677419355 0.2080957810718358 0.8125 0.1998289623717218 0.8112903225806452 0.18985176738882553 0.5741935483870968 0.18814139110604333 0.5741935483870968 0.19811858608893956
1 0.5754032258064516 0.21493728620296465 0.5741935483870968 0.22405929304446978 0.8100806451612903 0.22405929304446978 0.8125 0.21579247434435575 0.8112903225806452 0.2072405929304447 0.5754032258064516 0.20410490307867732 0.5754032258064516 0.21493728620296465
1 0.5766129032258065 0.2314709236031927 0.5754032258064516 0.23973774230330672 0.8100806451612903 0.23973774230330672 0.8125 0.2323261117445838 0.8112903225806452 0.22234891676168758 0.5766129032258065 0.22063854047890535 0.5766129032258065 0.2314709236031927
1 0.5754032258064516 0.24743443557582667 0.5754032258064516 0.2574116305587229 0.8125 0.2574116305587229 0.8125 0.24743443557582667 0.8125 0.23888255416191562 0.5754032258064516 0.23745724059293044 0.5754032258064516 0.24743443557582667
1 0.5754032258064516 0.2631128848346636 0.5741935483870968 0.27223489167616877 0.8100806451612903 0.27223489167616877 0.8125 0.26396807297605474 0.8112903225806452 0.2557012542759407 0.5754032258064516 0.25057012542759405 0.5754032258064516 0.2631128848346636
1 0.5741935483870968 0.2799315849486887 0.5741935483870968 0.2899087799315849 0.8125 0.2899087799315849 0.8125 0.2799315849486887 0.8125 0.2708095781071836 0.5741935483870968 0.2708095781071836 0.5741935483870968 0.2799315849486887
1 0.5741935483870968 0.29646522234891676 0.572983870967742 0.3055872291904219 0.8088709677419355 0.3055872291904219 0.8112903225806452 0.2973204104903079 0.8100806451612903 0.28734321550741165 0.5741935483870968 0.2864880273660205 0.5741935483870968 0.29646522234891676
1 0.5741935483870968 0.31328392246294184 0.5741935483870968 0.3232611174458381 0.8112903225806452 0.3232611174458381 0.8112903225806452 0.31328392246294184 0.8112903225806452 0.3047320410490308 0.5741935483870968 0.3047320410490308 0.5741935483870968 0.31328392246294184
1 0.572983870967742 0.3289623717217788 0.5717741935483871 0.33808437856328394 0.8088709677419355 0.33808437856328394 0.8112903225806452 0.3298175598631699 0.8100806451612903 0.32069555302166475 0.572983870967742 0.3198403648802737 0.572983870967742 0.3289623717217788
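The label lines above look like a class index followed by normalized `x y` pairs, so each line can be read as one polygon. The sketch below parses such a line and derives the enclosing box, which is what a 4-point result would keep of it. The function names are hypothetical, and the layout is an assumption read off the example rather than a confirmed specification.

```python
def parse_label_line(line):
    """Split 'class x1 y1 x2 y2 ...' into (class_id, [(x, y), ...])."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(value) for value in parts[1:]]
    if len(coords) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return class_id, list(zip(coords[0::2], coords[1::2]))


def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) enclosing the polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def to_pixels(points, width, height):
    """Scale normalized points to pixel coordinates for a given image size."""
    return [(x * width, y * height) for x, y in points]
```

Reducing a polygon to `bounding_box(points)` loses its outline, which is the difference the question is about: a detection-style result only carries the box, while the training labels carry the full point list.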
(yaltai-2.0.2-py3.10) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yaltai kraken --verbose --device cuda:0 --raise-on-error --alto -I 3_page-0021.jpg -o ".xml" segment -y runs/detect/train7/weights/best.pt
It hangs here; it doesn't even say it loads the blla.mlmodel.
When training, should I resize the images (and convert the polygons to the new size) to x960, or does the trainer take care of that no matter what size the images are?
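On the polygon part of this question: if the labels are stored as normalized coordinates, like the example labels earlier in the thread, a plain resize of the image should leave them unchanged, because both the points and the image size scale by the same factor. The sketch below illustrates that with hypothetical helper names and made-up image sizes; whether the trainer resizes the images on its own is not confirmed in this thread.

```python
def normalize(points_px, width, height):
    """Convert pixel points to coordinates in the 0..1 range."""
    return [(x / width, y / height) for x, y in points_px]


def resize_points(points_px, old_size, new_size):
    """Scale pixel points from old_size (w, h) to new_size (w, h)."""
    scale_x = new_size[0] / old_size[0]
    scale_y = new_size[1] / old_size[1]
    return [(x * scale_x, y * scale_y) for x, y in points_px]


def same_labels(a, b, tolerance=1e-9):
    """True when two point lists match within a small float tolerance."""
    return len(a) == len(b) and all(
        abs(ax - bx) <= tolerance and abs(ay - by) <= tolerance
        for (ax, ay), (bx, by) in zip(a, b)
    )
```

Normalizing a polygon against the original size and normalizing its resized copy against the new size should give the same numbers, so only labels kept in absolute pixels would need converting.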
For yaltai convert yolo-to-alto, what is the labelmap FILE? How can I get it from segmentation?
ping
YALTAi 2.0.1 hangs forever when segmenting
v1.0.2 works: