Create repository kraken-ocr
@mittagessen, if you want, you can ask GitHub to transform a forked repo into an independent repo. I think the issues and PRs will be kept.
@mittagessen Thanks to @amitdo for his help
ketos linegen: set --disable-degradation as default.
kraken-scripts: include eval.py.
Hi! I'm very new to this and trying to learn. I need help training with CLSTM. I don't know how to run the code or train a model. When I try "ketos train", it tells me it is unable to start from scratch.
I have done the ketos transcription and ketos extract steps, so only training is left. Is there a way to train without using Vagrant? Can anyone provide a guide to training with Kraken CLSTM or ketos?
Sorry for the delayed answer, I've been on vacation for a few weeks. The clstmocrtrain binary and train.sh script can also be found here and here. You will have to change the last line in train.sh to point to the clstmocrtrain binary location.
Mind, the pytorch branch is nearing completion, so training with ketos train will work in a while (in fact it does already, but it is largely untested and there are probably lots of bugs).
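One way to handle the last line of train.sh is to resolve the binary through a variable instead of editing the path by hand. The following is a minimal sketch, assuming a hypothetical CLSTM_BIN environment variable and a hypothetical resolve_clstm_bin helper; neither exists in the original script.

```shell
# Hypothetical helper: print the path of the clstmocrtrain binary, or fail.
# CLSTM_BIN is an assumed variable name; it falls back to a PATH lookup.
resolve_clstm_bin() {
    clstm_candidate="${CLSTM_BIN:-clstmocrtrain}"
    if command -v "$clstm_candidate" >/dev/null 2>&1; then
        # command -v prints the resolved path of an executable
        command -v "$clstm_candidate"
    else
        echo "clstmocrtrain not found: $clstm_candidate" >&2
        return 1
    fi
}
```

With this in place, the last line of train.sh could read "$(resolve_clstm_bin)" train.txt test.txt, and the location would be set once via CLSTM_BIN=/path/to/clstmocrtrain.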
@mittagessen Thanks for the reply, and sorry for the trouble. May I know what this problem is? I googled it and could not find a fix.
The script splits off 100 lines as a test set, and you've got fewer than 101 lines of training data. Replace the 100 in both commands below with a number lower than your line count:
sed 100q manifest.txt > test.txt
sed 1,100d manifest.txt > train.txt
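The same split can be written with the test-set size as a parameter, so it fails early when the manifest is too short. This is a minimal sketch; split_manifest is a hypothetical helper name, not part of kraken or clstm, and it writes test.txt and train.txt into the current directory.

```shell
# Hypothetical helper: split a manifest into test.txt and train.txt.
# Usage: split_manifest <manifest> <number of test lines>
split_manifest() {
    manifest_file="$1"
    n_test="$2"
    # Count lines; tr strips the padding some wc implementations add
    total_lines=$(wc -l < "$manifest_file" | tr -d ' ')
    if [ "$total_lines" -le "$n_test" ]; then
        echo "need more than $n_test lines, got $total_lines" >&2
        return 1
    fi
    # The first n_test lines become the test set
    sed "${n_test}q" "$manifest_file" > test.txt
    # Everything after them becomes the training set
    sed "1,${n_test}d" "$manifest_file" > train.txt
}
```

For a manifest with 60 lines, split_manifest manifest.txt 10 would leave 10 lines in test.txt and 50 in train.txt.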
@mittagessen Thank you !!
@mittagessen tmbdev just released a new repository for "Ocropy 2.0" at https://github.com/tmbdev/ocropy2. He stated earlier that it would include improvements to layout and text line analysis, GPU integration, and various other improvements. What do you think?
The layout analysis is certainly interesting, although it doesn't really change issues with complex semantic layout (newspaper, manuscripts, ...), subpar performance especially on Arabic, and reading order. His implementation replaces the signal processing line seed generation in the segmenter by a pixel classification network; spreading those seeds based on boxmaps and distance as in the original segmenter. Nevertheless it is a rather simple method to solve the line separation issue I've encountered using both object detection and pixel classification (of whole lines) networks.
The new ML backend using pytorch is somewhat similar to the kraken pytorch branch that's going to be the 1.0 release. The layers are presumably somewhat different, as I oriented myself on VGSL, but there shouldn't be major non-quality-of-life (serialization, backward compatibility, ...) differences.
PS: There is also a similar trainable layout analysis at https://github.com/dhlab-epfl/dhSegment
@mittagessen what are your thoughts on:
By the way, I think tmbdev has moved on to ocropy3.
I haven't used calamari, but I know the Wuerzburg people and it should do what it says on the tin. Although I have a personal aversion to ensemble methods, the one they've implemented doesn't "feel" as arbitrary as many others.
Seam carving works for certain kinds of texts the current segmenter fails on and can be combined with something like https://github.com/mittagessen/seg to handle even fairly convoluted layouts with marginalia, decoration and interlinear notes. Unfortunately, it fails at Arabic script with vocalization and another system extracting columns, ordering lines, etc. is still needed.
@mittagessen There is a new paper released by NVIDIA called Noise2Noise. It shows a new method to clean/denoise images without needing clean ground-truth images; they train using only noisy images.
Have a look:
https://www.youtube.com/watch?v=P0fMwA3X5KI
https://arxiv.org/pdf/1803.04189.pdf
https://news.developer.nvidia.com/ai-can-now-fix-your-grainy-photos-by-only-looking-at-grainy-photos/
@mittagessen please close this topic, I opened it a long time ago.
Peace be upon you, here are some suggestions for you @mittagessen
Create repository kraken-ocr
For kraken-ocr, include:
- kraken: Kraken Open Source OCR Engine (main repository)
- kraken-models: Kraken recognition models for various languages (Beta)
- kraken-scripts: Scripts to automate various aspects of Kraken
- kraken-clstm: A small C++ implementation of LSTM networks, focused on OCR
- kraken-research: Research and documents on Kraken
For kraken, create a wiki that includes:
- Kraken Ocr Part 1: Building CLSTM https://youtu.be/ST_XrfcCpKE
- Kraken Ocr Part 3: Creating and transcribing the HTML file https://youtu.be/No87TADb9zQ
- Kraken Ocr Part 4: Training a new CLSTM model https://youtu.be/Ec9Qi7S8cvA
Also mention that it uses a modified version of the clstm separate-derivs.
For kraken, add the tags:
- kraken
- kraken-ocr
- ocr-engine
- machine-learning
For kraken-scripts, include:
- Training
For Training, include:
- pretrain.sh
    #!/bin/bash
    set -x
    set -a
    # shuffle the manifest (GNU sort -R)
    sort -R manifest.txt > /tmp/manifest2.txt
    # everything after the first 100 lines is training data
    sed 1,100d /tmp/manifest2.txt > train.txt
    # the first 100 lines form the test set
    sed 100q /tmp/manifest2.txt > test.txt
- train.sh
    #!/bin/bash
    set -x
    # export every variable set below so clstmocrtrain can read it
    set -a
    report_every=1000
    save_every=1000
    maxtrain=50000
    target_height=48
    dewarp=center
    display_every=1000
    test_every=1000
    nhidden=100
    lrate=1e-4
    save_name=arabic
    clstmocrtrain train.txt test.txt
For kraken-clstm, fork the clstm separate-derivs and modify clstm.h & extras.h by changing isnan to std::isnan.
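The isnan change can be scripted instead of done by hand. This is a minimal sketch, not the patch actually applied to the fork: qualify_isnan is a hypothetical helper that reads a file on stdin and prefixes unqualified isnan( calls with std::, leaving existing std::isnan( calls and longer identifiers untouched. It is a plain text substitution, so it would also rewrite matches inside comments or strings, and it may miss a call nested directly inside another isnan( call.

```shell
# Hypothetical helper: qualify bare isnan( calls with std::.
# A match needs start-of-line or a character that is neither ':' nor part
# of an identifier right before "isnan(", so std::isnan( is not prefixed twice.
qualify_isnan() {
    sed -E 's/(^|[^:A-Za-z0-9_])isnan\(/\1std::isnan(/g'
}
```

A cautious way to use it is to write to a new file, for example qualify_isnan < clstm.h > clstm.h.new, and review the diff before replacing the original; the same applies to extras.h.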
For kraken-research, include the PDF of Important New Developments in Arabographic Optical Character Recognition. Future research and recognition tests might also be posted there.