ghost closed this issue 6 years ago
Hey @christophered, I'm refactoring the code in order to:
Now, the code does not extract lines. It only extracts characters, but with a small modification, it could guess the lines where characters are.
So, it's in English, and I'm writing a tutorial to find lines in images.
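To illustrate the "small modification" idea of going from characters to lines, here is a minimal sketch. It is not the repository's code: it assumes the character extraction step yields bounding boxes as `(x, y, width, height)` tuples, and the function name `group_boxes_into_lines` and the `min_overlap` threshold are hypothetical. Boxes whose vertical extent overlaps an existing line enough are attached to it; otherwise a new line is started.

```python
from typing import List, Tuple

# Assumed box format (not taken from the repository): (x, y, width, height)
Box = Tuple[int, int, int, int]


def group_boxes_into_lines(boxes: List[Box], min_overlap: float = 0.5) -> List[List[Box]]:
    """Group character boxes into text lines by vertical overlap."""
    lines = []  # each entry is [top, bottom, list_of_boxes]
    for box in sorted(boxes, key=lambda b: b[1]):  # process top to bottom
        x, y, w, h = box
        top, bottom = y, y + h
        placed = False
        for line in lines:
            # Vertical overlap between this box and the line's current extent
            overlap = min(bottom, line[1]) - max(top, line[0])
            if h > 0 and overlap / h >= min_overlap:
                line[2].append(box)
                line[0] = min(line[0], top)
                line[1] = max(line[1], bottom)
                placed = True
                break
        if not placed:
            lines.append([top, bottom, [box]])
    lines.sort(key=lambda entry: entry[0])
    # Within each line, order characters left to right
    return [sorted(entry[2], key=lambda b: b[0]) for entry in lines]


if __name__ == "__main__":
    chars = [(0, 0, 10, 10), (12, 1, 10, 10), (0, 30, 10, 10), (15, 29, 10, 12)]
    for number, line in enumerate(group_boxes_into_lines(chars), start=1):
        print(number, line)
```

This kind of grouping only works when lines are roughly horizontal and well separated; skewed or touching lines would need something more robust.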
@clemsciences thanks
What are your needs, actually?
I'm just testing various text line extraction methods & tools that are available, to see which is most suitable for extracting text lines from the scanned data I have. Later on, the extracted lines will be used to train a new OCR model, either in Tesseract, ocropy or kraken.
Currently, Ocropus3 has caught my interest; it uses deep learning to conduct page layout analysis & segmentation.
So, I think I found what I needed already.
Thank you
Ok! Well this repository is made for fun, so you won't get state-of-the-art algorithms.
At least your interest in this domain has made me work on this, and that's a good thing.
@clemsciences Thank you for your hard work!
I have tested the seam carving, and it seems its strength is in handwriting line segmentation, while it has some weaknesses on printed documents.
Thanks again.
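For readers unfamiliar with seam carving as a line segmentation tool, the core step is finding a left-to-right path of minimum energy, which tends to run through the blank space between two text lines. The sketch below is the generic dynamic-programming version, not necessarily how this repository implements it. The function name `min_horizontal_seam` is hypothetical, and the energy map is assumed to be a plain 2D list where higher values mean more ink.

```python
from typing import List


def min_horizontal_seam(energy: List[List[float]]) -> List[int]:
    """Return, for each column, the row index of a minimum-energy horizontal seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of a path ending at (r, c)
    cost = [[float(energy[r][0])] + [0.0] * (cols - 1) for r in range(rows)]
    back = [[0] * cols for _ in range(rows)]  # row chosen in the previous column
    for c in range(1, cols):
        for r in range(rows):
            # A seam may move at most one row up or down per column
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Start from the cheapest end point and walk back to the first column
    seam = [0] * cols
    seam[-1] = min(range(rows), key=lambda r: cost[r][-1])
    for c in range(cols - 1, 0, -1):
        seam[c - 1] = back[seam[c]][c]
    return seam


if __name__ == "__main__":
    # Toy energy map: 1 = ink, 0 = blank; row 2 is the gap between two lines
    toy = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]]
    print(min_horizontal_seam(toy))
```

Because the seam can bend by one row per column, it can follow the wavy gaps typical of handwriting, which may be part of why the approach does well there.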
@clemsciences How can your code be used? Also, can it extract lines?