Open manisandro opened 4 years ago
Just curious: what's the advantage of having the other issues closed and collecting them in one?
Discussions in the other issues at times diverged from the original title. This issue just summarizes the open items at a glance, so nobody has to re-read all the issues to pick out the various things.
"Better side-by-side proofreading"
I am very interested in what this means. I need a tool for quick corrections, and I need a way to see the original picture and the OCRed text side by side, with a marker in the picture showing where the cursor is in the editable text. Otherwise it is very difficult to correct the generated text.
@zohozer The idea is to have an extra area below the canvas that displays the current line plus the one before and the one after. You can edit lines word-wise (and skip to the next/previous word with Tab/Shift+Tab), and the corresponding word is highlighted in the canvas.
I think that the fastest way to make a correction is to use the left-right arrow keys to navigate to every single letter and to stop and make corrections where needed. And yes, this is very good, to have the "corresponding word is highlighted in the canvas".
If you can have a look at ABBYY FineReader for Windows, you will find the most intuitive OCR workflow I have ever seen. I want to switch to Linux and I can't find a replacement for FineReader. For this reason I am following this project; hopefully it will get a little more polished and usable. Thank you for your hard work.
For the first iteration, I'll keep the workflow word-wise, following the hOCR document structure. Having it character-wise would mean automatically adjusting word boxes or creating/dropping word items depending on mutations to word boundaries as you type, which requires some more thought on how to robustly achieve.
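To make the word-wise structure concrete, here is a minimal sketch of pulling words out of hOCR markup, assuming the usual `ocr_line` / `ocrx_word` classes with a `bbox x0 y0 x1 y1` entry in the `title` attribute. The names `HocrWordParser`, `parse_words` and the sample markup are made up for illustration and are not gImageReader code.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (line_index, text, bbox) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._line = -1        # index of the ocr_line currently open
        self._in_word = False  # True while inside an ocrx_word span
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "ocr_line" in classes:
            self._line += 1
        elif "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            self._bbox = tuple(int(v) for v in match.groups()) if match else None
            self._text = []
            self._in_word = True

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes a word span contains no nested <span> elements.
        if tag == "span" and self._in_word:
            self.words.append((self._line, "".join(self._text).strip(), self._bbox))
            self._in_word = False


def parse_words(hocr):
    """Return the words of an hOCR fragment in document order."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


SAMPLE = """
<span class='ocr_line' title='bbox 10 10 200 30'>
 <span class='ocrx_word' title='bbox 10 10 60 30; x_wconf 96'>Hello</span>
 <span class='ocrx_word' title='bbox 70 10 130 30; x_wconf 91'>world</span>
</span>
<span class='ocr_line' title='bbox 10 40 200 60'>
 <span class='ocrx_word' title='bbox 10 40 80 60; x_wconf 88'>again</span>
</span>
"""
```

Each word carries its own box, which is why editing across word boundaries character by character would mean splitting or merging boxes as you type.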
Instead of a panel, the preview should be displayed in the same space that shows the hOCR tree currently. That will allow 50-50 splitting of the screen: The left half shows the original, and the right half shows the OCRed output. (The right-side panel can have two tabs: hOCR tree and Preview).
Whatever operation we do in one half should reflect in the other. For example, both panels should have the same zoom and same panning. If we click on one word in any panel, the corresponding word should be highlighted in the other. If we select a block (region) in one half, it should also be selected in the other.
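A rough sketch of what I mean, assuming both halves observe one shared zoom/pan state and that a click is mapped to a word through its bounding box. `SharedView` and `word_at` are hypothetical names for illustration only, not existing gImageReader classes.

```python
class SharedView:
    """Zoom/pan state shared by both panels; listeners run on every change."""

    def __init__(self):
        self.zoom = 1.0
        self.pan = (0, 0)
        self._listeners = []

    def subscribe(self, callback):
        # Each panel registers a callback taking (zoom, pan).
        self._listeners.append(callback)

    def update(self, zoom=None, pan=None):
        if zoom is not None:
            self.zoom = zoom
        if pan is not None:
            self.pan = pan
        for callback in self._listeners:
            callback(self.zoom, self.pan)


def word_at(words, x, y):
    """Return the index of the first word whose bbox contains (x, y), else None.

    `words` is a list of (text, (x0, y0, x1, y1)) tuples in image coordinates.
    """
    for index, (_text, (x0, y0, x1, y1)) in enumerate(words):
        if x0 <= x <= x1 and y0 <= y <= y1:
            return index
    return None
```

Whichever panel receives the click would call `word_at` and ask the other panel to highlight the same index, and any zoom or pan change would go through `SharedView.update` so both halves stay in step.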
Sincere apologies if this is already discussed.
I've landed a new widget for convenient proofreading (currently Qt interface only). Just click on a line or a word in the output tree to activate it.
Oh, yeah! Thank you. This is a much better way to correct the OCRed text. I need a build to test how it works.
I just downloaded the latest build and I like this new widget a lot. However, may I suggest some further improvements to the current workflow:
- The ability to move the cursor continuously between fields using only the Left/Right cursor keys instead of the Tab key, similar to how the Up/Down arrow keys move the cursor to the line above/below. When the cursor reaches the end of a word and I keep the Right cursor key pressed, it should jump automatically to the beginning of the next word, without my having to press Tab.
- The possibility to select multiple words at once with the mouse. If I press the left mouse button and try to select multiple words, it does not work right now. It would be useful to select multiple words and apply formatting options to all of them in one go, such as Italic or Bold.
- Up/Down arrow keys that go to the beginning of the new word, instead of selecting the entire word as they do right now.
- An INSERT shortcut key. I do not have a keyboard with an INSERT key, and I need an Insert shortcut. In SublimeText I can use CMD+Alt+o (I am using macOS) to switch the cursor between normal mode and text Insert mode.
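For the first point, the cursor behaviour I have in mind can be sketched as below, assuming the line is held as a plain list of word strings and the cursor is a (word index, character offset) pair. `move_cursor` is a hypothetical helper for illustration, not part of the current widget.

```python
def move_cursor(words, word_index, offset, step):
    """Move the cursor one character left (step=-1) or right (step=+1).

    Returns the new (word_index, offset). Moving right past the end of a
    word lands at the start of the next word; moving left past the start
    lands at the end of the previous word. At the ends of the line the
    cursor stays where it is.
    """
    if step not in (-1, 1):
        raise ValueError("step must be -1 or +1")
    new_offset = offset + step
    if 0 <= new_offset <= len(words[word_index]):
        return word_index, new_offset
    if step == 1 and word_index + 1 < len(words):
        return word_index + 1, 0
    if step == -1 and word_index > 0:
        return word_index - 1, len(words[word_index - 1])
    return word_index, offset
```

With the words `["Hello", "world"]`, pressing Right at the end of "Hello" would put the cursor in front of the "w", with no Tab needed.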
This is a nice feature!
I took a screenshot of the email from @zohozer and tried the latest feature on it.
Bugs:
I am using the latest version.
Some of the text is not detected properly.
The blue highlighted area is not detected. If I click within it, there is no response.
If I click outside it, gImageReader shows me the detected text as labels.
@zohozer
- An ability to continuously move the cursor between different fields using only the Left/Right cursor keys instead of using the TAB key to move between fields. [...]
This could be done, I'll need to check with the client.
- Possibility to select multiple words at once using the mouse cursor. [...]
This is a bit harder.
- Up/Down arrow keys to go to the beginning of the new word, instead of selecting the entire word how it is right now.
This could be done, I'll need to check with the client.
- An INSERT shortcut key. I do not have a keyboard with an INSERT key, and I need an Insert shortcut.
Can't you define such a key mapping at OS level? Would be a better place to do it rather than implement a shortcut at application level.
@raindropsfromsky
Can't you define such a key mapping at OS level? Would be a better place to do it rather than implement a shortcut at application level.
Unfortunately not. The only software I have found on Mac that supports insert (overtype) mode is SublimeText, using the CMD+Alt+o shortcut. As my keyboards lack the INSERT key, there is no workaround for this limitation. I spent a lot of time searching for other options and none of them work.
I am attaching the image that can be used to reproduce the problem. But I think this problem has nothing to do with the image itself...
All the lines from top till middle produce tiny detected text.
Here, I selected a word from the second line, which produced tiny text:
But the very next line produces a normal font of detected text.
Here's the hOCR file and image to be OCRed: Trial.zip
It's because the font size detection logic fails miserably with the I, which is not terribly surprising... I'll need to improve that.
@raindropsfromsky https://github.com/manisandro/gImageReader/commit/dce0871d0de6202246e323b6e95199c27db7849e should address this
If the comparacope feature is implemented, all this hard work is not necessary, as the superimposed text can be directly compared with the original. Much faster and more accurate!
comparacope
What's that?
If the comparacope feature is implemented, all this hard work is not necessary, as the superimposed text can be directly compared with the original. Much faster and more accurate!
That may be the workflow you feel is best, other people have other requirements.
comparacope
What's that?
Issue #449
If the comparacope feature is implemented, all this hard work is not necessary, as the superimposed text can be directly compared with the original. Much faster and more accurate!
That may be the workflow you feel is best, other people have other requirements.
Very true.
BTW there is the problem I described here also...
BTW there is the problem I described here also...
See above:
What does the hOCR tree structure show? In any case, please share the image + hOCR file here as well.
@manisandro Nowadays multiple releases are coming up almost on a daily basis. That's great!
I'd like to support this by testing each release rapidly and reporting issues (if any). But there is a problem: I cannot guess what's the change in each release (comparing code is tedious).
Can you please include a release note with each release (a simple text file in the "Assets")?
Thanks in advance!
@raindropsfromsky Those are automated builds, not stable releases. You should be able to pick up most of the news from the git commit log; the messages are usually indicative enough.