dlareklami / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

[Feature request] Output orientation when run with -psd #955

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run tesseract with the -psd option

What is the expected output? What do you see instead?
Expected: some indication of what orientation is detected.
Actual: nothing is output to the console.

What version of the product are you using? On what operating system?
3.0.2, Linux x64

Please provide any additional information below.
Could you please implement a patch like the one below from [1]:
--- ccmain/osdetect.cpp (revision 626)
+++ ccmain/osdetect.cpp (working copy)
@@ -269,6 +269,7 @@

   // Make sure the best_result is up-to-date
   int orientation = o.get_orientation();
+  printf("orientation %d\n",  orientation);
   osr->update_best_script(orientation);
   return num_blobs_evaluated;
 } 

Thanks. This would make it easier to drive the tesseract executable from 
scripting languages. (I realize there's a C library as well, but using that 
can sometimes be overkill.)

[1] 
https://groups.google.com/forum/?fromgroups#!searchin/tesseract-ocr/%22page$20detection/tesseract-ocr/Ge4xPOqAoY8/Ilk5F9AUEXkJ

Original issue reported on code.google.com by reiniero...@gmail.com on 18 Jul 2013 at 3:14

GoogleCodeExporter commented 9 years ago
I am not sure what the "-psd option" is, but this is the wrong approach. The 
library should not print OCR information to stdout (only debug information).

If you need OSD information, you have to use the tesseract-ocr API [1] and 
modify/patch tesseractmain.cpp.

[1] 
https://code.google.com/p/tesseract-ocr/wiki/APIExample#Orientation_and_script_detection_(OSD)_example

Original comment by zde...@gmail.com on 28 Jul 2013 at 9:22

GoogleCodeExporter commented 9 years ago
@zdenko: thanks.
About -psd: you're right, it is -psm in current 3.02.02; probably psd in 
earlier versions.

Well, what about classifying OSD output as debug output and printing it to 
stderr/stdout? I understand about the API option, but that wouldn't be feasible 
for bash scripts or for other tools that can only run executables and can't 
call the C API.

Alternatively, and perhaps much more constructively, it might be better to 
include some C++ example programs in the source tree (and have them compiled 
when running make) that demonstrate the Tesseract API, e.g. something like the 
attached file. Note: I'm not a C++ programmer at all, and it doesn't even 
compile, but I hope you get the drift.

Original comment by reiniero...@gmail.com on 29 Jul 2013 at 8:22

Attachments:

GoogleCodeExporter commented 9 years ago
"Well, what about classifying OSD output as debug output and printing it to 
stderr/stdout?"
=> Why pretend it is debug output? Is your bash script able to use the 
tesseract-ocr library? I don't think so. I guess you use the tesseract-ocr 
executable. So the tesseract-ocr executable should be improved.

The best example code is the tesseract-ocr executable (api/tesseractmain.cpp). 
It does everything you asked for in an example ;-)

Your tessinfo.cxx can be compiled easily (I tried it on openSUSE 12.3):
    g++ -o tessinfo tessinfo.cxx -ltesseract

Original comment by zde...@gmail.com on 31 Jul 2013 at 9:03

GoogleCodeExporter commented 9 years ago
"Is your bash script able to use the tesseract-ocr library? I don't think so. I 
guess you use the tesseract-ocr executable. So the tesseract-ocr executable 
should be improved." Completely right! Perhaps I misunderstood what you were 
saying.

Thanks a lot for the info as well as the compile command (it fails here with 
leptonica lib errors, but I probably need to give it a go on a stable system).

Original comment by reiniero...@gmail.com on 31 Jul 2013 at 10:35

GoogleCodeExporter commented 9 years ago
OK. Changing the tesseract executable should not be a big issue. If I have 
time, I will have a look at it, unless somebody else sends a patch first ;-!

Original comment by zde...@gmail.com on 3 Aug 2013 at 1:49

GoogleCodeExporter commented 9 years ago
Hi!

I tried the patch by ogerman, quoted in the first post. It actually works: when 
running tesseract with -psm 0, the output is

Tesseract Open Source OCR Engine v3.02.03 with Leptonica
orientation 2
Error during processing.

So the orientation is shown, and that is all I need to know. Using -psm 1 works 
flawlessly, but I don't need the text it produces, since I want to use this in 
a script for xsane.
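
For that kind of script, here is a minimal sketch of how the value could be picked out of output like the above. It assumes a build carrying the patch from the first post, which prints a line of the form "orientation N". The function name extract_orientation is hypothetical and not part of tesseract.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract): read program output on stdin
# and print the number from the first "orientation N" line.
# Returns non-zero if no such line is found.
extract_orientation() {
    awk '
        $1 == "orientation" && $2 ~ /^[0-9]+$/ { print $2; found = 1; exit }
        END { exit !found }
    '
}

# Example using the output quoted above; a real script would pipe the
# output of the patched tesseract into the function instead.
printf '%s\n' \
    'Tesseract Open Source OCR Engine v3.02.03 with Leptonica' \
    'orientation 2' \
    'Error during processing.' | extract_orientation
```

Because the exit status is non-zero when no orientation line is present, a calling script can tell "no result" apart from an orientation of 0.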

What could cause the error using -psm 0?

Original comment by hanksch...@googlemail.com on 19 Sep 2013 at 6:27

GoogleCodeExporter commented 9 years ago
fixed in r982

Original comment by zde...@gmail.com on 12 Jan 2014 at 2:47

GoogleCodeExporter commented 9 years ago
Hi!

Sorry to reopen this, but I can't get any information using the -psm 0 option 
with these changes. Using it with the tesseract executable leads to no result 
whatsoever.

I tried this command:
tesseract image.png stdout --tessdata-dir /usr/share/tesseract-ocr/tessdata -psm 0 -l deu

With the "simple" patch from the first post (see above) I was able to elicit at 
least something like

orientation 2

Is there a way to include that function in the executable without using any API 
stuff?

Original comment by hanksch...@googlemail.com on 1 Feb 2014 at 10:50

GoogleCodeExporter commented 9 years ago
That "simple" patch is simply the wrong patch ;-)
-psm 0 is fixed in r1037.

BTW: -psm 2 also provides information. And I found out that:
    tesseract korean.png - -psm 0 -l kor
provides different information than
    tesseract korean.png - -psm 0

Original comment by zde...@gmail.com on 1 Feb 2014 at 1:00

GoogleCodeExporter commented 9 years ago
Hi!
With regards to the "simple" patch: Wrong or not, at least it worked ;-)

And you're right about -psm 2, even though I thought (from reading the help)

"2 = Automatic page segmentation, but no OSD, or OCR"

would not perform OSD... 

I'll try r1037 asap, thx!

Original comment by hanksch...@googlemail.com on 1 Feb 2014 at 2:54

GoogleCodeExporter commented 9 years ago
The problem was that using psm 0 (PSM_OSD_ONLY) for AnalyseLayout produces a 
NULL PageIterator, and this was the reason why there was no output...

But with -psm 2 (PSM_AUTO_ONLY) AnalyseLayout produces results... So the 
description "Automatic page segmentation, but no OSD" could be misleading: 
there is no information about script, but there is information about 
Orientation, WritingDirection, TextlineOrder and Deskew angle. Somebody could 
understand these as part of OSD.

Anyway, now you can use psm 0 and 2 to get extra information.

Original comment by zde...@gmail.com on 1 Feb 2014 at 3:22

GoogleCodeExporter commented 9 years ago
OK, checked r1037 out, and now I get information using -psm 0.

btw: it seems that the "orientation" information from -psm 2 differs from the 
one provided by -psm 0. With -psm 2, "WritingDirection" seems to show the 
direction of the characters, while orientation stays 0 regardless of the 
orientation in which I scan a page. 

Original comment by hanksch...@googlemail.com on 1 Feb 2014 at 6:14