costaricardo / tesseract-android-tools

Automatically exported from code.google.com/p/tesseract-android-tools

PSM constants are incorrect #38

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The PSM constants defined in ccstruct/publictypes.h are as follows:

enum PageSegMode {
  PSM_OSD_ONLY,       ///< Orientation and script detection only.
  PSM_AUTO_OSD,       ///< Automatic page segmentation with orientation and
                      ///< script detection. (OSD)
  PSM_AUTO_ONLY,      ///< Automatic page segmentation, but no OSD, or OCR.
  PSM_AUTO,           ///< Fully automatic page segmentation, but no OSD.
  PSM_SINGLE_COLUMN,  ///< Assume a single column of text of variable sizes.
  PSM_SINGLE_BLOCK_VERT_TEXT,  ///< Assume a single uniform block of vertically
                               ///< aligned text.
  PSM_SINGLE_BLOCK,   ///< Assume a single uniform block of text. (Default.)
  PSM_SINGLE_LINE,    ///< Treat the image as a single text line.
  PSM_SINGLE_WORD,    ///< Treat the image as a single word.
  PSM_CIRCLE_WORD,    ///< Treat the image as a single word in a circle.
  PSM_SINGLE_CHAR,    ///< Treat the image as a single character.

  PSM_COUNT           ///< Number of enum entries.
};

The ones in TessBaseAPI.java are as follows:

    /** Fully automatic page segmentation. */
    public static final int PSM_AUTO = 0;

    /** Assume a single column of text of variable sizes. */
    public static final int PSM_SINGLE_COLUMN = 1;

    /** Assume a single uniform block of text. (Default) */
    public static final int PSM_SINGLE_BLOCK = 2;

    /** Treat the image as a single text line. */
    public static final int PSM_SINGLE_LINE = 3;

    /** Treat the image as a single word. */
    public static final int PSM_SINGLE_WORD = 4;

    /** Treat the image as a single character. */
    public static final int PSM_SINGLE_CHAR = 5;

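The C++ enumerators are numbered implicitly from 0, so the constant PSM_AUTO (0)
in Java corresponds to PSM_OSD_ONLY in the Tesseract C++ API. To get automatic
segmentation from Java code, you currently have to pass PSM_SINGLE_COLUMN
(1, which maps to PSM_AUTO_OSD) or PSM_SINGLE_LINE (3, which maps to PSM_AUTO).
This needs to be fixed.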
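
A minimal sketch of what matching Java constants could look like, assuming the values simply follow the order of the enum quoted above (C++ numbers enumerators from 0 when no explicit value is given). The class name PageSegModes and the helper nativeName are hypothetical and only for illustration; they are not part of TessBaseAPI, and the actual fix may look different.

```java
/**
 * Hypothetical holder for page segmentation modes (not the real TessBaseAPI).
 * Values are derived from the enumerator order in ccstruct/publictypes.h
 * as quoted in this issue.
 */
final class PageSegModes {
    private PageSegModes() {}

    public static final int PSM_OSD_ONLY = 0;
    public static final int PSM_AUTO_OSD = 1;
    public static final int PSM_AUTO_ONLY = 2;
    public static final int PSM_AUTO = 3;
    public static final int PSM_SINGLE_COLUMN = 4;
    public static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5;
    public static final int PSM_SINGLE_BLOCK = 6;
    public static final int PSM_SINGLE_LINE = 7;
    public static final int PSM_SINGLE_WORD = 8;
    public static final int PSM_CIRCLE_WORD = 9;
    public static final int PSM_SINGLE_CHAR = 10;

    /** Native enumerator names, indexed by their implicit numeric value. */
    private static final String[] NATIVE_NAMES = {
        "PSM_OSD_ONLY", "PSM_AUTO_OSD", "PSM_AUTO_ONLY", "PSM_AUTO",
        "PSM_SINGLE_COLUMN", "PSM_SINGLE_BLOCK_VERT_TEXT", "PSM_SINGLE_BLOCK",
        "PSM_SINGLE_LINE", "PSM_SINGLE_WORD", "PSM_CIRCLE_WORD",
        "PSM_SINGLE_CHAR"
    };

    /**
     * Returns the name of the native mode that a raw int value selects.
     * Hypothetical helper, handy for checking what an old constant maps to.
     */
    public static String nativeName(int value) {
        if (value < 0 || value >= NATIVE_NAMES.length) {
            throw new IllegalArgumentException("Not a valid PageSegMode: " + value);
        }
        return NATIVE_NAMES[value];
    }
}
```

With this table, PageSegModes.nativeName(0) returns "PSM_OSD_ONLY" and PageSegModes.nativeName(3) returns "PSM_AUTO", which is the mismatch described above: the old Java PSM_AUTO (0) selects OSD only, while the old PSM_SINGLE_LINE (3) selects fully automatic segmentation.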

Original issue reported on code.google.com by loni...@gmail.com on 1 Jul 2012 at 7:30

GoogleCodeExporter commented 9 years ago
Updated PageSegMode constants.

Original comment by alanv@google.com on 11 Sep 2012 at 7:50