Status: Closed (closed by pjoshi90 2 years ago)
You can set all of these modes through code; you don't have to place a config file inside the tessdata folder.
First, create the engine with this constructor:
/// <summary>
/// Creates a new instance of <see cref="Engine" /> with the specified <paramref name="engineMode" /> and
/// <paramref name="configFiles" />.
/// </summary>
/// <remarks>
/// <para>
/// The <paramref name="dataPath" /> parameter should point to the directory that contains the 'tessdata' folder,
/// for example if your tesseract language data is installed in <c>C:\Tesseract\tessdata</c> the value of datapath
/// should be <c>C:\Tesseract</c>. Note that tesseract will use the value of the <c>TESSDATA_PREFIX</c> environment
/// variable if defined, effectively ignoring the value of <paramref name="dataPath" /> parameter.
/// </para>
/// </remarks>
/// <param name="dataPath">
/// The path to the parent directory that contains the 'tessdata' directory, ignored if the
/// <c>TESSDATA_PREFIX</c> environment variable is defined.
/// </param>
/// <param name="language">The <see cref="Language"/> to load</param>
/// <param name="engineMode">The <see cref="EngineMode" /> value to use when initializing the tesseract engine</param>
/// <param name="configFiles">
/// An optional sequence of tesseract configuration files to load, encoded using UTF8 without BOM
/// and with Unix end of line characters. You can use an advanced text editor such as Notepad++ to accomplish this.
/// </param>
/// <param name="initialOptions"></param>
/// <param name="setOnlyNonDebugVariables"></param>
/// <param name="logger">When set then logging is written to this <see cref="ILogger"/> interface</param>
public Engine(string dataPath, Language language, EngineMode engineMode = EngineMode.Default, IEnumerable<string> configFiles = null, IDictionary<string, object> initialOptions = null, bool setOnlyNonDebugVariables = false, ILogger logger = null)
{
    if (logger != null)
        Logger.LoggerInterface = logger;
    DefaultPageSegMode = PageSegMode.Auto;
    _handle = new HandleRef(this, TessApi.Native.BaseApiCreate());
    Initialize(dataPath, new List<Language> {language}, engineMode, configFiles, initialOptions, setOnlyNonDebugVariables, logger);
}
After that, you can set the page segmentation mode through the pageSegMode parameter of Process:
/// <summary>
/// Processes the specific image.
/// </summary>
/// <remarks>
/// You can only have one result iterator open at any one time.
/// </remarks>
/// <param name="image">The image to process.</param>
/// <param name="inputName">Sets the input file's name, only needed for training or loading a uzn file.</param>
/// <param name="pageSegMode">The page layout analysis method to use.</param>
public Page Process(Pix.Image image, string inputName, PageSegMode? pageSegMode = null)
{
    return Process(image, inputName, new Rect(0, 0, image.Width, image.Height), pageSegMode);
}
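Putting the two steps together, a minimal sketch might look like the following. It relies only on the constructor and Process signatures quoted above. Everything else is an assumption to check against your installed version: the `TesseractOCR.Enums` namespace, the `Language.English` and `PageSegMode.SingleColumn` member names (page segmentation mode 4 in Tesseract is the single-column mode), the `Pix.Image.LoadFromFile` helper, and the placeholder paths. Option values are passed as strings to mirror the config file.

```csharp
using System.Collections.Generic;
using TesseractOCR;
using TesseractOCR.Enums; // assumed namespace for Language, EngineMode and PageSegMode

// Equivalent of the "tessedit_create_hocr 1" line in the config file,
// passed through the initialOptions parameter of the constructor.
var initialOptions = new Dictionary<string, object>
{
    { "tessedit_create_hocr", "1" }
};

// Placeholder path: per the doc comment, the directory that contains 'tessdata'.
using var engine = new Engine(
    @"C:\Tesseract",
    Language.English,          // assumed enum member
    EngineMode.Default,
    null,                      // no config files needed
    initialOptions);

// Placeholder input file; LoadFromFile is assumed to exist on Pix.Image.
using var image = TesseractOCR.Pix.Image.LoadFromFile("input.png");

// Equivalent of "tessedit_pageseg_mode 4": pass the mode to Process.
// SingleColumn is assumed to be the enum member for mode 4.
using var page = engine.Process(image, "input", PageSegMode.SingleColumn);
```

Note that in Tesseract itself `tessedit_create_hocr` mainly controls which output file the command-line tool writes, so setting it here may not change what the returned Page exposes. How to read hOCR markup from the Page object is not shown in the code quoted above, so check the Page class in your version for the relevant member.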
I want to set the hOCR settings below with TesseractOCR:
tessedit_create_hocr 1
tessedit_pageseg_mode 4
You can find the hocr config file, which can be placed inside the tessdata folder, here: hocr.txt