Generating a searchable PDF takes roughly 3 times (or slightly less) as long as it did with an implementation using Tesseract 3.
We are using code similar to the example in the wiki:
    PageSegmentationMode psm = PageSegmentationMode.AUTO_OSD;
    TessBaseAPI.SetPageSegMode(psm);
    using (var pix = TessBaseAPI.SetImage(imageFilePath))
    {
        pix.pixDeskew(0);
        TessBaseAPI.Recognize();
        // ensure input name is set
        TessBaseAPI.SetInputName(imageFilePath);
        string tessDataPath = TessBaseAPI.GetDatapath();
        using (var pdfRenderer = new PdfRenderer(destinationPdfFilePathWithoutExt, tessDataPath, false))
        {
            pdfRenderer.BeginDocument(destinationPdfFileNameWithoutExt);
            pdfRenderer.AddImage(TessBaseAPI);
            pdfRenderer.EndDocument();
        }
    }
Since this slowdown is very noticeable to the user, please let us know if we are doing anything wrong.
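One way to narrow down where the extra time goes is to time each stage separately. The following is only a minimal sketch: `StageTimer` and `Measure` are hypothetical helper names introduced here for illustration and are not part of Tesseract or the wrapper. It relies only on `System.Diagnostics.Stopwatch`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of any Tesseract wrapper): times a single stage.
static class StageTimer
{
    // Runs the given action once, prints the elapsed wall-clock time,
    // and returns it so the caller can compare stages.
    public static TimeSpan Measure(string label, Action stage)
    {
        var stopwatch = Stopwatch.StartNew();
        stage();
        stopwatch.Stop();
        Console.WriteLine($"{label}: {stopwatch.ElapsedMilliseconds} ms");
        return stopwatch.Elapsed;
    }
}
```

For example, wrapping `TessBaseAPI.Recognize()` and `pdfRenderer.AddImage(TessBaseAPI)` in separate `StageTimer.Measure(...)` calls would show whether the extra time is spent in recognition or in PDF rendering.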