FredrikAnderssonRV closed this issue 2 weeks ago
Solved it! I'm using Costura.Fody, and TesseractOCR does not work when TesseractOCR.dll gets embedded. My solution was to add TesseractOCR.dll manually to my project and set it to "Copy always" under Properties. Best regards Fredrik Andersson
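For anyone who hits the same thing, the "Copy always" setting described above ends up as an entry like the following in the project file. This is a minimal sketch, not the exact project from this thread: the item type (`Content`) and the location of TesseractOCR.dll relative to the project are assumptions, so adjust the `Include` path to wherever the DLL was added.

```xml
<!-- Sketch of a .csproj fragment; the Include path is an assumption. -->
<ItemGroup>
  <!-- Ship the DLL as a loose file next to the application -->
  <Content Include="TesseractOCR.dll">
    <CopyToOutputDirectory>Always</CopyToOutputDirectory>
  </Content>
</ItemGroup>
```

A possible alternative, not tried in this thread, is to tell Costura not to embed the assembly in the first place through its `ExcludeAssemblies` option in FodyWeavers.xml. The assembly name is given without the .dll extension:

```xml
<!-- FodyWeavers.xml sketch; untested against this project. -->
<Weavers>
  <Costura>
    <ExcludeAssemblies>
      TesseractOCR
    </ExcludeAssemblies>
  </Costura>
</Weavers>
```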
Nice to read that you have solved it.
Hi!
I am developing a web application (.NET Framework 4.8) and I need to extract text from a PDF page that consists of an image.
I have a strange problem with the program. The code works once or twice, and then I get an error at Engine creation: "The path has not a valid format.". After this I have to completely uninstall TesseractOCR from my project and reinstall it to get it working one or two times again. I may have to repeat the uninstall and reinstall several times. I have traced the error: it is thrown at line 133 in Engine.cs, "DefaultPageSegMode = PageSegMode.Auto;". I cannot see how this line can throw a path-not-valid error, though.
The code that raises the error:
engine1 = new TesseractOCR.Engine(Server.MapPath(@"~/tessdata"), TesseractOCR.Enums.Language.Swedish);
var img1 = TesseractOCR.Pix.Image.LoadFromFile(@"C:\TEMP\eurotext.png");
TesseractOCR.Page xpage = engine1.Process(img1);
I have the language files in the tessdata folder in my project. I have tried using just Server.MapPath("~") as the path, but then I cannot get it to work at all.
Here are all my includes:
using PdfSharp;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;
using PdfSharpTextExtractor;
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Security.Principal;
using System.ServiceModel;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using TesseractOCR;
using TesseractOCR.Enums;
using static System.Windows.Forms.VisualStyles.VisualStyleElement;
I hope you can help me. Best regards Fredrik Andersson