charlesw / tesseract-ocr-dotnet

Fix for Init failure due to missing trailing slash on tessdata path #5

Closed jlewin closed 12 years ago

jlewin commented 12 years ago

Loving this library, thanks! I had it up and running within a short period of time but I spun for a few minutes with unexpected failures during the Init call. I'm not sure if this fix is really necessary or the best approach but it seemed better to add the slash if missing rather than fail the call.

charlesw commented 12 years ago

Glad you like the library, though the real credit must go to Cong Nguyen, who originally created the port, and of course the tesseract-ocr team themselves :).

As for this particular fix, I agree with you that we should fix the path ending if required, though I would prefer that we test the provided path against ''Path.DirectorySeparatorChar'' and ''Path.AltDirectorySeparatorChar'', maybe going so far as adding a utility class called PathUtilities with a static IsPathTerminated method, and then append ''Path.DirectorySeparatorChar'' if required.
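The PathUtilities idea above could look roughly like the following. This is only a minimal sketch in plain C++, not the code that was merged: the two constants are hypothetical stand-ins for ''Path.DirectorySeparatorChar'' and ''Path.AltDirectorySeparatorChar'' (using their Windows values), and EnsurePathTerminated is an illustrative helper name that does not come from the library.

```cpp
#include <string>

// Hypothetical stand-ins for .NET's Path.DirectorySeparatorChar and
// Path.AltDirectorySeparatorChar, using the values they have on Windows.
constexpr char kDirectorySeparatorChar = '\\';
constexpr char kAltDirectorySeparatorChar = '/';

// Returns true when the path already ends in either separator.
bool IsPathTerminated(const std::string& path) {
    if (path.empty()) {
        return false;
    }
    const char last = path.back();
    return last == kDirectorySeparatorChar || last == kAltDirectorySeparatorChar;
}

// Appends the primary separator only when the path is not yet terminated.
// An empty path is returned unchanged (an assumption of this sketch), so it
// is not silently turned into the root directory.
std::string EnsurePathTerminated(const std::string& path) {
    if (path.empty() || IsPathTerminated(path)) {
        return path;
    }
    return path + kDirectorySeparatorChar;
}
```

With something like this in place, the tessdata path could be normalized once before it is handed to Init, so callers may pass it with or without the trailing slash.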

Also, could you update the line endings so the diff contains only the changes made? I think this is probably because I haven't used AutoCRLF and you may have, though I'm not sure.

jlewin commented 12 years ago

I struggled with the original commit because running git difftool showed the three-line change in WinMerge that I'd expect, while git diff showed the whole file being dropped and re-added. I had hoped that when I pushed the commit to GitHub it would show the right thing, but in the end git diff was the reality.

Based on my experience, I'd recommend looking into enabling autocrlf, as it was behind my mix-up. I've now disabled it and the problem seems to be resolved, but my commits now contain Windows line endings while the rest of the content does not. I struggled to bend VS to my will and insert the correct line endings, but it forces CRLFs, and using the normalize-on-save feature with either of the two listed options, Mac (CR) or Unix (LF), resulted in the original problem where all lines appeared as changed.
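For reference, the autocrlf setting discussed here is git's core.autocrlf option. A minimal fragment, assuming it is set per clone in .git/config (it could equally live in a user-level git config), might look like this:

```ini
# .git/config of a local clone (assumed location for this sketch)
[core]
	# true: convert LF to CRLF on checkout and CRLF back to LF on commit
	# false: leave line endings untouched in both directions
	autocrlf = true
```

Whichever value is chosen, every contributor needs to use the same one; otherwise a small change can show up as every line in the file being modified, as happened with the original commit.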

Finally, my C++ experience is pretty limited so I'll leave the utility class idea to you but I've updated my original code to use DirectorySeparatorChar as suggested. As before, no worries if you want to tackle this with a more robust implementation, I just wanted to try my hand at a possible fix and raise the alarm on an unneeded error condition.

charlesw commented 12 years ago

Thanks again for the fix; I've since merged it into master. Regarding the line-ending concern, I chose not to use AutoCRLF because I have a dependency on an external svn repository (tesseractOCR), which of course doesn't support AutoCRLF-type functionality. I'll update the readme with a note about autocrlf requirements.