Open atuyosi opened 9 years ago
Yeah, it looks like they changed quite some stuff, especially regarding output.
It will take some time; in the meantime you can use `downgrade` or `downgrader` from the AUR.
OK. Thanks.
:+1:
do you know which stuff changed? perhaps I can help
@acrogenesis it looks like they added a `TessResultRenderer` class which is used in place of `STRING` for `ProcessPages`.
The changes in the following fork fixed the problem for me with the Tesseract 3.04 baseline: https://github.com/ortutay/ruby-tesseract-ocr/commit/74a4042a07da0f8bf54d06ff01a1647bbdeeac92
This also applies to MacOS and Tesseract installed via Homebrew which now defaults to 3.04.
@meh can you share your thoughts on this change
@cxhartmann the problem is that the Ruby side of things expects `process_page` to store its value in a `STRING*`, which is not the case anymore.
With that change it's going to compile, but it's going to segfault or worse as soon as you use anything related to `process_page`.
@meh Ah I see. So there is more to it. Bummer, but only if you use process_page? I'd have to guess it might be more than just that.
For now I'm reverting to Tesseract v 3.02 and that seems to be working. Now that homebrew points to 3.04 (as of Sept), I went ahead and just brew uninstalled and sucked down the old homebrew formula to do the 3.02 build for me and that seems to be working fine. https://github.com/Homebrew/homebrew/blob/master/Library/Formula/tesseract.rb (check a few revisions back)
@cxhartmann yes, and the biggest problem is getting this gem to work with both pre and post 3.04.
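One way to approach the pre/post 3.04 split would be to check the installed version before deciding which C++ shim to compile. This is only a rough sketch, not what the gem does: `tesseract_version` and `renderer_api?` are hypothetical helper names, and the 3.04 cutoff is taken from this thread rather than from the Tesseract changelog. The version string is passed in so the caller decides how to capture the output of `tesseract -v`.

```ruby
require 'rubygems' # Gem::Version ships with RubyGems

# Hypothetical helper: pull the version out of `tesseract -v` output,
# e.g. "tesseract 3.04.01" -> Gem::Version "3.04.01". Returns nil if not found.
def tesseract_version(version_output)
  match = version_output.match(/^tesseract\s+v?(\d+(?:\.\d+)*)/)
  match && Gem::Version.new(match[1])
end

# Hypothetical helper: true when ProcessPages/ProcessPage are expected to take
# a TessResultRenderer* instead of a STRING* (assumed cutoff: 3.04).
def renderer_api?(version)
  version >= Gem::Version.new('3.04')
end
```

With something like this, the inline C++ for `process_page` could be chosen per version instead of assuming a `STRING*` everywhere.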
@meh Any word on supporting 3.04?
Haven't had the time to work on it unfortunately, it's on my endless TODO list :rage4:
+1
I just wanted to use the `easy_captcha_solver` Ruby gem, which requires the `tesseract-ocr` Ruby gem. It installed without error, but when I try to use it I see that tesseract-ocr is failing to compile.
OS :
$ cat /etc/*-release
DISTRIB_ID=ManjaroLinux
DISTRIB_RELEASE=16.08
DISTRIB_CODENAME=Ellada
DISTRIB_DESCRIPTION="Manjaro Linux"
Manjaro Linux
NAME="Manjaro Linux"
ID=manjaro
PRETTY_NAME="Manjaro Linux"
ANSI_COLOR="1;32"
HOME_URL="http://www.manjaro.org/"
SUPPORT_URL="http://www.manjaro.org/"
BUG_REPORT_URL="http://bugs.manjaro.org/"
Gem :
$ gem list tesseract-ocr -d
*** LOCAL GEMS ***
tesseract-ocr (0.1.8)
Author: meh.
Homepage: http://github.com/meh/ruby-tesseract-ocr
License: BSD
Installed at: /home/noraj/.gem/ruby/2.3.0
A wrapper library to the tesseract-ocr API.
tesseract :
$ tesseract -v
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.6.25 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.5.1
ruby :
$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
It's not clear whether I need to require `tesseract` or `tesseract-ocr` in my Ruby code:
irb(main):002:0> require 'tesseract'
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/81b6fb2baace695a88ac35bc54fcc39bf2dc1e42.log
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders/c.rb:114:in `shared_object'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:90:in `block in build'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `instance_eval'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `build'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:54:in `singleton_inline'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:39:in `inline'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:30:in `<module:BaseAPI>'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:27:in `<module:C>'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<module:Tesseract>'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c.rb:89:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/api.rb:26:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract-ocr.rb:35:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract.rb:25:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in `rescue in require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:40:in `require'
from (irb):2
from /usr/bin/irb:11:in `<main>'
irb(main):003:0> require 'tesseract-ocr'
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/81b6fb2baace695a88ac35bc54fcc39bf2dc1e42.log
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders/c.rb:114:in `shared_object'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:90:in `block in build'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `instance_eval'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `build'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:54:in `singleton_inline'
from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:39:in `inline'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:30:in `<module:BaseAPI>'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:27:in `<module:C>'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<module:Tesseract>'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c.rb:89:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/api.rb:26:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract-ocr.rb:35:in `<top (required)>'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from (irb):3
from /usr/bin/irb:11:in `<main>'
It's also not clear whether `tesseract` (the distribution package, for example) is needed for the `tesseract-ocr` Ruby gem.
It's not even clear whether the `tesseract` Ruby gem is needed for the `tesseract-ocr` Ruby gem.
Here is a full `ffi-inline` error log file.
This is still an issue:
tesseract 3.04.01 leptonica-1.74 libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.25 : libtiff 4.0.6 : zlib 1.2.8
LOCAL GEMS tesseract-ocr (0.1.8)
Error:
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:356:8: note: initializing argument 1 of ‘void tesseract::TessBaseAPI::SetImage(Pix)’
void SetImage(Pix pix);
^~~~
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb: In function ‘bool process_pages(tesseract::TessBaseAPI, const char, STRING)’:
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:183:55: error: no matching function for call to ‘tesseract::TessBaseAPI::ProcessPages(const char&, NULL, int, STRING&)’
return api->ProcessPages(filename, NULL, 0, output);
^
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:541:8: note: candidate: bool tesseract::TessBaseAPI::ProcessPages(const char, const char, int, tesseract::TessResultRenderer)
bool ProcessPages(const char filename, const char retry_config,
^~~~
/usr/include/tesseract/baseapi.h:541:8: note: no known conversion for argument 4 from ‘STRING’ to ‘tesseract::TessResultRenderer’
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb: In function ‘bool process_page(tesseract::TessBaseAPI, Pix, int, const char, STRING)’:
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:189:71: error: no matching function for call to ‘tesseract::TessBaseAPI::ProcessPage(Pix&, int&, const char&, NULL, int, STRING&)’
return api->ProcessPage(pix, page_index, filename, NULL, 0, output);
^
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:556:8: note: candidate: bool tesseract::TessBaseAPI::ProcessPage(Pix, int, const char, const char, int, tesseract::TessResultRenderer)
bool ProcessPage(Pix pix, int page_index, const char filename,
^~~
/usr/include/tesseract/baseapi.h:556:8: note: no known conversion for argument 6 from ‘STRING’ to ‘tesseract::TessResultRenderer*’
I tried using this to install a downgraded version of Tesseract on my Mac. However, I get the following error:
Error: Calling Resource#sha1 is disabled!
Use Resource#sha256 instead.
/Users/maheshmesta/Library/Caches/Homebrew/Formula/tesseract.rb:123:in `block (2 levels) in
How do I rectify this issue?
@Mahesh8 Tried the same, getting nowhere so far
After fiddling around for a while I came up with a solution. I've modified the file to use sha256 and also updated the broken links in the file.
Tesseract.rb: https://gist.github.com/arcticbarra/631bf0fee3c7eacc2c8b1e7b70e3e85d
brew uninstall tesseract
brew install Tesseract.rb
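For anyone who would rather compute the sha256 values themselves than trust a modified formula, something like the following should work. It is a minimal sketch using Ruby's standard `Digest` library; `sha256_of` is a made-up helper name, and the path you pass in is whatever archive you downloaded for the formula.

```ruby
require 'digest'

# Hypothetical helper: hex SHA-256 of a local file, in the form a formula's
# `sha256 "..."` line expects.
def sha256_of(path)
  Digest::SHA256.file(path).hexdigest
end
```

Calling `sha256_of` on each downloaded archive gives the string to put in place of the old `sha1` value.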
This fix doesn't seem to be working anymore - is there a current workaround?
It doesn't work because it uses some outdated Homebrew terminology. I commented out a few lines and was able to install Tesseract 3.02 and make this lib work!
For anyone looking for which lines to comment out: https://gist.github.com/zachfeldman/bfc7bac4543d466e9c096d585e373fbf
thank you @zachfeldman -- with the file above I'm getting
Error: Tesseract: Calling `sha256 "digest" => :tag` in a bottle block is disabled! Use `brew style --fix` on the formula to update the style or use `sha256 tag: "digest"` instead.
I wonder where this is coming from because I don't see any such syntax in your Tesseract.rb (from gist) above
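If an old-style bottle line does turn up somewhere (for instance in a different copy of the formula than the one in the gist), the rewrite the error message asks for is mechanical. Below is a rough sketch, not a replacement for `brew style --fix`: `modernize_bottle_sha` is a hypothetical helper, and it only handles the simple one-line form quoted in the error.

```ruby
# Matches legacy bottle lines of the form:   sha256 "digest" => :tag
LEGACY_BOTTLE_SHA = /^([ \t]*)sha256[ \t]+"(\h+)"[ \t]*=>[ \t]*:(\w+)[ \t]*$/

# Hypothetical helper: rewrite every legacy line in the formula source into
# the keyword form `sha256 tag: "digest"`, leaving all other lines untouched.
def modernize_bottle_sha(formula_source)
  formula_source.gsub(LEGACY_BOTTLE_SHA) { "#{$1}sha256 #{$3}: \"#{$2}\"" }
end
```

Running the formula text through this and diffing the result against the original also shows whether the file contains the legacy syntax at all.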
There are 2 weird download links in the script mentioned in the solution here https://github.com/meh/ruby-tesseract-ocr/issues/50#issuecomment-327005723 which I don't trust. My main concern is the Google Drive link (which is now also broken). So the issue is still present for me.
I'm getting a CompilationError when I `require 'tesseract-ocr'`.
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/00ac1de4050b632b230475bd71c0dc3a7de45a89.log from /usr/lib/ruby/gems/2.2.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
full trace is here, and ffi-inline's error log
Is the latest tesseract-ocr (3.04) supported, or has the API changed?
There is a similar problem below.
ruby on rails - Tesseract-ocr gem issue on mac os x - Stack Overflow
OS: Arch Linux gem
tesseract
ruby
Thanks in Advance.