meh / ruby-tesseract-ocr

A Ruby wrapper library to the tesseract-ocr API.

CompilationError with tesseract-ocr 3.04 #50

Open atuyosi opened 9 years ago

atuyosi commented 9 years ago

I'm getting a CompilationError when I run require 'tesseract-ocr'.

CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/00ac1de4050b632b230475bd71c0dc3a7de45a89.log from /usr/lib/ruby/gems/2.2.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'

full trace is here, and ffi-inline's error log

Is the latest tesseract-ocr (3.04) supported, or has the API changed?

There is a similar problem reported below:

ruby on rails - Tesseract-ocr gem issue on mac os x - Stack Overflow

OS: Arch Linux

gem

$ gem list tesseract-ocr -d

*** LOCAL GEMS ***

tesseract-ocr (0.1.8)
    Author: meh.
    Homepage: http://github.com/meh/ruby-tesseract-ocr
    License: BSD
    Installed at: /usr/lib/ruby/gems/2.2.0

    A wrapper library to the tesseract-ocr API.

tesseract

$ tesseract -v
tesseract 3.04.00
 leptonica-1.71
  libgif 5.1.0 : libjpeg 8d : libpng 1.6.18 : libtiff 4.0.4 : zlib 1.2.8 : libwebp 0.4.3

ruby

$ ruby -v
ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-linux]

Thanks in advance.

meh commented 9 years ago

Yeah, it looks like they changed quite a lot, especially regarding output.

Supporting it will take some time. In the meantime you can roll back to an older tesseract using downgrade or downgrader from the AUR.

atuyosi commented 9 years ago

OK. Thanks.

McRip commented 9 years ago

:+1:

acrogenesis commented 9 years ago

Do you know what changed? Perhaps I can help.

meh commented 9 years ago

@acrogenesis it looks like they added a TessResultRenderer class which is used in place of STRING for ProcessPages.

cxhartmann commented 9 years ago

The changes in the following fork fixed the problem for me with the Tesseract 3.04 baseline: https://github.com/ortutay/ruby-tesseract-ocr/commit/74a4042a07da0f8bf54d06ff01a1647bbdeeac92

This also applies to MacOS and Tesseract installed via Homebrew which now defaults to 3.04.

@meh can you share your thoughts on this change

meh commented 9 years ago

@cxhartmann the problem is that the Ruby side of things expects process_page to store its value in a STRING*, which is not the case anymore.

With that change it's going to compile, but it's going to segfault or worse as soon as you use anything related to process_page.

cxhartmann commented 9 years ago

@meh Ah I see. So there is more to it. Bummer, but only if you use process_page? I'd have to guess it might be more than just that.

For now I'm reverting to Tesseract 3.02. Now that Homebrew points to 3.04 (as of Sept), I ran brew uninstall and pulled down the old Homebrew formula to do the 3.02 build for me, and that seems to be working fine. https://github.com/Homebrew/homebrew/blob/master/Library/Formula/tesseract.rb (check a few revisions back)

meh commented 9 years ago

@cxhartmann yes, and the biggest problem is getting this gem to work with both pre and post 3.04.
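As a rough illustration of what supporting both sides of 3.04 would need first, here is a minimal Ruby sketch that reads the version out of `tesseract -v` output and decides which ProcessPages signature to expect. The helper names `parse_tesseract_version` and `renderer_api?` are hypothetical and not part of the gem, and the 3.04 cut-off is taken only from what is reported in this thread.

```ruby
require 'rubygems'

# Hypothetical helper, not part of the tesseract-ocr gem.
# Extracts the version from `tesseract -v` output such as "tesseract 3.04.00".
# Returns a Gem::Version, or nil when no version string is found.
def parse_tesseract_version(output)
  match = output.match(/tesseract\s+v?(\d+(?:\.\d+)+)/i)
  match && Gem::Version.new(match[1])
end

# Assumption based on this thread: from 3.04 on, ProcessPages/ProcessPage
# take a TessResultRenderer* instead of a STRING*.
RENDERER_API_SINCE = Gem::Version.new('3.04')

# True when the given version is expected to use the renderer-based signature.
def renderer_api?(version)
  version >= RENDERER_API_SINCE
end

# Sample output copied from the first report in this thread.
sample = "tesseract 3.04.00\n leptonica-1.71\n"
version = parse_tesseract_version(sample)
puts "renderer-based API expected: #{renderer_api?(version)}"
```

A wrapper could branch on `renderer_api?` to choose which inline C++ source to compile. That only solves the detection half; the Ruby side would still need to read results from a renderer rather than a STRING*.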

tpendragon commented 8 years ago

@meh Any word on supporting 3.04?

meh commented 8 years ago

Haven't had the time to work on it unfortunately, it's on my endless TODO list :rage4:

alexhanh commented 8 years ago

+1

noraj commented 8 years ago

I just wanted to use easy_captcha_solver ruby gem that requires tesseract-ocr ruby gem. It installed without error but when I try to use it I see that tesseract-ocr is failing to compile.

OS :

$ cat /etc/*-release
DISTRIB_ID=ManjaroLinux
DISTRIB_RELEASE=16.08
DISTRIB_CODENAME=Ellada
DISTRIB_DESCRIPTION="Manjaro Linux"
Manjaro Linux
NAME="Manjaro Linux"
ID=manjaro
PRETTY_NAME="Manjaro Linux"
ANSI_COLOR="1;32"
HOME_URL="http://www.manjaro.org/"
SUPPORT_URL="http://www.manjaro.org/"
BUG_REPORT_URL="http://bugs.manjaro.org/"

Gem :

$ gem list tesseract-ocr -d
*** LOCAL GEMS ***

tesseract-ocr (0.1.8)
    Author: meh.
    Homepage: http://github.com/meh/ruby-tesseract-ocr
    License: BSD
    Installed at: /home/noraj/.gem/ruby/2.3.0

    A wrapper library to the tesseract-ocr API.

tesseract :

$ tesseract -v
tesseract 3.04.01
 leptonica-1.73
  libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.6.25 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.5.1

ruby :

$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]

It's not clear whether I need to require tesseract or tesseract-ocr in my Ruby code:

irb(main):002:0> require 'tesseract'
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/81b6fb2baace695a88ac35bc54fcc39bf2dc1e42.log
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders/c.rb:114:in `shared_object'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:90:in `block in build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `instance_eval'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:54:in `singleton_inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:39:in `inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:30:in `<module:BaseAPI>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:27:in `<module:C>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<module:Tesseract>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c.rb:89:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/api.rb:26:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract-ocr.rb:35:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract.rb:25:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in `rescue in require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:40:in `require'
    from (irb):2
    from /usr/bin/irb:11:in `<main>'
irb(main):003:0> require 'tesseract-ocr'
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/81b6fb2baace695a88ac35bc54fcc39bf2dc1e42.log
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders/c.rb:114:in `shared_object'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:90:in `block in build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `instance_eval'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:54:in `singleton_inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:39:in `inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:30:in `<module:BaseAPI>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:27:in `<module:C>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<module:Tesseract>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c.rb:89:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/api.rb:26:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract-ocr.rb:35:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from (irb):3
    from /usr/bin/irb:11:in `<main>'

It's also not clear whether tesseract itself (the distribution package, for example) is needed by the tesseract-ocr Ruby gem. It's not even clear whether the tesseract Ruby gem is needed by the tesseract-ocr Ruby gem.

Here is a full ffi-inline error log file.
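The two backtraces above show lib/tesseract.rb loading lib/tesseract-ocr.rb, so both require names reach the same compile step. For anyone who wants to locate the compiler log from code, here is a minimal Ruby sketch. The helper names are hypothetical, and it assumes the message keeps the "see logs at <path>.log" wording seen above and that ffi-inline's CompilationError descends from StandardError.

```ruby
# Assumption: the error message keeps the "see logs at <path>.log" wording.
LOG_PATH_PATTERN = /see logs at (\S+\.log)/

# Hypothetical helper: returns the ffi-inline log path found in an error
# message, or nil when the message does not mention one.
def ffi_inline_log_path(message)
  match = message.match(LOG_PATH_PATTERN)
  match && match[1]
end

# Hypothetical helper: tries to require a library and reports where the
# compiler output went if loading fails. Returns true on success.
def require_with_diagnostics(name)
  require name
  true
rescue LoadError, StandardError => e
  # LoadError is listed separately because it is not a StandardError.
  warn "could not load #{name}: #{e.class}"
  path = ffi_inline_log_path(e.message)
  warn "compiler output: #{path}" if path
  false
end
```

Calling `require_with_diagnostics('tesseract-ocr')` on an affected system should print the log location instead of the full backtrace; the log itself contains the g++ errors that identify the mismatched signatures.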

jef-abraham commented 7 years ago

This is still an issue:

tesseract 3.04.01
 leptonica-1.74
  libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.25 : libtiff 4.0.6 : zlib 1.2.8

*** LOCAL GEMS ***

tesseract-ocr (0.1.8)

Error:

In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:356:8: note: initializing argument 1 of ‘void tesseract::TessBaseAPI::SetImage(Pix)’
 void SetImage(Pix pix);
 ^~~~
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb: In function ‘bool process_pages(tesseract::TessBaseAPI, const char, STRING)’:
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:183:55: error: no matching function for call to ‘tesseract::TessBaseAPI::ProcessPages(const char&, NULL, int, STRING&)’
 return api->ProcessPages(filename, NULL, 0, output);
 ^
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:541:8: note: candidate: bool tesseract::TessBaseAPI::ProcessPages(const char, const char, int, tesseract::TessResultRenderer)
 bool ProcessPages(const char filename, const char retry_config,
 ^~~~
/usr/include/tesseract/baseapi.h:541:8: note: no known conversion for argument 4 from ‘STRING’ to ‘tesseract::TessResultRenderer’
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb: In function ‘bool process_page(tesseract::TessBaseAPI, Pix, int, const char, STRING)’:
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:189:71: error: no matching function for call to ‘tesseract::TessBaseAPI::ProcessPage(Pix&, int&, const char&, NULL, int, STRING&)’
 return api->ProcessPage(pix, page_index, filename, NULL, 0, output);
 ^
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:556:8: note: candidate: bool tesseract::TessBaseAPI::ProcessPage(Pix, int, const char, const char, int, tesseract::TessResultRenderer)
 bool ProcessPage(Pix pix, int page_index, const char filename,
 ^~~
/usr/include/tesseract/baseapi.h:556:8: note: no known conversion for argument 6 from ‘STRING’ to ‘tesseract::TessResultRenderer*’

Mahesh8 commented 7 years ago

I tried using this

brew install https://raw.githubusercontent.com/Homebrew/homebrew/8ba134eda537d2cee7daa7ebdd9f728389d9c53e/Library/Formula/tesseract.rb

to install a downgraded version of Tesseract on my Mac. However, I get the following error

Error: Calling Resource#sha1 is disabled! Use Resource#sha256 instead. /Users/maheshmesta/Library/Caches/Homebrew/Formula/tesseract.rb:123:in `block (2 levels) in '

How do I rectify this issue?

tjaklitsch commented 7 years ago

@Mahesh8 Tried the same, getting nowhere so far

enriquebrgn commented 7 years ago

After fiddling around for a while I came up with a solution. I've modified the formula file to use sha256 and also updated the broken links in it.
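For anyone repeating this by hand, the sha256 values have to be computed from the files the formula actually downloads. A minimal Ruby sketch using the standard Digest library follows; the helper names and the idea of pasting the resulting line into the formula are illustrative assumptions, not taken from the modified file.

```ruby
require 'digest/sha2'

# Hypothetical helper: returns the hex SHA-256 digest of a local file,
# for example a source tarball downloaded by hand.
def sha256_for(path)
  Digest::SHA256.file(path).hexdigest
end

# Hypothetical helper: formats the digest as a formula-style checksum line
# that could replace an old sha1 line.
def formula_checksum_line(path)
  %(sha256 "#{sha256_for(path)}")
end
```

Running `puts formula_checksum_line('path/to/downloaded-archive')` prints a line to substitute for the matching sha1 entry. Only compute the digest from an archive obtained from a source you trust, since the checksum merely confirms that later downloads match that file.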

jpperlm commented 6 years ago

This fix doesn't seem to be working anymore - is there a current workaround?

zachfeldman commented 6 years ago

It doesn't work because it uses some outdated Homebrew terminology. I commented out a few lines and was able to install Tesseract 3.02 and make this lib work!

For anyone looking for which lines to comment out: https://gist.github.com/zachfeldman/bfc7bac4543d466e9c096d585e373fbf

jasonfb commented 3 years ago

thank you @zachfeldman -- with the file above I'm getting

Error: Tesseract: Calling `sha256 "digest" => :tag` in a bottle block is disabled! Use `brew style --fix` on the formula to update the style or use `sha256 tag: "digest"` instead.

I wonder where this is coming from because I don't see any such syntax in your Tesseract.rb (from gist) above
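The error message names both forms, so if an old-style line does turn up in whichever formula file Homebrew is actually loading, the rewrite is mechanical. Below is a minimal Ruby sketch of that conversion; `modernize_bottle_line` is a hypothetical helper and the tag name in the usage line is only a placeholder. It does not explain where the old syntax comes from in this case.

```ruby
# Matches the old bottle syntax: sha256 "digest" => :tag
OLD_BOTTLE_SHA = /sha256\s+(["'])(\h+)\1\s*=>\s*:(\w+)/

# Hypothetical helper: rewrites one formula line from the old form
# `sha256 "digest" => :tag` to the new form `sha256 tag: "digest"`.
# Lines that do not use the old form are returned unchanged.
def modernize_bottle_line(line)
  line.sub(OLD_BOTTLE_SHA) do
    digest = Regexp.last_match(2)
    tag = Regexp.last_match(3)
    %(sha256 #{tag}: "#{digest}")
  end
end

# Placeholder digest and tag, for illustration only.
puts modernize_bottle_line('    sha256 "0123abcd" => :some_tag')
```

Homebrew's own `brew style --fix`, mentioned in the error message, is the supported way to do this; the sketch only shows what the change amounts to.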

Hyperadministrator commented 1 year ago

There are 2 weird download links in the script mentioned in the solution here https://github.com/meh/ruby-tesseract-ocr/issues/50#issuecomment-327005723 which I don't trust. My main concern is the Google Drive link (which is now broken as well). Therefore the issue is still present for me.