CrossRef / pdfextract

MOVED TO https://gitlab.com/crossref/pdfextract
MIT License

font_metrics.rb:42:in `initialize': undefined method `ascent' #4

Closed by eelcovisser 10 years ago

eelcovisser commented 11 years ago

I installed pdf-extract using gem install and I'm getting the following error. A change in the library?

Update: downgrading to ruby-1.9.1 does not help

$ pdf-extract --trace extract --references --titles d912f50dae928909ed.pdf
/Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/font_metrics.rb:42:in `initialize': undefined method `ascent' for #<PDF::Reader::Font:0x007fc611c82650> (NoMethodError)
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:134:in `new'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:134:in `block in build_fonts'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:131:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:131:in `build_fonts'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:163:in `block (2 levels) in include_in'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:81:in `call'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:81:in `block (2 levels) in expand_listeners_to_callback_methods'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:170:in `block in invoke_calls'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:169:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:169:in `invoke_calls'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:42:in `block in parse'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:38:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:38:in `parse'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:53:in `view'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/bin/pdf-extract:115:in `block (4 levels) in <top (required)>'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/bin/pdf-extract:112:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/bin/pdf-extract:112:in `block (3 levels) in <top (required)>'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/command.rb:180:in `call'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/command.rb:180:in `call'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/command.rb:155:in `run'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/runner.rb:402:in `run_active_command'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/runner.rb:78:in `run!'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/delegates.rb:11:in `run!'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/import.rb:10:in `block in <top (required)>'

arski commented 11 years ago

same here.. any idea what this is about?

kjw commented 11 years ago

Hi guys,

Can you try to force an install of the dependency "pdf-reader", version 0.1.1? It seems that later versions of this dependency have moved some methods around. Also, you will need a 1.9.3 version of Ruby, since the code uses Array#sort_by!, which I believe was introduced in 1.9.3.

It has been a while since I looked at this code but I'm planning to get back to it in the next few weeks. First task will be a rationalisation of dependencies - support the latest version of each, and support for Ruby >= 1.9.1.

arski commented 11 years ago

Hi there, thanks a lot for getting back!

Tried installing version 0.1.1 of the pdf-reader gem, but that doesn't seem to exist (ERROR: Could not find a valid gem 'pdf-reader' (= 0.1.1) in any repository). In case you made a typo and meant 1.1.1: I tried that and it does seem to work.

Worked with Ruby 1.9.1 too (well as far as I can see at least - didn't get any errors).

kjw commented 11 years ago

Ah yes, I meant 1.1.1! Odd that it works with 1.9.1. I guess I must be wrong about sort_by! only being 1.9.3.

kim-em commented 11 years ago

Just had the same issue, and the same workaround succeeded.

PrincessPeachey commented 11 years ago

I'm having the same issue. Was originally presenting under Ruby 2.0.0 and pdf-reader 1.1.1. Downgraded to Ruby 1.9.3, reacquired pdf-extract and pdf-reader 1.1.1 and error still persists. Any ideas? I am a newbie to Ruby so go easy on me :)

kjw commented 11 years ago

Hi there,

From looking at the feedback on GitHub Issues, it seems that people are getting pdf-extract to work with Ruby 1.9.3 and Ruby 2.0.0 so long as they switch to pdf-reader 1.1.1. Currently pdf-extract is defined with a dependency on pdf-reader 1.1.0, so I recommend doing a force install of pdf-reader:

$ gem uninstall pdf-reader
$ gem install pdf-reader -v 1.1.1

Though, if you are including pdf-extract as a gem dependency in a project managed by bundler, you will want to do this in your bundler-managed project directory:

$ bundle exec gem uninstall pdf-reader
$ bundle exec gem install pdf-reader -v 1.1.1

At least, I think that should change the version of the gem that bundler is using. Better would be to patch the gemspec file in pdf-extract and set its pdf-reader dependency to 1.1.1. Then you won't overwrite any changes to the bundler gems when you do a subsequent 'bundle install'.
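
As a rough illustration of the gemspec pin described above, the sketch below builds a throwaway specification with an exact pdf-reader requirement. The gem name, version, summary and author are placeholders invented for this example; it does not reproduce pdf-extract's actual gemspec.

```ruby
require 'rubygems'

# Hypothetical gemspec for illustration only. The name, version, summary and
# author are placeholders, not the contents of pdf-extract's real gemspec.
spec = Gem::Specification.new do |s|
  s.name    = 'pdf-extract-local-patch'
  s.version = '0.0.0'
  s.summary = 'Placeholder spec showing an exact pdf-reader pin'
  s.authors = ['example']

  # Exact pin: only pdf-reader 1.1.1 satisfies this dependency.
  s.add_runtime_dependency 'pdf-reader', '= 1.1.1'
end

dep = spec.dependencies.first
puts "#{dep.name} #{dep.requirement}"
```

With an exact requirement like this, the resolver should refuse any other pdf-reader version rather than silently picking a newer one.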

Hope this helps...

kjw commented 11 years ago

Sorry I just reread your comment - you've already tried this.

Any chance you could post some output?

Ta.

PrincessPeachey commented 11 years ago

Hey! Turns out that I had two versions of pdf-reader installed - 1.1.1 and 1.3.3. I removed both and just reinstalled 1.1.1 and now it works. Thank you!
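
For anyone checking for the same situation, here is a minimal sketch that lists every installed version of a gem using the RubyGems API. The helper name `installed_versions` is made up for this example, and the result depends on which gem paths the running Ruby can see (under Bundler it may only show the bundled version).

```ruby
require 'rubygems'

# Return every installed version of the named gem as sorted strings.
# `installed_versions` is a helper invented for this sketch.
def installed_versions(gem_name)
  Gem::Specification
    .find_all_by_name(gem_name)
    .map(&:version)
    .sort
    .map(&:to_s)
end

versions = installed_versions('pdf-reader')
if versions.length > 1
  puts "Multiple pdf-reader versions installed: #{versions.join(', ')}"
else
  puts "pdf-reader versions installed: #{versions.inspect}"
end
```

If more than one version shows up, removing the extras with `gem uninstall pdf-reader` and reinstalling only 1.1.1 matches what is described above.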

iamgp commented 10 years ago

Hi

I am getting this error. This is my setup:

pdf-extract (0.1.1)
pdf-reader (1.1.1)

I'd really like to get this great program running!! Thanks in advance

cactusspine commented 10 years ago

I got the exact same problem:

undefined method `ascent' for #<PDF::Reader::Font:0x00000002576988>. Use --trace to view backtrace

pdf-extract --trace
/var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/runner.rb:398:in `run_active_command': invalid command (Commander::Runner::InvalidCommandError)
        from /var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/runner.rb:78:in `run!'
        from /var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/delegates.rb:11:in `run!'
        from /var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/import.rb:10:in `block in <top (required)>'

I have tried downgrading to pdf-reader 1.1.1 and upgrading to Ruby 1.9.3; however, nothing worked. I would really like to use this nice tool, please help me... thanks in advance

kjw commented 10 years ago

I think you're seeing two problems.

The first is a missing ascent method on PDF::Reader::Font. Not sure why that isn't present in pdf-reader 1.1.1 as someone above in this thread got pdf-extract working against that version.

Second is that pdf-extract is not accepting a '--trace' parameter, which is unfortunately stopping you from printing a trace for the first issue.

From what I remember pdf-extract at some point applied a monkey patch, or whatever the term is, to the pdf-reader Font class to include an ascent method. At some point I believe this was taken out because pdf-reader incorporated the method, I believe in version 1.1.1 onwards. Thus the method was taken out of pdf-extract. Not sure why this has now disappeared from pdf-reader, too.
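
To make the monkey-patch idea concrete, here is a minimal sketch of a guarded patch. It uses a stand-in class rather than the real PDF::Reader::Font, and the `font_descriptor` hash and the value 718 are assumptions made up so the example runs on its own; it is not the patch pdf-extract once carried.

```ruby
# Stand-in for PDF::Reader::Font; NOT the real class. It only carries a
# hypothetical font_descriptor hash so the sketch runs without pdf-reader.
class StandInFont
  attr_reader :font_descriptor

  def initialize(font_descriptor)
    @font_descriptor = font_descriptor
  end
end

# Add the accessor only if the class does not already provide one, so the
# patch becomes a no-op once the library ships its own method.
unless StandInFont.method_defined?(:ascent)
  StandInFont.class_eval do
    def ascent
      font_descriptor[:Ascent]
    end
  end
end

# 718 is an arbitrary illustrative value.
font = StandInFont.new(:Ascent => 718)
puts font.respond_to?(:ascent)
puts font.ascent
```

The `method_defined?` guard is the relevant part: a patch written this way keeps working whether or not the installed library version defines the method itself.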

afsartori commented 10 years ago

I'm a bit lost trying to fix this issue here. Replacing pdf-reader-1.3.3 by pdf-reader-1.1.1 (the fix suggested above) breaks prawn-0.14.0 (which depends on pdf-reader ~> 1.2). This wouldn't be a problem, except that then pdf-extract complains it couldn't activate prawn-0.14.0.

pdf-extract extract --references Beu_2010_BAP_377-378.pdf 
/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1990:in `raise_if_conflicts': Unable to activate prawn-0.14.0, because pdf-reader-1.1.1 conflicts with pdf-reader (~> 1.2) (Gem::LoadError)
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1163:in `activate'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1199:in `block in activate_dependencies'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1185:in `each'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1185:in `activate_dependencies'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1167:in `activate'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_gem.rb:48:in `gem'
    from /usr/bin/pdf-extract:22:in `<main>'

How did you guys work around this issue? Thanks

UPDATE: Ok, worked around the problem here by installing a previous version of prawn (0.12.0):

$ sudo gem install prawn -v 0.12.0
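
The conflict in the error above can be reproduced with the RubyGems version classes alone. This sketch only checks the `~> 1.2` requirement quoted in the error message against the two pdf-reader versions mentioned in this thread; it makes no claim about what other prawn releases require.

```ruby
require 'rubygems'

# The requirement quoted in the Gem::LoadError above.
prawn_needs = Gem::Requirement.new('~> 1.2')

%w[1.1.1 1.3.3].each do |v|
  ok = prawn_needs.satisfied_by?(Gem::Version.new(v))
  puts "pdf-reader #{v} satisfies '#{prawn_needs}': #{ok}"
end
```

`~> 1.2` accepts 1.2 and later releases below 2.0, so 1.3.3 passes and 1.1.1 does not, which is why prawn 0.14.0 cannot be activated alongside pdf-reader 1.1.1.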

iamgp commented 10 years ago

The fix mentioned above doesn't fix anything for me! :(

msegado commented 8 years ago

For the record, afsartori's workaround worked for me on Ubuntu 14.04: I installed prawn 0.12.0 and pdf-reader 1.1.1, uninstalled pdf-reader 1.3.3, and was able to run pdf-extract successfully.