yob / pdf-preflight

Check PDF files conform to various standards
MIT License

NoRegistrationBlack rule gets into a tight loop #22

Open tomtaylor opened 12 years ago

tomtaylor commented 12 years ago

I've got a PDF that seems to get into a tight loop in the NoRegistrationBlack rule. I'll email you a link to the PDF separately, as it belongs to a customer of ours and I don't want to make it public.

The trace, when I kill it, looks like this:

/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page_state.rb:288:in `load'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page_state.rb:288:in `clone_state'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page_state.rb:37:in `save_graphics_state'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page_state.rb:171:in `invoke_xobject'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/rules/no_registration_black.rb:43:in `invoke_xobject'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page.rb:144:in `block in callback'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page.rb:143:in `each'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page.rb:143:in `callback'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page.rb:130:in `content_stream'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader/page.rb:95:in `walk'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:102:in `block in check_pages'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:101:in `each'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:101:in `check_pages'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:74:in `block in check_io'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-reader-2625af19808b/lib/pdf/reader.rb:159:in `open'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:72:in `check_io'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:67:in `block in check_filename'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:66:in `open'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:66:in `check_filename'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/bundler/gems/pdf-preflight-f8c3570c795d/lib/preflight/profile.rb:51:in `check'
/srv/npc/rails/lib/tasks/preflight.rake:11:in `block (3 levels) in <top (required)>'
/srv/npc/rails/lib/tasks/preflight.rake:10:in `open'
/srv/npc/rails/lib/tasks/preflight.rake:10:in `block (2 levels) in <top (required)>'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:205:in `call'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:205:in `block in execute'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:200:in `each'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:200:in `execute'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:158:in `block in invoke_with_call_chain'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:151:in `invoke_with_call_chain'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/task.rb:144:in `invoke'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:116:in `invoke_task'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:94:in `block (2 levels) in top_level'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:94:in `each'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:94:in `block in top_level'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:133:in `standard_exception_handling'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:88:in `top_level'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:66:in `block in run'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:133:in `standard_exception_handling'
/home/npc/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/rake/application.rb:63:in `run'
/home/npc/.rbenv/versions/1.9.3-p0/bin/rake:32:in `<main>'

yob commented 12 years ago

I ran this test script on the sample file you provided:

require 'preflight'

class CustomProfile
  include Preflight::Profile

  rule Preflight::Rules::NoRegistrationBlack
end

filename = "illustrating-mood.pdf"
preflight = CustomProfile.new
preflight.check(filename).each do |issue|
  puts issue.inspect
end

Here's the output:

⚡ time ruby foo.rb
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
'All' separation color detected
ruby foo.rb  264.94s user 0.97s system 96% cpu 4:34.46 total

Can you replicate this behaviour? Is it possible it's not hanging, just taking a really long time?

I'd like to do some profiling of pdf-reader to speed things up in general; maybe this is a good PDF to start with.
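
One quick way to tell "hanging" from "slow" before reaching for a full profiler is to time the check and log how long it took. The sketch below is only an illustration: `timed` is a hypothetical helper, not part of pdf-preflight or pdf-reader, and the `sleep` stands in for a real call such as `preflight.check(filename)`.

```ruby
require 'benchmark'

# Hypothetical helper (not part of pdf-preflight): runs the block, logs the
# wall-clock time it took to `out`, and returns the block's result.
def timed(label, out = $stderr)
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  out.puts format("%s finished in %.2fs", label, elapsed)
  result
end

# Stand-in for a slow check; in the script above this would wrap
# preflight.check(filename) instead of sleeping.
issues = timed("NoRegistrationBlack") do
  sleep 0.1
  ["'All' separation color detected"]
end

puts "#{issues.length} issue(s) reported"
```

Wrapping each rule or each page separately in `timed` would show where the minutes go, which is a reasonable first step before profiling pdf-reader itself.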

yob commented 12 years ago

Here's the output from the same script checking restart.pdf:

⚡ time ruby foo.rb
ruby foo.rb  170.30s user 1.71s system 95% cpu 3:00.86 total

tomtaylor commented 12 years ago

Ah yes, you're right. On our production system the checks are wrapped in a 180-second timeout, and in dev I probably only left it running for a minute or two.

I'll up the timeout in production, but if we can support any work to improve performance in situations like these, please let me know.
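
For readers unfamiliar with this kind of wrapper, a minimal sketch using Ruby's standard `Timeout` module might look like the following. This is an assumption about the general shape, not the production code from this thread: `check_with_timeout` and the `:timed_out` return value are hypothetical, and the short `sleep` calls stand in for a long-running `preflight.check`.

```ruby
require 'timeout'

# Hypothetical wrapper: returns the block's result, or :timed_out if the
# block is still running after `seconds`.
def check_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  :timed_out
end

# A block that finishes within its budget returns its own value.
p check_with_timeout(1) { :ok }

# A block that overruns its budget is interrupted and reported as timed out.
p check_with_timeout(0.1) { sleep 1; :ok }
```

With a budget of 180 seconds, a file that needs roughly 3 to 4.5 minutes (like the two timed above) would be cut off partway through, which looks the same from the outside as a hang.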

On 20 May 2012, at 07:09, James Healy <reply@reply.github.com> wrote:

Here's the output from the same script checking restart.pdf

⚡ time ruby foo.rb
ruby foo.rb 170.30s user 1.71s system 95% cpu 3:00.86 total