gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

"undefined method `width` for nil:NilClass" when running `acro_form.validate` #210

Closed andi-dev closed 1 year ago

andi-dev commented 1 year ago

Hi Thomas,

We've been checking acro_form.validate and acro_form.flatten against a batch of real client PDFs.

We ran into a couple of HexaPDF::Error, which we can live with, as we can at least rescue and fall back to not flattening the forms. (are you interested in these errors anyway?)

But we also ran into one NoMethodError (backtrace below).

I had a look at the code, and I think that in https://github.com/gettalong/hexapdf/blob/master/lib/hexapdf/type/acro_form/appearance_generator.rb#LL374C32-L374C32 you're trying to decode the check-mark character, falling back to "4" if none is specified?

However, we found 3 PDFs where @widget[:MK]&.[](:CA) is an empty string, so @widget[:MK]&.[](:CA) || '4' evaluates to "".

If I change the code to:

  mark_str = @widget[:MK]&.[](:CA).present? ? @widget[:MK]&.[](:CA) : '4'
  mark = font.decode_utf8(mark_str).first

things seem to work out fine.
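For anyone without ActiveSupport loaded (`present?` comes from there, not from core Ruby), here is a minimal plain-Ruby sketch of the same idea. It also shows why the `||` fallback does not kick in: only `nil` and `false` are falsy in Ruby, so an empty string is kept. The method names and the `DEFAULT_MARK` constant are hypothetical and not part of HexaPDF, and the appearance characteristics dictionary is modelled as a plain Hash.

```ruby
# Hypothetical default, mirroring the '4' fallback quoted above.
DEFAULT_MARK = '4'

# Mirrors the `||` fallback: an empty string is truthy in Ruby,
# so it is returned instead of the default.
def naive_marker_string(mk)
  (mk && mk[:CA]) || DEFAULT_MARK
end

# Hypothetical helper (not HexaPDF API): treats a missing and an
# empty /CA entry the same way and falls back to the default.
def marker_string(mk)
  ca = mk && mk[:CA]
  (ca.nil? || ca.empty?) ? DEFAULT_MARK : ca
end
```

With `{CA: ''}` the naive version returns the empty string, while `marker_string` returns the default.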

Backtrace

Error parsing ../../../Downloads/hexapdf/parse-errors/some_pdf.pdf: #<NoMethodError: undefined method `width' for nil:NilClass>
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/appearance_generator.rb:377:in `draw_marker'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/appearance_generator.rb:146:in `block in create_check_box_appearances'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/content/canvas.rb:341:in `save_graphics_state'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/appearance_generator.rb:145:in `create_check_box_appearances'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/button_field.rb:260:in `block in create_appearances'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/field.rb:264:in `each_widget'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/button_field.rb:253:in `create_appearances'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/form.rb:372:in `block in create_appearances'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/form.rb:131:in `block in each_field'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/pdf_array.rb:183:in `block in each'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/pdf_array.rb:183:in `each_index'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/pdf_array.rb:183:in `each'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/form.rb:135:in `each_field'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/form.rb:371:in `create_appearances'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/type/acro_form/form.rb:491:in `perform_validation'
/Users/buddhabox/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/bundler/gems/hexapdf-36b43ca9228c/lib/hexapdf/object.rb:286:in `validate'

gettalong commented 1 year ago

> We ran into a couple of HexaPDF::Error, which we can live with, as we can at least rescue and fall back to not flattening the forms. (are you interested in these errors anyway?)

Yes, please, always :grin: They might indicate an invalid PDF or something that HexaPDF doesn't handle correctly. Either way there might be a way to gracefully handle the situations leading to the errors.
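The rescue-and-fall-back approach from the quoted text could look roughly like the sketch below. It is deliberately duck-typed so it runs without the gem: `flatten_if_possible` is a hypothetical name, `form` is assumed to be any object responding to `flatten` (such as the document's AcroForm object), and `error_class` is assumed to be the error to tolerate (in practice presumably `HexaPDF::Error`).

```ruby
# Hypothetical wrapper, not part of HexaPDF. Returns true when
# flattening succeeded and false when the tolerated error was raised.
def flatten_if_possible(form, error_class:)
  form.flatten
  true
rescue error_class => e
  # Fall back to leaving the form unflattened.
  warn "Flattening skipped: #{e.message}"
  false
end
```

Because only `error_class` is rescued, an unrelated exception such as the NoMethodError from this issue still propagates to the caller, which is why it surfaced separately from the HexaPDF::Error cases.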

As for the NoMethodError: I will have to test this out but I think an empty string should be valid, displaying nothing.

gettalong commented 1 year ago

Here is a quasi-official comment on how appearance characteristics dictionaries (/MK entry in widget annotation) are to be treated: https://github.com/pdf-association/pdf-issues/issues/56#issuecomment-797856300 - i.e. it is processor dependent.

gettalong commented 1 year ago

Another comment here https://github.com/pdf-association/pdf-issues/issues/23#issuecomment-764216521

gettalong commented 1 year ago

So, following the information from the linked comments, and given that an empty /CA entry defines no marker character, HexaPDF will now draw no marker when an empty /CA entry in the appearance characteristics dictionary is found. The change is live in the devel branch.

gettalong commented 1 year ago

Sorry, too fast... :-/ I forgot to add the /NeedAppearances entry to the main form dictionary, so Acrobat just showed me what HexaPDF did instead of showing what Acrobat itself would do... An empty marker style string is treated the same as a missing marker style string, i.e. the default marker for the field type is used. Commit adjusted and pushed to devel.

Now it is finished :grin:

andi-dev commented 1 year ago

Thanks again for the quick fix :) I just tried it with all 3 PDFs that previously ran into this error, and everything looks good now!