gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

Default appearance-state for check_box fields? #208

Closed andi-dev closed 1 year ago

andi-dev commented 1 year ago

Hi again :)

I investigated another issue I had when flattening an acro_form. The PDF is actually the same one I sent you via email regarding #203.

This is the code I am using:

    pdf_doc.acro_form.flatten.each do |field|
      field.flag(:read_only)
      field.each_widget { |widget| widget.form_field.delete_widget(widget) }
    end
    pdf_doc.catalog.delete(:AcroForm)
    pdf_doc.delete(pdf_doc.acro_form)
    pdf_doc.dispatch_message(:complete_objects)

This results in some of the checkboxes on the page looking squeezed, while others look fine.

When digging a bit deeper, I found that some of the checkboxes in the form have no :V-property, while some have it set to :Ja or :Off. The check_boxes with no :V are the ones that could not be flattened.

Then I noticed that the current implementation of create_check_box_appearances simply sets the widget's :AS property to @field[:V], while [create_radio_button_appearances](https://github.com/gettalong/hexapdf/blob/b048a2387c5d56e357353541a57cf3dc09180e7b/lib/hexapdf/type/acro_form/appearance_generator.rb#L187) falls back to :Off when @field[:V] is not on_name.

Would it also make sense for checkboxes to fall back to :Off when they are not explicitly on? I tried to make sense of the PDF specification for checkboxes, but :V is an optional entry, so I would assume that a missing value is something we have to expect, and falling back to the :Off appearance sounds logical to me.

What do you think?

gettalong commented 1 year ago

Yes, I think this is a bug. If a check box field has no /V entry or its value is nil, the /Off appearance should be selected if there is one.
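
For readers following along, the fallback described above can be sketched roughly as follows. This is a minimal, standalone illustration and not HexaPDF's actual code: the method name `check_box_appearance_state` and its two parameters are hypothetical, and the widget's normal appearance streams are modelled as a plain array of names.

```ruby
# Hypothetical helper, NOT part of HexaPDF: picks the appearance state (/AS)
# for a check box widget.
#
# field_value      - the field's /V entry as a Symbol, or nil if it is missing
# appearance_names - names of the widget's normal appearance streams,
#                    e.g. [:Off, :Ja]
#
# Returns the "on" name if the field value selects it, :Off if an /Off
# appearance exists, and nil if there is nothing suitable to select.
def check_box_appearance_state(field_value, appearance_names)
  # Assumption: the first name that is not :Off is the "on" state.
  on_name = appearance_names.find { |name| name != :Off }
  return on_name if on_name && field_value == on_name

  appearance_names.include?(:Off) ? :Off : nil
end

check_box_appearance_state(:Ja, [:Off, :Ja]) # => :Ja
check_box_appearance_state(nil, [:Off, :Ja]) # => :Off
```

With this shape, a check box that has no /V entry is treated like one that is explicitly off, instead of ending up with no appearance state at all.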

gettalong commented 1 year ago

And I found another bug in the appearance generation code of button fields that would make HexaPDF generate the appearances even though /Off and one other appearance were available.

andi-dev commented 1 year ago

Thank you for confirming it's a bug. That's good to hear, as it means a) that I didn't get entirely lost in this PDF stuff, and b) that we found what is hopefully the last thing going wrong with this somewhat weird PDF file 😅

gettalong commented 1 year ago

Down the rabbit hole I go: The performance of the sample code is not very good, for several reasons that I found. So I'm currently trying out things to make it faster. On the way I discovered yet another subtle bug...

andi-dev commented 1 year ago

Good luck :) Let me know if I can help to test the new implementation. I also have a couple more real-world pdfs with disappearing checkboxes / check ticks, which I'd be happy to test against your fix.

gettalong commented 1 year ago

@andi-dev All three mentioned bugs as well as some of the performance optimizations are implemented, and I pushed the changes to the devel branch if you'd like to test before the next release.

andi-dev commented 1 year ago

@gettalong thanks again for the fix! I tested it with a bunch of documents which had issues with checkboxes, and most of them look good now ❤️

I just found a similar but apparently entirely unrelated issue, which I describe in https://github.com/gettalong/hexapdf/issues/212