gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

TextField calculation #298

Closed loust333 closed 3 months ago

loust333 commented 4 months ago

Hello,

First and foremost, I want to express how amazing this gem is! 😄

We are currently using the pdf-forms gem and have encountered an issue with PDFs that have a read-only text field set to calculate the sum of text fields A+B+C. Unfortunately, upon saving, the calculation does not update.

I tested with HexaPDF to see if it would resolve this issue, but it also does not calculate automatically. I've checked the documentation, but I couldn't find a solution for this problem.

Could you please confirm if this is a known issue or if I missed something in the documentation?

If other people encounter the same problem, they will find at least one discussion about it. 😄

Here's an example with percentages (second page): ccss-formulaire-activites-etranger-pluriactivite-EN.pdf

Keep up the good work! :)

gettalong commented 4 months ago

Thanks :smile:

Calculations in PDFs are not really built into the main PDF specification but are done via JavaScript functions. I already implemented one such function in HexaPDF, see https://github.com/gettalong/hexapdf/blob/master/lib/hexapdf/type/acro_form/appearance_generator.rb#L508

In case of the provided PDF one can dig through all the objects to arrive at the following:

$ hexapdf inspect ccss-formulaire-activites-etranger-pluriactivite-EN.pdf s 3426
AFSimple_Calculate("SUM", new Array ("txt_pct_at.0", "txt_pct_at.1", "txt_pct_be.0", "txt_pct_be.1", "txt_pct_bg.0", "txt_pct_bg.1", "txt_pct_ch.0", "txt_pct_ch.1", "txt_pct_cy.0", "txt_pct_cy.1", "txt_pct_cz.0", "txt_pct_cz.1", "txt_pct_de.0", "txt_pct_de.1", "txt_pct_dk.0", "txt_pct_dk.1", "txt_pct_ee.0", "txt_pct_ee.1", "txt_pct_es.0", "txt_pct_es.1", "txt_pct_fi.0", "txt_pct_fi.1", "txt_pct_fr.0", "txt_pct_fr.1", "txt_pct_gr.0", "txt_pct_gr.1", "txt_pct_hu.0", "txt_pct_hu.1", "txt_pct_ie.0", "txt_pct_ie.1", "txt_pct_is.0", "txt_pct_is.1", "txt_pct_it.0", "txt_pct_it.1", "txt_pct_kr.0", "txt_pct_kr.1", "txt_pct_lie.0", "txt_pct_lie.1", "txt_pct_lt.0", "txt_pct_lt.1", "txt_pct_lu.0", "txt_pct_lu.1", "txt_pct_lv.0", "txt_pct_lv.1", "txt_pct_mt.0", "txt_pct_mt.1", "txt_pct_nl.0", "txt_pct_nl.1", "txt_pct_no.0", "txt_pct_no.1", "txt_pct_pl.0", "txt_pct_pl.1", "txt_pct_pt.0", "txt_pct_pt.1", "txt_pct_ro.0", "txt_pct_ro.1", "txt_pct_se.0", "txt_pct_se.1", "txt_pct_si.0", "txt_pct_si.1", "txt_pct_sk.0", "txt_pct_sk.1", "txt_pct_uk.0", "txt_pct_uk.1"));

This is the JavaScript function that performs the calculation. I would need to dig through all the other fields to see if there are other functions that do something similar.
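To illustrate why a call like this can be handled with plain string operations, here is a minimal Ruby sketch. It is not HexaPDF's implementation: `simple_calculate` is a hypothetical helper, the field values come from a plain Hash instead of a PDF, and only the "SUM" operation is handled.

```ruby
# Hypothetical helper, not part of HexaPDF: evaluates an AFSimple_Calculate
# call against a plain Hash of field name => field value.
def simple_calculate(script, values)
  pattern = /\AAFSimple_Calculate\(\s*"(\w+)"\s*,\s*new Array\s*\((.*)\)\s*\)\s*;?\s*\z/m
  match = script.match(pattern)
  return nil unless match

  operation = match[1]
  # Collect every double-quoted field name from the array argument
  names = match[2].scan(/"([^"]*)"/).flatten
  # Assumption: missing, empty or non-numeric fields count as 0
  numbers = names.map { |name| Float(values[name].to_s, exception: false) || 0.0 }

  operation == "SUM" ? numbers.sum : nil
end

script = 'AFSimple_Calculate("SUM", new Array ("txt_pct_at.0", "txt_pct_at.1"));'
simple_calculate(script, { "txt_pct_at.0" => "40", "txt_pct_at.1" => "60" }) # => 100.0
```

Anything that does not match the expected call shape returns nil, so unsupported scripts are simply left alone.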

The one already implemented by HexaPDF is also used by the PDF:

$ hexapdf ins /tmp/ccss-formulaire-activites-etranger-pluriactivite-EN.pdf 3427
<<
  /S /JavaScript
  /JS (AFNumber_Format\(0, 1, 0, 0, "", false\);)
  /Type /Action
>>

This formats the total field.
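As a rough illustration of what such a format action does, the following Ruby sketch handles only the first two arguments (number of decimals and separator style). The argument meanings and the separator table are given as recalled from the Acrobat forms API and should be treated as assumptions; `number_format` is a hypothetical helper, not HexaPDF code.

```ruby
# Assumed separator styles: [thousands separator, decimal separator]
SEPARATORS = {
  0 => [",", "."], # 1,234.56
  1 => ["", "."],  # 1234.56
  2 => [".", ","], # 1.234,56
  3 => ["", ","],  # 1234,56
}.freeze

# Hypothetical helper: formats value with n_dec decimals and the given separator style.
def number_format(value, n_dec, sep_style)
  thousands, decimal = SEPARATORS.fetch(sep_style)
  integer, fraction = format("%.#{n_dec}f", value.abs).split(".")
  # Group the integer digits in threes, counted from the right
  integer = integer.reverse.scan(/\d{1,3}/).join(thousands).reverse
  result = fraction ? "#{integer}#{decimal}#{fraction}" : integer
  # Avoid printing "-0" when a small negative value rounds to zero
  value.negative? && result.match?(/[1-9]/) ? "-#{result}" : result
end

number_format(99.7, 0, 1) # => "100"
```

With the arguments from the action above (0 decimals, separator style 1), the total would be shown as a whole number without a thousands separator.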

The problem with those JavaScript functions is actually mentioned on the documentation page for the interactive forms support of HexaPDF.

I can have a look at the documentation for the AFSimple_Calculate function to see how it could be implemented. Let me know if this would help you!

loust333 commented 4 months ago

Thanks for the quick response, @gettalong! 😄

In this PDF, that's the only script.

If you find a way to implement this, it would be AMAZING! 😃

For the PDF that I attached, that's not an issue, but in another project, we have a 20-30 page interactive form for tax statements where the calculation script occurs at least once on every page. 😭

Thanks a lot for the feedback; I appreciate it!

gettalong commented 4 months ago

I will see what I can find/do.

gettalong commented 4 months ago

@loust333 Would it be possible for you to provide the 20-30 page document with multiple calculations that you mentioned?

loust333 commented 4 months ago

@gettalong sure ! :)

Here is one of the PDFs: 100F-2023.pdf (See page 5)

gettalong commented 4 months ago

@loust333 Thanks for the PDF!

I have looked at the PDF but, alas, it does at least some calculations manually, like this (line breaks by me):

/** BVCALC N\.0501+N\.0505+N\.0509-N\.0513-N\.0517+N\.0521 EVCALC **/
event.value=AFMakeNumber(getField("N.0501").value)+
AFMakeNumber(getField("N.0505").value)+
AFMakeNumber(getField("N.0509").value)-
AFMakeNumber(getField("N.0513").value)-
AFMakeNumber(getField("N.0517").value)+
AFMakeNumber(getField("N.0521").value)

This is much more complicated than the AFSimple_Calculate script and not something I can easily deal with using just string operations.
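One way such a script could still be approached without a JavaScript engine is to read the expression from the leading `BVCALC ... EVCALC` comment instead of the code itself. The Ruby sketch below assumes that the comment always mirrors the code and that only field names joined by + and - occur; `evaluate_bvcalc` is a hypothetical helper and not necessarily how HexaPDF handles it.

```ruby
# Hypothetical helper: evaluates the expression embedded between BVCALC and
# EVCALC against a plain Hash of field name => field value.
def evaluate_bvcalc(script, values)
  expression = script[%r{/\*\*\s*BVCALC\s+(.*?)\s+EVCALC\s*\*\*/}m, 1]
  return nil unless expression

  total = 0.0
  sign = 1
  # The capture group keeps the operators in the token list
  expression.split(/\s*([+-])\s*/).each do |token|
    case token
    when "+" then sign = 1
    when "-" then sign = -1
    else
      name = token.gsub("\\.", ".") # "N\.0501" in the comment names field "N.0501"
      # Assumption: missing, empty or non-numeric fields count as 0
      total += sign * (Float(values[name].to_s, exception: false) || 0.0)
    end
  end
  total
end
```

Multiplication, division and parentheses would need a real expression parser, which this sketch leaves out.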

loust333 commented 4 months ago

Thanks a lot for the feedback @gettalong! 😊

I understand. We'll have to manage without it then.

There are so many issues with this form. A lot of users no longer use Adobe as their main app for displaying PDFs, which makes the form partially useless. 😅 🤷‍♂️

If you want, you can close this issue. Or keep it open, if this feature is in your milestones. :)

Have a nice evening ! 🙂

gettalong commented 4 months ago

Yeah, I currently have no plans to incorporate a whole JavaScript engine just to make this work :sweat_smile:

However, I will implement support for the AFSimple_Calculate function with one of the next releases.

gettalong commented 4 months ago

@loust333 FYI The AFSimple_Calculate function is already working and from what I've found while further investigating I think I can make the one in https://github.com/gettalong/hexapdf/issues/298#issuecomment-2067288888 also work.

loust333 commented 4 months ago

@gettalong amazing 👏

I will discuss this with my colleague tomorrow.

Is it possible to add some examples to the documentation? :)

gettalong commented 4 months ago

Yes, I will also implement convenience methods for setting formatting and calculation actions, and there will be a new example showcasing AcroForm JavaScript methods. The interactive forms documentation on the website (https://hexapdf.gettalong.org/documentation/interactive-forms/) will also be updated to include more information on what is supported and how it works in HexaPDF.

gettalong commented 3 months ago

This is now implemented and documented.

I'm currently tracking down a problem with your provided files where the appearance of a form field is created by HexaPDF but not displayed in a viewer. Once that is fixed, I will make the release.

gettalong commented 3 months ago

Okay, found the culprit: The widget annotation is flagged as "hidden" and this flag is toggled to false via a field validation JavaScript action. I think the best way forward in such a case is to tell HexaPDF to always show the widget annotation and set "hidden" to false.
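For readers unfamiliar with annotation flags: in the PDF specification they are stored as an integer bit field in the annotation's /F entry, with Hidden at bit position 2. The following Ruby sketch only shows that bit arithmetic on a plain integer; the constant and method names are illustrative and are not HexaPDF's API.

```ruby
# Annotation flag bits of the /F entry (bit positions 1 to 3 in the PDF specification)
INVISIBLE = 1 << 0
HIDDEN    = 1 << 1
PRINT     = 1 << 2

# True if the Hidden bit is set in the given flags value
def hidden?(flags)
  (flags & HIDDEN) != 0
end

# Returns the flags value with the Hidden bit cleared and all other bits untouched
def show(flags)
  flags & ~HIDDEN
end

flags = PRINT | HIDDEN
show(flags) # => 4, only the Print bit remains
```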

gettalong commented 3 months ago

@loust333 Please try the new release with the changes!

loust333 commented 3 months ago

@gettalong thanks a lot! I will ask our team to test the gem with the new changes. As soon as we have feedback, I will get back to you. :)

gettalong commented 3 months ago

@loust333 Great! Let me know if you need anything else!

loust333 commented 3 months ago

Hello @gettalong,

I tested recalculate_fields, and it works like a charm on the simple PDF.

Some questions came to mind:

Concerning the tax statement, I did not test it. We currently have other important features that need improvement, but at the end of this year or the beginning of next year the team may have time to test and implement the gem. 🥳

Here is the current service that I created to merge the PDF with the data (I prepare the data hash with a presenter):

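require 'hexapdf'
require 'stringio'

# Fills the form fields of a PDF with the given data and recalculates
# the fields that have a calculation action.
class PdfMergerService
  def initialize(source:, data: {})
    @pdf_origin = HexaPDF::Document.open("public/#{source}")
    # data maps full field names to the values that should be filled in
    data.each do |k, v|
      @pdf_origin.acro_form.field_by_name(k).field_value = v
    end
    # Update calculated fields (e.g. totals) after all values are set
    @pdf_origin.acro_form.recalculate_fields
  end

  # Returns the filled PDF as a rewound StringIO, ready to be read
  def read_io
    io = StringIO.new
    @pdf_origin.write(io)
    io.rewind
    io
  end
end

gettalong commented 3 months ago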

@loust333 Regarding your questions: No, if the form doesn't have fields that need calculation it is nearly a no-op. The reason for this is that the PDF form object contains an entry /CO that defines the calculation order (because sometimes one calculation depends on the result of another one). And this entry contains all fields in the PDF that need to be calculated. So if this entry is empty, nothing is done.
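The role of the calculation order can be pictured with a small Ruby model. This is a sketch under stated assumptions, not HexaPDF code: `calculation_order` stands in for the /CO array, and each entry pairs a field name with a lambda that plays the part of the field's calculation action.

```ruby
# Hypothetical model: runs each calculation in order and stores its result,
# so later entries can use the results of earlier ones.
def recalculate(values, calculation_order)
  calculation_order.each do |name, calculation|
    values[name] = calculation.call(values)
  end
  values
end

values = { "a" => 1, "b" => 2 }
order = [
  ["subtotal", ->(v) { v["a"] + v["b"] }],
  ["total",    ->(v) { v["subtotal"] * 2 }], # depends on the previous result
]
recalculate(values, order)
values["total"] # => 6
```

With an empty `calculation_order` the loop body never runs and the values are returned unchanged, which corresponds to the near no-op case described above.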

If you find that you need anything else from HexaPDF to integrate it into your application, let me know.

As for your service class: The next release of HexaPDF will have an acro_form.fill(data) method to more easily fill fields.