GoyaPtyLtd / BaseElements-Plugin

FileMaker Pro plugin used for BaseElements to provide file, dialog and XSLT functions.
http://www.goya.com.au/baseelements/plugin

Error/Crash with function BE_PDFPageCount after being called multiple times on a large PDF #224

Open bowdendata opened 2 months ago

bowdendata commented 2 months ago

Describe the bug: In a script running as PSOS (Perform Script on Server), we build a PDF report in the Windows TEMP folder. We create an initial PDF with the FM script step Save Records As PDF, then loop, calling Save Records As PDF with the Append option. On each pass, after appending to the PDF, we call the BE_PDFPageCount ( pdfPath ) function. During the 251st loop, the FM Server scripting engine crashes with a 701 error.

At the time the scripting engine fails, the PDF has right around 757 pages in it and it is about 32MB in size.

To Reproduce We need to know :

  1. Which functions you called: BE_PDFPageCount ( pdfPath )
  2. What were the parameter values you used: pdfPath is a file reference to the PDF in the Windows TEMP directory.
  3. What the output was: Prior to the crash, the plugin call was returning the page count properly.
  4. Other potential debug information: When the FM Server scripting engine terminates, we have a "DMP" file generated.

Expected behavior: The plugin is able to read the page count for PDFs that are very large in size.

nickorr commented 2 months ago

Doug,

Thanks for all that info. Do you have any completed pdfs you can test with? I'm wondering if the issue is due to the size of the file being beyond what the PDF library can cope with, or whether it's an issue with being called that many times.

Are you able to test the page count function on the same environment ( via FMS etc ) on one of the very large files?

Alternatively are you able to generate the page count less often, or before the append?

Or I may be able to come up with other command line ways of doing that.
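As an illustration only of what a command-line page count outside the plugin might look like (this is not something proposed in the thread), here is a rough Python sketch. It counts leaf `/Type /Page` objects in the raw bytes of the file. The function names are hypothetical, and the approach is a heuristic: it assumes page objects are stored uncompressed, so it can under-count PDFs that use object streams and can over-count files that retain stale objects after incremental updates.

```python
import re
from pathlib import Path

# Matches a leaf page object's type entry ("/Type /Page" or "/Type/Page").
# The trailing \b rejects "/Type /Pages", which marks page-tree nodes.
_PAGE_OBJECT = re.compile(rb"/Type\s*/Page\b")


def count_pages_heuristic(data: bytes) -> int:
    """Return the number of '/Type /Page' entries found in raw PDF bytes.

    Heuristic only: assumes page objects are not inside compressed
    object streams.
    """
    return len(_PAGE_OBJECT.findall(data))


def count_pages_in_file(path: str) -> int:
    """Read the whole file into memory and apply the heuristic above."""
    return count_pages_heuristic(Path(path).read_bytes())
```

Calling `count_pages_in_file` with the path to a PDF returns the heuristic count. The result should be checked against a known-good page count before relying on it.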

At the moment we're in a big rebuild process of the base podofo library, so that may fix a bug but will take a while to complete.

Cheers, Nick

bowdendata commented 2 months ago

Hi Nick,

We did some more testing with altered code and found the following:

Running the script on our Windows Server as PSOS (like before)

  1. Generate an initial, smaller PDF of a set of records using built-in Save Records As PDF to the TEMP folder.
  2. Get the page count of the PDF just generated using BE Get Page Count.
  3. Loop
       - Generate another PDF of a set of records using built-in Save Records As PDF to the TEMP folder.
       - Get the page count of the PDF just generated using BE Get Page Count. These files averaged a couple of pages each.
       - Use the BE Append PDF function to add the PDF just generated to the initial PDF.
  4. End Loop
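The point of the steps above is that the page count is only ever taken on small files. A minimal Python sketch of that pattern follows, assuming the per-file counts are summed into a running total (the thread does not say what the counts are used for). `total_pages` and the `count_pages` callback are hypothetical names; in the real script the callback would be the plugin's page count call.

```python
from typing import Callable, Iterable


def total_pages(
    initial_pdf: str,
    chunk_pdfs: Iterable[str],
    count_pages: Callable[[str], int],
) -> int:
    """Sum page counts of the initial PDF and each appended chunk.

    The combined, ever-growing PDF is never passed to count_pages.
    """
    total = count_pages(initial_pdf)
    for chunk in chunk_pdfs:
        # Count only the small, freshly generated file.
        total += count_pages(chunk)
        # The append step (BE Append PDF in the thread) would happen here.
    return total
```

With this shape, the size of the file passed to the page count call stays roughly constant no matter how many times the loop runs.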

One report we ran looped about 550 times and another one way more than that. The BE Get Page Count call did not cause any errors nor did the BE Append PDF function. The first PDF mentioned had 1149 pages and was about 42MB in size. The second one was over 2000 pages but I didn’t see how big it was (my colleague was doing the work).

We haven’t had the chance to try running the BE Get Page Count function at the end of the script after getting out of the loop. I suspect it will fail, but of course, not sure.

So based on this testing, I think it is not an issue with the number of times the BE Get Page Count function is called, but rather a size issue. When we were using the BE Get Page Count function against the full PDF as it was being built, it died consistently right after the PDF reached 757 pages.

I hope this helps. Let me know if you would like us to try anything else. I will see about asking my colleague to try the Page Count against the 1149 page report to confirm that it quits in that scenario.

Regards, Doug


Douglas de Stwolinska @.***


nickorr commented 2 months ago

We're on a fairly old release of the podofo library, but it seems like there might be a point release that fixes a memory allocation bug on large PDF files:

https://sourceforge.net/p/podofo/code/2038/

As best I can tell, this is fixed in 0.9.8 and we're on 0.9.7, so I may be able to update to that.

( Podofo switched to a new API structure for 0.10 and now all our code needs to be re-written, so that will be a lot of work, but the bug fix is before that change I think. )

I don't know when I can have a new build ready for windows though, I've been working on the library re-compiles for Mac and Linux, and haven't started on Windows yet ... If you know any C coding people with Windows Visual Studio experience, let me know.

Cheers, Nick