vishnukottala / flexpaper

Automatically exported from code.google.com/p/flexpaper

Long documents: Split-page mode doesn't help server-side conversion bottlenecks #238

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
With the current implementation of split-page mode, the first page request runs 
the entire conversion process for all of the pages before returning.  Because 
the pdf2swf conversion can take a long time on large documents, it remains the 
bottleneck: users still have to wait for it even in split-page mode.

I've attached modifications I made to pdf2swf_php5.php to fix this issue. They 
greatly reduce the time the user has to wait, at least before seeing the first 
few pages.  The script forks a child process to do the conversion and returns 
as soon as the first page is converted.  It does require the PCNTL extension 
for PHP, so it might be best to integrate this as a flaggable option.
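
As a rough illustration of the idea (not the attached patch itself), the 
fork-and-return pattern with a flag and a non-PCNTL fallback could be sketched 
as below. The function name, the $useFork flag, and the $convertPages callable 
(which would wrap the real pdf2swf call for a page range) are all hypothetical, 
and this variant converts page 1 in the parent rather than waiting on the 
child. The scalar type hints assume PHP 7 or later.

```php
<?php
/**
 * Sketch only. $convertPages is a hypothetical callable (from, to) that runs
 * the actual pdf2swf command for that page range.
 * Returns the child's PID, or 0 when everything was converted synchronously.
 */
function convert_first_page_then_rest(callable $convertPages, int $numPages, bool $useFork = true): int
{
    // Convert page 1 in the foreground so it can be served right away.
    $convertPages(1, 1);
    if ($numPages < 2) {
        return 0;
    }

    // Flaggable option: without PCNTL (or with the flag off), block as before.
    if (!$useFork || !function_exists('pcntl_fork')) {
        $convertPages(2, $numPages);
        return 0;
    }

    $pid = pcntl_fork();
    if ($pid === -1) {
        // Fork failed; convert the remaining pages in this process instead.
        $convertPages(2, $numPages);
        return 0;
    }
    if ($pid === 0) {
        // Child: convert the remaining pages, then exit without returning.
        $convertPages(2, $numPages);
        exit(0);
    }

    // Parent: return at once; the caller flushes its output and then
    // reaps the child with pcntl_wait().
    return $pid;
}
```

A non-zero return value tells the caller that a child is still converting 
pages and needs to be waited on once the response has been sent.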

In addition to the attached changes, the viewer PHP code also needs to close 
off the output while the child process is still running:

echo file_get_contents($swfFilePath);

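// force-close the pipe, as a child process running the processing
// may still be executing; flush every open output buffer exactly once
while (ob_get_level() > 0) ob_end_flush();
flush();
// release the session lock so other requests from this user aren't blocked
if (session_id()) session_write_close();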

// wait for the child process to complete
pcntl_wait($status);

Original issue reported on code.google.com by kmewh...@gmail.com on 26 Nov 2011 at 7:24

GoogleCodeExporter commented 9 years ago
Great, thanks for your contribution! Just a question: how do you determine the 
number of pages with this approach (before the conversion has finished)?

Original comment by erik.eng...@devaldi.com on 28 Nov 2011 at 1:49

GoogleCodeExporter commented 9 years ago
In my present use case, I actually know in advance how many pages there are in 
the PDF, so this doesn't pose a problem; however, I use the following function 
to get the number of pages in other contexts:

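function pdf_numpages($filename){
    // escape the filename so spaces or shell metacharacters can't break the command
    exec('pdftk ' . escapeshellarg($filename) . ' dump_data | grep -i NumberOfPages', $output);
    // pdftk prints a line such as "NumberOfPages: 12"; capture the trailing digits
    if (empty($output) || !preg_match('/\d+\s*\Z/', $output[0], $matches)) {
        return 0;
    }
    return (int) $matches[0];
}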

This relies on pdftk, so it would add a dependency on that tool (though there 
is a Windows version of pdftk too)...

Original comment by kmewh...@gmail.com on 28 Nov 2011 at 2:15

GoogleCodeExporter commented 9 years ago
I have added a PHP function for this in our upcoming version, so counting the 
number of pages no longer depends on the same approach as before. I will also 
look at incorporating the suggested changes in the next version!

Thanks again for your contribution! 

Original comment by erik.eng...@devaldi.com on 2 Dec 2011 at 12:30

GoogleCodeExporter commented 9 years ago
And to add to this: we're counting the number of pages without depending on 
pdftk.
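
The thread does not show how the new version counts pages. Purely as an 
illustration of what a pdftk-free count can look like, one common heuristic is 
to count the "/Type /Page" objects in the raw PDF bytes. The function name 
below is hypothetical, and the heuristic is an assumption rather than 
FlexPaper's actual method; it can undercount PDFs that store their page 
objects in compressed object streams.

```php
<?php
// Heuristic page count: counts "/Type /Page" dictionary entries in the raw
// PDF bytes. The lookahead skips the "/Type /Pages" tree nodes.
function pdf_numpages_heuristic(string $pdfBytes): int
{
    $count = preg_match_all('~/Type\s*/Page(?![A-Za-z])~', $pdfBytes, $matches);
    // preg_match_all returns false on error; treat that as "unknown" (0).
    return $count === false ? 0 : $count;
}
```

It would be called as pdf_numpages_heuristic(file_get_contents($pdfFilePath)), 
where $pdfFilePath is whatever path the viewer script already has for the 
source PDF.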

Original comment by erik.eng...@devaldi.com on 2 Dec 2011 at 12:31

GoogleCodeExporter commented 9 years ago
Fixed in the latest version of FlexPaper (1.5.0).

Original comment by erik.eng...@devaldi.com on 29 Dec 2011 at 5:19