PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

PDF::API2->open() holds a filesystem lock until the returned object goes out of scope #170

Open PhilterPaper opened 2 years ago

PhilterPaper commented 2 years ago

(opened by chrispitude on 28 July 2021) ssimms/pdfapi2#34

This is a weird bug that I see on Windows (Strawberry Perl) but not on Linux (native Perl).

When I open a PDF file with PDF::API2->open(), somehow there is a lingering filesystem lock that prevents me from deleting the input file until the returned PDF object goes out of scope.

For example, this code:

    use PDF::API2;
    my $pdf = PDF::API2->open('test.pdf');
    unlink 'test.pdf' or warn "Could not unlink 'test.pdf': $!";

results in the following error:

    Could not unlink 'test.pdf': Permission denied at bad.pl line 12.

But if I undefine $pdf (or force it to go out of scope):

    my $pdf = PDF::API2->open('test.pdf');
    $pdf = undef;
    unlink 'test.pdf' or warn "Could not unlink 'test.pdf': $!";

then the unlink operation on the input file succeeds.
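The same effect can be had by confining the PDF object to a narrow scope instead of assigning undef by hand. The following is only a sketch: the helper name count_pages_then_release is hypothetical, and it relies on the behavior reported above (the handle is released once the object is destroyed) rather than on anything the library documents.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical helper (not part of PDF::API2): open the PDF inside a sub,
# take what is needed, and let the object go out of scope on return.
# Assumption: as reported in this issue, the input file is released once
# the object is destroyed.
sub count_pages_then_release {
    my ($file) = @_;
    my $pdf   = PDF::API2->open($file);
    my $count = $pdf->pages();    # number of pages in the document
    return $count;                # $pdf goes out of scope here
}
```

A caller would then run count_pages_then_release('test.pdf') and call unlink afterwards, at which point no PDF object referring to the file is still alive.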

I tried the following versions, and the bug occurs with all of them:

Tiny testcase at: testcase.tar.gz

PhilterPaper commented 2 years ago

(reply by ssimms on 30 July 2021)

I think that might be working as intended. Since 2.039, PDF::API2 no longer reads the entire file into memory, in order to lower memory usage on large files. Instead, it keeps the file open so that it can access parts of the file as needed.

If you need to work around this, you can stringify the PDF and open the string instead:

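    # Trades memory for independence from the input file: the returned
    # object is built from an in-memory copy of the whole PDF.
    sub memory_is_not_a_problem {
        my $file = shift();
        my $pdf = PDF::API2->open($file);
        return PDF::API2->open_scalar($pdf->stringify());
    }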

As for why it's working on Linux but not Windows, I know that Linux filesystems tend to allow programs to continue to access deleted files as long as they previously had them open. Perhaps Windows filesystems prefer to ensure that a file is actually deleted as soon as the delete call happens, failing if that's not possible.
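The platform difference described above can be checked without PDF::API2 at all. Here is a minimal sketch using only core modules; the helper name unlink_with_open_handle is made up for illustration. It tries to delete a file while a read handle on it is still open, then closes the handle and retries if needed. On Linux the first unlink is expected to succeed; on Windows it is expected to fail, typically with "Permission denied".

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: attempt to delete a file while a read handle on it
# is still open, then again (if necessary) after the handle is closed.
# Returns (deleted_while_open, error_text, deleted_in_the_end).
sub unlink_with_open_handle {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";

    my $while_open = unlink($path) ? 1 : 0;
    my $error      = $while_open ? '' : "$!";

    close($fh);

    # Retry only if the first attempt failed.
    my $in_the_end = $while_open || (unlink($path) ? 1 : 0);
    return ($while_open, $error, $in_the_end);
}

# Create a small scratch file to experiment on.
my ($tmp_fh, $tmp_path) = tempfile();
print {$tmp_fh} "dummy\n";
close($tmp_fh);

my ($while_open, $error, $in_the_end) = unlink_with_open_handle($tmp_path);
print "unlink while open:  ", ($while_open ? "ok" : "failed ($error)"), "\n";
print "unlink after close: ", ($in_the_end ? "ok" : "failed"), "\n";
```

If the first line reports a failure and the second reports success, the behavior matches what is described in this issue: the open handle, not the PDF object as such, is what blocks the delete.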

PhilterPaper commented 2 years ago

I haven't tried this one yet. As the partial read of large files is not yet implemented in PDF::Builder, this may not (yet) be applicable. I will keep this one in mind, should I implement the large file partial read.