mikehaertl / php-pdftk

A PDF conversion and form utility based on pdftk
MIT License
952 stars, 128 forks

Pdf merge fails when more than 1500 files are merged. #306

Closed: kalius closed this issue 1 year ago

kalius commented 1 year ago

Error message: Command unexpectedly terminated without error message

I've tried it directly on the CLI and it works there. The generated file is 100 MB+, and each input file is 50-70 KB.

I am using the latest php-pdftk version, 0.13, on Linux.

mikehaertl commented 1 year ago

It's hard to say what's wrong here, so I cannot provide much help.

When you run it on the CLI, does the command print a lot of output to the terminal?

If so it could be a problem with the way we capture huge outputs. This is done in https://github.com/mikehaertl/php-shellcommand in the block following here:

https://github.com/mikehaertl/php-shellcommand/blob/master/src/Command.php#L434-L438

I spent quite some time finding a solid solution, but I can still imagine that the current implementation is not perfect. So I suggest studying that code and playing around with it. You could also try disabling non-blocking mode. Something like:

$pdf = new Pdf('x.pdf', ['nonBlockingMode' => false]);

But this may reintroduce the problems with hanging processes.

mikehaertl commented 1 year ago

Oh and one more thing: Does it work for smaller batches or simple operations? Just to ensure that you got the basic configuration right.

kalius commented 1 year ago

Yes, it works perfectly up to ~1300 files. It is probably related to the exec or proc_open call, since the generated command gets quite huge.

I've tested on the CLI with the command pdftk *.pdf cat output mergedfiles.pdf and it worked fine for all files in the folder.
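One way to check the "huge command" suspicion is to measure the command string before it is run. The sketch below is only a rough estimate: `shellCommandLength()` is a hypothetical helper, it mimics the plain CLI form `pdftk <files> cat output <file>` rather than the exact command php-pdftk generates, and the constant assumes a Linux system with 4 KiB pages, where a single argument string is limited to 131072 bytes. That limit would matter here because a command passed to proc_open() as a string is run through `sh -c`, so the whole command becomes one argument. Nothing in this thread confirms that this is the actual cause.

```php
<?php
// Assumption: Linux per-argument limit (32 pages of 4 KiB each).
const ASSUMED_MAX_ARG_STRLEN = 131072;

/**
 * Estimate the length of a shell command of the form
 *   <binary> <file1> <file2> ... cat output <output>
 * with every path shell-escaped.
 *
 * @param string[] $files input PDF paths
 */
function shellCommandLength(string $binary, array $files, string $output): int
{
    $escaped = array_map('escapeshellarg', $files);
    $command = $binary . ' ' . implode(' ', $escaped)
        . ' cat output ' . escapeshellarg($output);

    return strlen($command);
}
```

If the result for the real file list is close to or above the assumed limit, the command length is a plausible suspect; if it is far below, the problem is more likely elsewhere.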

kalius commented 1 year ago

Okay, I've now also tried the full command (full length) directly on the CLI and it also worked.

Setting nonBlockingMode to false makes no difference...

mikehaertl commented 1 year ago

Hmm, yeah, sounds like there could be a limit on the length of the command argument for proc_open(). But I couldn't find anything. If you have the time, you could try calling proc_open() with the command as an array of args in your own script. I don't know if it makes a difference.

Another hacky workaround: split the merge up into several batches. It's not nice, but at least it should then work.
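A minimal sketch of the array-of-args experiment, assuming PHP 7.4 or later (the first version where proc_open() accepts an array). In that form PHP starts the program directly without a shell, so the command is not handed over as one long string. `runArgv()` is a hypothetical helper name, not part of php-pdftk or php-shellcommand. Whether this avoids the failure in this issue is untested.

```php
<?php
/**
 * Run a command given as an argument list (requires PHP 7.4+).
 * No shell is involved, so each argument reaches the program as-is.
 *
 * @param string[] $argv program followed by its arguments
 * @return array exit code, stdout, stderr (in that order)
 */
function runArgv(array $argv): array
{
    // stderr goes to a temporary file so a full pipe cannot stall the child
    $stderr = tmpfile();
    if ($stderr === false) {
        throw new RuntimeException('Could not create temporary file for stderr');
    }

    $spec = [
        0 => ['pipe', 'r'], // child's stdin
        1 => ['pipe', 'w'], // child's stdout
        2 => $stderr,       // child's stderr
    ];

    $process = proc_open($argv, $spec, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start process');
    }

    fclose($pipes[0]);                     // nothing to send on stdin
    $out = stream_get_contents($pipes[1]); // read until the child closes stdout
    fclose($pipes[1]);
    $exitCode = proc_close($process);      // waits for the child to exit

    rewind($stderr);
    $err = stream_get_contents($stderr);
    fclose($stderr);

    return [$exitCode, (string) $out, (string) $err];
}
```

For the merge in this thread, the argument list would be `pdftk`, followed by each input path as its own element, followed by `cat`, `output` and the target file, for example `runArgv(array_merge(['pdftk'], $files, ['cat', 'output', 'mergedfiles.pdf']))` where `$files` is your own list of paths.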
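The batch workaround could look roughly like the sketch below. It merges the inputs in chunks into intermediate files and then merges those, repeating until a single merge call fits within the batch size. `mergeInBatches()` and its parameters are hypothetical names; the actual merge is delegated to a callable you supply, so the sketch makes no assumption about how php-pdftk or pdftk is invoked.

```php
<?php
/**
 * Merge many files by repeatedly merging chunks of at most $batchSize files.
 *
 * @param string[] $files     input paths, in the desired order
 * @param string   $output    path of the final merged file
 * @param int      $batchSize maximum number of inputs per merge call (>= 2)
 * @param callable $merge     function (array $inputs, string $target): void
 * @param string   $tmpDir    writable directory for intermediate files
 */
function mergeInBatches(array $files, string $output, int $batchSize, callable $merge, string $tmpDir): void
{
    if ($batchSize < 2) {
        throw new InvalidArgumentException('Batch size must be at least 2');
    }
    if ($files === []) {
        throw new InvalidArgumentException('No input files given');
    }

    $prefix = rtrim($tmpDir, '/') . '/merge-' . uniqid('', true);
    $temps = [];
    $round = 0;

    // Each round shrinks the list to ceil(n / batchSize) intermediate files.
    while (count($files) > $batchSize) {
        $next = [];
        foreach (array_chunk($files, $batchSize) as $i => $chunk) {
            $part = sprintf('%s-%d-%d.pdf', $prefix, $round, $i);
            $merge($chunk, $part);
            $next[] = $part;
            $temps[] = $part;
        }
        $files = $next;
        $round++;
    }

    // Final merge of the remaining (original or intermediate) files.
    $merge($files, $output);

    // Remove intermediate files.
    foreach ($temps as $temp) {
        if (is_file($temp)) {
            unlink($temp);
        }
    }
}
```

The `$merge` callable might, for instance, run pdftk with the batch's paths followed by `cat output` and the target path, the same form as the CLI command quoted earlier in the thread. A batch size well below the point where the single merge starts failing (around 1300 files here) leaves some headroom.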

kalius commented 1 year ago

Thanks for the suggestion. I'll implement merging in batches and leave debugging proc_open for later.

We can probably close this issue for now.