Closed G5W closed 7 months ago
Can you send a pr?
I think that you are asking for a pull request, which I understand to mean a proposed correction to the code. However, I am not a C++ programmer and could not find the part of the code that creates the filename. I am sorry, but I cannot help with fixing the code.
BTW, Above I proposed computing the exact size needed for the index numbers in the file name. I realized later that a simpler solution might be to always use a 9 digit number. That would accommodate up to 1 billion pages and so would be very unlikely to cause problems for anyone.
I am not a C++ programmer either, but I'd also appreciate if this could be fixed.
The described behaviour stems from this line:
https://github.com/ropensci/qpdf/blob/da5d359a464d1abbac3b1ac17345f7d39a406b7d/src/bindings.cpp#L45
int_to_string
receives pages.size()
as the number of digits of the resulting string, when it should for n
pages receive floor(log10(n))+1
(in R-syntax).
(If I had to fix it, I'd #include <math.h>
in bindings.cpp
and use floor/log from that file. But I perhaps Rcpp offers something better here?)
Yet another non-C++ programmer but one of my scripts also just choked on the same problem.
The description from Enrico @enricoschumann looks very useful for me but I also don't know enough C++ to turn this into a fix in the code (and ultimately into a PR).
Jeroen @jeroen, is this something you might be able to have a look again in the not so distant future?
Can you please include example code? It seems to work as intended for me, the filename is postfixed with a number, but not excessive.
> pdf_file
[1] "/var/folders/gz/rwkfg8wx1wqgcr0ccpz42dyc0000gn/T//RtmpZvEBI7/output.pdf"
> pdf_length(pdf_file)
[1] 3
> pdf_split(pdf_file)
[1] "/private/var/folders/gz/rwkfg8wx1wqgcr0ccpz42dyc0000gn/T/RtmpZvEBI7/output_001.pdf"
[2] "/private/var/folders/gz/rwkfg8wx1wqgcr0ccpz42dyc0000gn/T/RtmpZvEBI7/output_002.pdf"
[3] "/private/var/folders/gz/rwkfg8wx1wqgcr0ccpz42dyc0000gn/T/RtmpZvEBI7/output_003.pdf"
Ah I found the issue. It is fixed now, you can install the new version from https://ropensci.r-universe.dev/qpdf. I'll try to release to CRAN soon.
FWIW, if you had included a reprex code it would have been much easier for me to fix this sooner.
Fantastic, Jeroen, thanks for resolving this issue - and apologies for not posting a reproducible example.
BTW: I think this also resolves the issue that had prompted PR https://github.com/ropensci/qpdf/pull/6 which can be closed now IMO. Your solution is nicer.
Submitted qpdf 1.3.3 to cran.
Fantastic, thanks, already live.
great, many thanks!
When pdf_split is used on a file with n pages, it numbers the results file using an n digit number. So a 20 page pdf file gets a name with a 20 digit number on the end. I have to split many files with a large number of pages. Any file with more than 260 pages gets a file name with more than 260 characters. On a windows machine, this causes the program to error out and fail.
To have a unique file number for an n-page file should only require ceiling(log10(n))+1 digits not n digits.
I am using qpdf Version: 1.1 under R version 4.1.3.