Open reasonableperson opened 4 years ago
We would be happy to add a `-truncate` option for a commercial customer.
The other option is to split on bookmarks, producing serially numbered files (001.pdf, 002.pdf, and so on), and then rename those according to the output of `-list-bookmarks`. But that may be no easier than your suggestion above.
Had a quick look at this, and truncating UTF-8 is not as easy as it seems. A proper approach:
https://metacpan.org/pod/Unicode::Truncate
An easier version, which could break grapheme clusters but at least produces a valid UTF-8 string, is given here:
https://stackoverflow.com/questions/35328529/stdstring-optimal-way-to-truncate-utf-8-at-safe-place
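To make the "easier version" concrete, here is a minimal Python sketch of byte-limited truncation that backs up to a code point boundary. It is only an illustration of the general idea, not how cpdf does or would do it, and the name `truncate_utf8` is made up for this example. As noted above, it can still split a grapheme cluster (for example a base letter and a following combining accent).

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a code point."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    cut = max(max_bytes, 0)
    # UTF-8 continuation bytes have the form 0b10xxxxxx. If the first
    # dropped byte is one of those, the cut is inside a multi-byte
    # sequence, so step back until it lands on a lead byte.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")
```

For instance, `truncate_utf8("héllo", 2)` drops the two-byte `é` entirely rather than emitting half of it.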
Some PDFs contain bookmarks with very long text, like this rather silly one where bookmarks are named after the first 255 characters of text in each paragraph. This means the `@B` parameter cannot be used with `cpdf`'s `-split-bookmarks` command, at least if you want to add a `.pdf` extension to your output or add the bookmark number as a prefix, because most filesystems do not support filenames with a length of over 255 characters.

As a workaround, I am going to try parsing the output of `-list-bookmarks` and using it to repeatedly call `cpdf in.pdf <bookmark-i-page-number>-<bookmark-i+1-page-number> <truncated-filename>`, but that means I am manually reimplementing much of what's already done by `-split-bookmarks`. If there was some way to truncate the result of the `@B` parameter without writing my own script, that'd be great.
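For anyone attempting the same workaround, here is a rough Python sketch of the planning step only. It assumes the bookmark titles and start pages have already been extracted from `-list-bookmarks` (the parsing is left out because the exact output format should be checked against the cpdf manual). The names `plan_split` and `cpdf_commands` are made up for this example, the replacement of `/` with `_` is my own choice, and the use of `-o` for the output file is my understanding of the cpdf command line rather than something stated in this thread.

```python
def plan_split(bookmarks, last_page, max_bytes=255, ext=".pdf"):
    """bookmarks: list of (title, start_page), sorted by page.

    Returns a list of (page_range, filename) where each filename,
    including a numeric prefix and the extension, fits in max_bytes
    of UTF-8.
    """
    plan = []
    for i, (title, start) in enumerate(bookmarks):
        # A section ends on the page before the next bookmark starts.
        end = bookmarks[i + 1][1] - 1 if i + 1 < len(bookmarks) else last_page
        end = max(end, start)  # two bookmarks may share a page
        prefix = f"{i + 1:03d} "
        budget = max_bytes - len(prefix.encode("utf-8")) - len(ext.encode("utf-8"))
        safe = title.replace("/", "_")  # assumption: "/" is the only unsafe character
        # errors="ignore" drops a trailing partial code point left by the slice.
        stem = safe.encode("utf-8")[:budget].decode("utf-8", errors="ignore")
        plan.append((f"{start}-{end}", prefix + stem + ext))
    return plan


def cpdf_commands(infile, plan):
    """Build argument lists for subprocess.run; nothing is executed here."""
    return [["cpdf", infile, pages, "-o", name] for pages, name in plan]
```

This only avoids over-long names; it still truncates by code point rather than by grapheme cluster, and duplicate titles are kept apart only by the numeric prefix.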