Closed by sciurius 1 year ago
Some additional information:
PDF Reference specifications 1.4 through 1.7 explicitly state that the apostrophe after offset hours and minutes is part of the syntax, and hence must be there.
The ISO-approved version of PDF Reference 1.7 specifies no apostrophe after the minutes part.
When PDF::API2 generates a PDF document it starts with "%PDF-1.4" so I'd say the Adobe PDF 1.4 reference is leading here. It says that the trailing apostrophe for both timezone hours and minutes is mandatory.
For practical purposes, I would go for the final apostrophe being optional.
(?:([012345][0-9]\'?) # UT Offset Minutes plus optional apostrophe
I hadn't noticed (or, if I did, had since forgotten) that the Adobe and ISO versions of PDF 1.7 aren't the same. That's unfortunate. In any case, I've been working from the ISO version when adding/updating code. Thanks for making me aware of the differences between the specifications.
My current reading of the ISO versions says that the apostrophe after the offset hour is optional if there isn't an offset minute:
The APOSTROPHE following the hour offset field (HH) shall only be present if the HH field is present. The minute offset field (mm) shall only be present if the APOSTROPHE following the hour offset field (HH) is present.
On that basis, I've made both apostrophes optional, unless both the offset hour and minute are present, in which case there must be an apostrophe between them. I've also added a bunch of tests for the various valid formats and some invalid ones (the code isn't trying to catch all invalid dates, just egregious format errors), just to confirm that all the other variations are working.
So, specifically, these two cases now work but didn't previously:
D:20060102150405-07'
D:20060102150405-07'00'
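The rule described above (both apostrophes optional, except that an apostrophe must separate the offset hour from the offset minute when both are present) can be sketched as a regular expression. This is only an illustration, not the actual PDF::API2 code: the name looks_like_pdf_date is hypothetical, the sketch assumes a full date string through the seconds field, and it treats a bare Z as the only form of a UT marker.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the PDF::API2 implementation. It checks only the
# shape of a full date string (through seconds) plus an optional UT offset.
sub looks_like_pdf_date {
    my ($value) = @_;
    return $value =~ m{
        \A
        (?:D:)?                     # optional "D:" prefix
        \d{4}                       # year
        (?:0[1-9]|1[0-2])           # month
        (?:0[1-9]|[12]\d|3[01])     # day
        (?:[01]\d|2[0-3])           # hour
        [0-5]\d                     # minute
        [0-5]\d                     # second
        (?:
            Z                       # assumption: a bare Z, nothing after it
          |
            [+-]                    # offset direction
            (?:[01]\d|2[0-3])       # offset hours
            (?:
                '                   # apostrophe required before offset minutes
                [0-5]\d             # offset minutes
                '?                  # trailing apostrophe optional
              |
                '?                  # hours only: apostrophe optional
            )
        )?
        \z
    }x ? 1 : 0;
}

for my $date ("D:20060102150405-07'", "D:20060102150405-07'00'", "D:20060102150405-0700") {
    printf "%-26s %s\n", $date, looks_like_pdf_date($date) ? 'ok' : 'invalid';
}
```

The first two strings in the loop are the two cases listed above. The third omits the apostrophe between the offset hour and minute, so this sketch rejects it.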
According to PDF 1.7 section 7.9.4 the (full) format is YYYYMMDDHHmmSSOHH'mm'. Both HH and mm (if present) must be followed by an apostrophe character. Setting a valid date like 20230313194003+01'00' will result in the error "Invalid date string: D:20230313194003+01'00' at ...".
The check on the date format was introduced in 2.042. The regexp in sub _is_date (PDF/API2, around line 555) should be: