Closed jozefbaranec closed 3 months ago
@jozefbaranec thanks for reporting this issue. We've already encountered a number of test files with this issue, but always treated it silently as a minor issue, processing both marked content sequences (with identical MCIDs) as if they belong to the same parent in the structure tree.
I agree this issue seems to be more severe, and some other tools process this case differently. So, we'll report this deviation from ISO 32000-2 as a WARNING log message.
Another related issue with the broken structure of marked content sequences is when one sequence is contained in another (with both possibly having different MCIDs). We'll add this check as well, as we see different implementations handling this violation of the spec in inconsistent ways.
The latest dev version adds log warnings for duplicated MCIDs and for cases where one marked content sequence is embedded in another.
Added to the latest veraPDF release, 1.26.
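For readers who want to see what the two checks discussed above amount to, here is a minimal Python sketch. It is not veraPDF's implementation; the function name `check_marked_content` and the warning texts are hypothetical, and the scanner makes simplifying assumptions: the content stream is already decoded to text, property lists are inline dictionaries without nested dictionaries, and string operands never contain the operator names.

```python
import re

# Operators that open (BDC/BMC) or close (EMC) a marked content sequence.
# Assumption: property lists are inline, non-nested dictionaries such as
# "/P <</MCID 23>> BDC"; named property resources are treated as having no MCID.
_TOKEN = re.compile(
    r"/\w+\s*<<(?P<props>[^<>]*)>>\s*BDC"   # tag with inline property dict
    r"|/\w+\s*/\w+\s*BDC"                   # tag with named property resource
    r"|/\w+\s*BMC"                          # tag without properties
    r"|(?P<end>\bEMC\b)"                    # end of sequence
)
_MCID = re.compile(r"/MCID\s+(\d+)")


def check_marked_content(stream: str) -> list[str]:
    """Return warnings for duplicate MCIDs and MCID sequences nested in one another."""
    warnings = []
    seen = set()      # MCIDs already used in this content stream
    open_mcids = []   # stack: MCID (or None) of each currently open sequence
    for match in _TOKEN.finditer(stream):
        if match.group("end"):
            if open_mcids:
                open_mcids.pop()
            continue
        found = _MCID.search(match.group("props") or "")
        mcid = int(found.group(1)) if found else None
        if mcid is not None:
            if mcid in seen:
                warnings.append(f"duplicate MCID {mcid}")
            seen.add(mcid)
            # Nearest enclosing sequence that also carries an MCID, if any.
            outer = next((m for m in reversed(open_mcids) if m is not None), None)
            if outer is not None:
                warnings.append(f"MCID {mcid} nested inside MCID {outer}")
        open_mcids.append(mcid)
    return warnings


# Hypothetical stream shaped like the reported case: MCID 23 used twice,
# with an Artifact in between.
reported = (
    "/P <</MCID 23>> BDC (Hello) Tj EMC\n"
    "/Artifact BMC (footer) Tj EMC\n"
    "/P <</MCID 23>> BDC (World) Tj EMC\n"
)
print(check_marked_content(reported))
```

A real validator would work on parsed operators rather than a regular expression, and would track MCIDs per page or per content stream as the specification requires; the sketch only illustrates the bookkeeping (a set of seen MCIDs plus a stack of open sequences).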
The attached PDF has content with two objects with MCID 23, separated by an Artifact. PDF/UA validation does not report this issue. duplicate-mcid.pdf
the content: