Open Lucas-C opened 3 years ago
@Lucas-C this is a great list. Sorry I did not see this topic earlier.
Here is a screenshot from the readout from Adobe's accessibility checker:
This readout is for a PDF that was generated with the following code:
from fpdf import FPDF
import lorem
pdf = FPDF()
pdf.set_title("Sample PDF")
pdf.set_lang("English")
pdf.add_page()
pdf.set_font("Arial", size=10)
# Centered heading line
pdf.cell(180, 10, txt="Welcome to a PDF generated in Python's fpdf2 package", align="C", new_y="NEXT", new_x="LMARGIN")
# Five numbered paragraphs of placeholder text
for i in range(5):
    print(i)
    pdf.multi_cell(180, 10, txt=f"{i+1}) " + lorem.paragraph(), align="L", new_y="NEXT", new_x="LMARGIN")
pdf.output("simple_demo.pdf")
In addition, according to Acrobat there needs to be
Can you try adding the code below to your sample code and see if you still get an error on the title? I haven't researched PDF/A much, but I suspect it requires the metadata to be provided as XMP.
pdf.set_xmp_metadata("""<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">Sample PDF</rdf:li>
</rdf:Alt>
</dc:title>
<dc:language>
<rdf:Bag>
<rdf:li>en-US</rdf:li>
</rdf:Bag>
</dc:language>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>""")
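To avoid hand-editing the XML for each document, the packet above could be produced from a template. This is only a sketch: `build_xmp_packet` and `XMP_TEMPLATE` are hypothetical names, not part of fpdf2, and the template simply mirrors the fields used in the snippet above.

```python
from xml.sax.saxutils import escape

# Same structure as the packet above, with placeholders for title and language.
XMP_TEMPLATE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">{title}</rdf:li>
</rdf:Alt>
</dc:title>
<dc:language>
<rdf:Bag>
<rdf:li>{lang}</rdf:li>
</rdf:Bag>
</dc:language>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>"""


def build_xmp_packet(title: str, lang: str) -> str:
    """Hypothetical helper: fill the template, escaping XML special characters."""
    return XMP_TEMPLATE.format(title=escape(title), lang=escape(lang))


packet = build_xmp_packet("Sample PDF & notes", "en-US")
# The string could then be passed to pdf.set_xmp_metadata(packet), as above.
```

Escaping matters here because a title containing `&` or `<` would otherwise yield a malformed packet.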
I implemented the code as suggested and Adobe still flagged it. I "fixed" the Title and this is the metadata that pikepdf read from it:
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 9.1-c001 79.2a0d8d9, 2023/03/14-11:19:46 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">Sample PDF</rdf:li>
</rdf:Alt>
</dc:title>
<dc:language>
<rdf:Bag>
<rdf:li>en-US</rdf:li>
</rdf:Bag>
</dc:language>
<xmp:ModifyDate>2023-06-01T21:45:58-04:00</xmp:ModifyDate>
<xmp:MetadataDate>2023-06-01T21:45:58-04:00</xmp:MetadataDate>
<xmpMM:DocumentID>uuid:cfac003d-eb66-694a-b654-1c17c505700b</xmpMM:DocumentID>
<xmpMM:InstanceID>uuid:2d2aa7ab-ab4f-1d42-9e8d-8019b057371d</xmpMM:InstanceID>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
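For comparing the packet of a "fixed" file against a flagged one, it may help to pull out just the title and language entries instead of reading the raw XML. The sketch below uses only the standard library and assumes the packet is already available as a string; `read_title_and_languages` is a hypothetical helper name, and the sample is a trimmed copy of the dump above.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XMP packet.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_title_and_languages(xmp: str):
    """Return (title, [languages]) from an XMP packet string.

    The title is None and the list is empty when the entries are missing.
    """
    root = ET.fromstring(xmp)
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    langs = root.findall(".//dc:language/rdf:Bag/rdf:li", NS)
    return (title.text if title is not None else None,
            [li.text for li in langs])


SAMPLE = """<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title><rdf:Alt><rdf:li xml:lang="x-default">Sample PDF</rdf:li></rdf:Alt></dc:title>
<dc:language><rdf:Bag><rdf:li>en-US</rdf:li></rdf:Bag></dc:language>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"""

title, languages = read_title_and_languages(SAMPLE)
```

Running the same helper on both packets would show whether the title and language entries themselves differ, or whether the difference lies elsewhere in the file.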
I tried to copy the metadata from a "corrected" PDF into a problematic PDF using pikepdf, but had no success.
I also opened an issue on the pikepdf tracker: https://github.com/pikepdf/pikepdf/issues/469
I'm opening this issue to track work to ensure PDF/A-compliant documents can be generated using fpdf2.
Wikipedia page about PDF/A: https://en.wikipedia.org/wiki/PDF/A
My current idea would be to provide a get_pdfa_compliance() method that would return None or 'PDF/A-1' depending on several criteria:
not pdf.allow_images_transparency
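A rough sketch of what such a check might look like, written as a standalone function rather than a method so it can be tried without modifying fpdf2. Only the single criterion named above is included; a real implementation would need more. The `SimpleNamespace` object is a stand-in for an `FPDF` instance and is an assumption made purely for illustration.

```python
from types import SimpleNamespace
from typing import Optional


def get_pdfa_compliance(pdf) -> Optional[str]:
    """Return 'PDF/A-1' if every known criterion holds, else None.

    Sketch only: the criteria dict holds just the one condition mentioned
    in this issue; further checks would be added as new entries.
    """
    criteria = {
        "no image transparency": not pdf.allow_images_transparency,
    }
    return "PDF/A-1" if all(criteria.values()) else None


# Stand-in objects exposing only the attribute the check reads.
opaque_doc = SimpleNamespace(allow_images_transparency=False)
transparent_doc = SimpleNamespace(allow_images_transparency=True)

result_opaque = get_pdfa_compliance(opaque_doc)
result_transparent = get_pdfa_compliance(transparent_doc)
```

Keeping the criteria in a named mapping would also make it straightforward to report which condition failed, should that turn out to be useful.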
Feedback & all contributions are welcome on this subject.