Kozea / WeasyPrint

The awesome document factory
https://weasyprint.org
BSD 3-Clause "New" or "Revised" License
6.84k stars 653 forks

PDF/UA accessibility. Labeled strange. #2153

Open marina31714 opened 1 month ago

marina31714 commented 1 month ago

Hello, I'm trying to generate a PDF/UA document from HTML, but the resulting tagging looks strange. Is this labeling correct? Is there any way to modify it? This is the first time I have used your library, and I am very interested in the accessibility part.

I am using Adobe Acrobat Pro to look at the labeling.

Thank you in advance.

HTML:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
    <title>PDF Example</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 50px;
        }
        h1 {
            color: pink;
        }
    </style>
</head>
<body>
    <h1>Hello World</h1>
    <p>Lorem ipsum dolor sit amet consectetur adipiscing elit pellentesque, eros blandit porttitor primis mollis nisi in nunc, ante interdum vestibulum viverra mattis et sociosqu. Faucibus a risus laoreet posuere placerat class tempus vehicula, dignissim congue netus odio potenti phasellus malesuada sodales habitant, egestas id imperdiet sociis vitae taciti curabitur.</p>
</body>
</html>

Python (Flask):

from flask import Flask, render_template, make_response
from weasyprint import HTML

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/pdf')
def generate_pdf():
    HTML('./templates/index.html').write_pdf('test_pdf_ua.pdf', pdf_version='1.6', pdf_variant='pdf/ua-1')
    return "PDF Generated"

if __name__ == '__main__':
    app.run(debug=True)

Result: image

Expected result: image

liZe commented 1 month ago

Hmmm… There’s something strange in these labels. We have to check what’s wrong and try to improve this structure.
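
For anyone who wants a rough look at the tag types without Acrobat, here is a minimal sketch using only the standard library. It is not part of WeasyPrint: the helper name count_struct_types is hypothetical, and the approach is a plain regex scan over the raw bytes, not a real PDF parser. It assumes the structure elements are stored as uncompressed objects (recent WeasyPrint versions appear to offer an uncompressed_pdf option for write_pdf, which may help, but treat that as an assumption to verify) and that each structure element dictionary carries /Type /StructElem.

```python
import re
from collections import Counter

# Matches one indirect object: "<num> <gen> obj ... endobj".
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
# Marks an object as a structure element (assumed to be present).
STRUCT_ELEM_RE = re.compile(rb"/Type\s*/StructElem")
# The /S entry holds the structure type, e.g. /S /P or /S /H1.
STRUCT_TYPE_RE = re.compile(rb"/S\s*/([A-Za-z0-9]+)")


def count_struct_types(pdf_bytes):
    """Count structure types (P, H1, ...) found in uncompressed objects.

    Rough heuristic only: objects stored inside compressed object
    streams are invisible to this scan.
    """
    counts = Counter()
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(1)
        if not STRUCT_ELEM_RE.search(body):
            continue  # not a structure element, e.g. a link action
        found = STRUCT_TYPE_RE.search(body)
        if found:
            counts[found.group(1).decode("ascii")] += 1
    return counts
```

It could be called on the file from the report with something like count_struct_types(open('test_pdf_ua.pdf', 'rb').read()). An empty result most likely means the objects are compressed, not that the file has no tags.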