digitalfabrik / integreat-cms

Simplified content management back end for the Integreat App - a multilingual information platform for newcomers
https://digitalfabrik.github.io/integreat-cms/
Apache License 2.0

PDF export not working on specific pages in Arabic & Farsi #1498

Open timobrembeck opened 2 years ago

timobrembeck commented 2 years ago

Describe the Bug

Steps to Reproduce

https://admin.integreat-app.de/marburg-biedenkopf/fa/wp-json/ig-mpdf/v1/pdf/

Expected Behavior

The PDF should be shown

Actual Behavior

An internal server error occurs (AttributeError: 'ParaLines' object has no attribute 'lineBreak')

Additional Information

Update: Fixed upstream (see https://github.com/xhtml2pdf/xhtml2pdf/pull/643), so wait until a new release is triggered and update xhtml2pdf to fix the issue.

When updating, we can also remove this workaround: https://github.com/digitalfabrik/integreat-cms/blob/d74612683bdc1ab2ad4c9933965911002ae15c6b/integreat_cms/cms/utils/pdf_utils.py#L114-L115

Traceback

```
    self.handle_flowable(flowables)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/doctemplate.py", line 928, in handle_flowable
    if frame.add(f, canv, trySplit=self.allowSplitting):
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/frames.py", line 212, in _add
    flowable.drawOn(canv, self._x + self._leftExtraIndent, y, _sW=aW-w)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/flowables.py", line 112, in drawOn
    self._drawOn(canvas)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/flowables.py", line 93, in _drawOn
    self.draw()#this is the bit you overload
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/xhtml2pdf_reportlab.py", line 684, in draw
    Paragraph.draw(self)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/reportlab_paragraph.py", line 1144, in draw
    self.drawPara(self.debug)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/reportlab_paragraph.py", line 1641, in drawPara
    dpl(tx, offset, lines[0], noJustifyLast and nLines == 1)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/reportlab_paragraph.py", line 405, in _justifyDrawParaLineX
    if last or not nSpaces or abs(extraSpace) <= 1e-8 or line.lineBreak:
AttributeError: 'ParaLines' object has no attribute 'lineBreak'
```

timobrembeck commented 1 year ago

The error also appears on some Arabic pages:

  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/integreat_cms/cms/utils/pdf_utils.py", line 115, in generate_pdf
    pisa_status = pisa.CreatePDF(
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/document.py", line 155, in pisaDocument
    doc.multiBuild(context.story)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/doctemplate.py", line 1169, in multiBuild
    self.build(tempStory, **buildKwds)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/doctemplate.py", line 1082, in build
    self.handle_flowable(flowables)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/doctemplate.py", line 931, in handle_flowable
    if frame.add(f, canv, trySplit=self.allowSplitting):
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/frames.py", line 212, in _add
    flowable.drawOn(canv, self._x + self._leftExtraIndent, y, _sW=aW-w)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/flowables.py", line 112, in drawOn
    self._drawOn(canvas)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/reportlab/platypus/flowables.py", line 93, in _drawOn
    self.draw()#this is the bit you overload
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/xhtml2pdf_reportlab.py", line 684, in draw
    Paragraph.draw(self)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/reportlab_paragraph.py", line 1144, in draw
    self.drawPara(self.debug)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/reportlab_paragraph.py", line 1641, in drawPara
    dpl(tx, offset, lines[0], noJustifyLast and nLines == 1)
  File "/opt/integreat-cms/.venv/lib/python3.9/site-packages/xhtml2pdf/reportlab_paragraph.py", line 405, in _justifyDrawParaLineX
    if last or not nSpaces or abs(extraSpace) <= 1e-8 or line.lineBreak:
AttributeError: 'ParaLines' object has no attribute 'lineBreak'
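
For illustration, the last frame boils down to reading an attribute that this particular line object does not have. Below is a minimal sketch with a stand-in class (not the real xhtml2pdf/reportlab types, and not necessarily how an upstream fix would handle it). The helper names `strict_should_skip_justify` and `tolerant_should_skip_justify` are hypothetical; only the condition itself is taken from the traceback.

```python
class ParaLines:
    """Stand-in for the object named in the traceback.

    Assumption: like the object that triggers the error, it simply
    has no ``lineBreak`` attribute.
    """

    def __init__(self, words):
        self.words = words


def strict_should_skip_justify(line, last, n_spaces, extra_space):
    # Same condition as in the traceback: direct attribute access, so an
    # object without ``lineBreak`` raises AttributeError once the earlier
    # operands are all falsy.
    return bool(last or not n_spaces or abs(extra_space) <= 1e-8 or line.lineBreak)


def tolerant_should_skip_justify(line, last, n_spaces, extra_space):
    # Defensive variant. Assumption: a missing ``lineBreak`` is treated
    # as "no forced line break".
    return bool(
        last
        or not n_spaces
        or abs(extra_space) <= 1e-8
        or getattr(line, "lineBreak", False)
    )


if __name__ == "__main__":
    line = ParaLines(["some", "words"])
    print(tolerant_should_skip_justify(line, False, 3, 2.0))
    try:
        strict_should_skip_justify(line, False, 3, 2.0)
    except AttributeError as exc:
        print(f"strict variant failed: {exc}")
```

Because of short-circuit evaluation, the strict variant only fails when the line is not the last one, contains spaces, and has non-zero extra space. That would be consistent with the error appearing only on specific pages.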
MizukiTemma commented 1 year ago

I've tested PDF export for some of the non-working Farsi pages on the current develop branch, and it works.

The tested pages are the first (عنوان اقامت شهروندان اتحادیه اروپا و کشورهای ثالث) and last (شهروندان اتحادیه اروپا فاقد شغل) under the last parent page with the world map icon (شهروندان اتحادیه اروپا و اتباع کشور های ثالث) under the ID card icon (مسائل حقوقی: پناهندگی، اتحادیه اروپا و کشورهای ثالث) in Marburg-Biedenkopf. These pages cannot be exported as PDF successfully, neither individually nor together with other pages.

However, when I copy and paste the contents into the current develop branch and export them as PDF, it works.

[Screenshots: page overview, export as bundle, exported individually]

charludo commented 1 year ago

Looks like it's an xhtml2pdf bug. Opened this issue: xhtml2pdf/xhtml2pdf#642 and corresponding PR upstream. I don't think there's anything further we need to do other than wait until the fix makes it into a release.

IDK if this issue should remain open until then?

timobrembeck commented 1 year ago

@charludo Thank you so much!

Hmm, I guess in this case let's leave it open until the problem is fixed for us by updating the library.

osmers commented 4 months ago

Any updates here? What are we waiting for?

timobrembeck commented 4 months ago

@osmers the problem has been fixed in the upstream library xhtml2pdf, but these fixes have introduced a number of even worse bugs (for example, page numbering doesn't work anymore and the text lines are reversed in right-to-left scripts), so we decided not to update the library until the new bugs have been fixed as well.

osmers commented 4 months ago

@timobrembeck ah alright, thanks for the explanation - I didn't get that from reading the comments :) hopefully it will be fixed soon!