
Syslifters / sysreptor

Fully customisable, offensive security reporting solution designed for pentesters, red teamers and other security-related people alike.

https://docs.sysreptor.com

PDF Character encoding problem #316

Open c3sro opened 2 weeks ago

c3sro commented 2 weeks ago

Some strings, such as ff, come out as e.g. 昀昀 when copied from the final PDF file, although they are still displayed correctly as ff.

Steps to reproduce:

1. Enter text in some chapter:

  Test

  ff

  fi

  xxxffxxx

  xxxfixxx

2. Switch to Publish to generate the PDF (the preview still looks good; copy and paste work as expected).
3. Download the PDF and open it in a browser or PDF editor.

Expected Result: Text copied from the PDF is exactly the same as entered previously.

Actual Result: Some characters don't match previously entered values.

Observed transformations: ff -> 昀昀, fi -> 昀椀
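
The garbled characters follow a regular pattern: 昀 is U+6600 and 椀 is U+6900, while f is U+0066 and i is U+0069, so each code point looks as if it had been shifted left by one byte. The Python sketch below only illustrates that pattern. The helper names `garble` and `repair` are hypothetical, and nothing here is taken from Ghostscript or SysReptor; it is an observation about the copied text, not a description of the underlying bug.

```python
def garble(text: str) -> str:
    """Shift each code point left by one byte, e.g. U+0066 -> U+6600.

    Assumes ASCII input, which matches the strings reported in this issue.
    """
    return "".join(chr(ord(ch) << 8) for ch in text)


def repair(text: str) -> str:
    """Heuristically undo the shift for code points whose low byte is zero.

    This would also rewrite legitimate characters such as U+4E00, so it is
    only suitable for inspecting text that is known to be affected.
    """
    return "".join(
        chr(ord(ch) >> 8) if ord(ch) > 0xFF and ord(ch) & 0xFF == 0 else ch
        for ch in text
    )


if __name__ == "__main__":
    print(garble("ff"))           # reproduces the first observed transformation
    print(garble("fi"))           # reproduces the second observed transformation
    print(repair("xxx昀椀xxx"))    # recovers the text that was originally entered
```

Running `repair` over text copied from an affected PDF can help confirm whether a report matches this issue, but it is not a fix for the PDF itself.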

MWedl commented 2 weeks ago

Thanks for reporting. We will investigate this.

aronmolnar commented 1 week ago

To save time, PDFs that are displayed in the "Publish" menu are not compressed. When you download the PDF, a postprocessing step reduces the file size using Ghostscript.

The error that you observed is caused by a bug in Ghostscript. It has since been fixed, but the fix is not yet available in the Debian repos (from which we currently install Ghostscript).

As a workaround, you can disable the postprocessing step by setting COMPRESS_PDF=false in your app.env.
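
The workaround amounts to a single `COMPRESS_PDF=false` line in app.env. For scripted deployments, a minimal Python sketch for setting it could look like the following. The helper name `set_env_flag` is hypothetical, the location of app.env depends on your installation, and the sketch assumes simple `KEY=value` lines.

```python
from pathlib import Path


def set_env_flag(env_path: str, key: str, value: str) -> None:
    """Set key=value in a dotenv-style file.

    Replaces the first existing assignment of `key`, drops later duplicates,
    and appends the assignment if the key is not present. Comment lines and
    other keys are left untouched.
    """
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []

    result = []
    replaced = False
    for line in lines:
        name = line.split("=", 1)[0].strip()
        if name == key:
            if not replaced:
                result.append(f"{key}={value}")
                replaced = True
            # Later duplicates of the same key are skipped.
        else:
            result.append(line)

    if not replaced:
        result.append(f"{key}={value}")

    path.write_text("\n".join(result) + "\n")


if __name__ == "__main__":
    # Assumed path: adjust to wherever your app.env actually lives.
    set_env_flag("app.env", "COMPRESS_PDF", "false")
```

Editing the file by hand is equivalent; the sketch is only convenient when the setting has to be applied repeatedly.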

We will check what other installation options we have to resolve the bug.