Altinn / altinn-pdf

Altinn platform microservice for generating PDFs

PDF generation does not support Sámi characters #19

Closed: allinox closed this issue 2 years ago

allinox commented 2 years ago

Description

Our app aims to support several Sámi languages, but when testing with a set of different Sámi characters, the generated pdf appears to have dropped several of them.

To Reproduce

1. Use a text field in an app.
2. Enter the following text:

   Nordsamisk: Áá Čč Đđ Ŋŋ Šš Ŧŧ Žž
   Enaresamisk: Áá Ââ Ää Čč Đđ Šš Žž
   Skoltesamisk: Áá Ââ Čč Ʒʒ Ǯǯ Đđ Ǧǧ Ǥǥ Ǩǩ Ŋŋ Õõ Šš Žž Åå Ää
   Lulesamisk i Norge: Áá Ŋŋ Åå Ææ
   Lulesamisk i Sverige: Áá Ŋŋ Åå Ää
   Umesamisk: Áá Đđ Ïï Ŋŋ Ŧŧ Úú Åå Ää Öö
   Sørsamisk: Ïï Ææ Öö Åå

3. Hit "send" to trigger pdf generation.
4. Open the pdf.

Expected behavior

I'd expect the pdf to contain the entirety of the text entered, with the correct characters. However, it seems like the generation process doesn't support several of the above characters and instead simply drops them.

Screenshots

image

altinnadmin commented 2 years ago

We're using UTF-8 everywhere, so encoding should be fine.

The missing support is probably related to this code. It looks like we're using one of the old Type 1 fonts from PDFBox.

https://helpx.adobe.com/fonts/kb/postscript-type-1-fonts-end-of-support.html

Type 1 fonts (also known as PostScript, PS1, T1, Adobe Type 1, Multiple Master, or MM) are a deprecated format within the font industry, replaced by the larger glyph sets and more robust technical possibilities of OpenType format fonts.
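To make the suspected cause concrete, here is a minimal sketch that lists which characters of an input string fall outside a single-byte Western repertoire. It rests on two assumptions that the thread does not confirm: that the Type 1 font in question is used with WinAnsiEncoding, and that Java's `windows-1252` charset is a close enough stand-in for that encoding. The class and method names (`GlyphCoverageCheck`, `unsupported`) are hypothetical and are not part of altinn-pdf or PDFBox. The sample string uses Unicode escapes so the source file compiles regardless of its encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of altinn-pdf or PDFBox.
class GlyphCoverageCheck {

    // Assumption: windows-1252 approximates the WinAnsiEncoding repertoire.
    private static final Charset WIN_ANSI_APPROX = Charset.forName("windows-1252");

    /** Returns, in input order, the characters the approximated encoding cannot represent. */
    static String unsupported(String text) {
        CharsetEncoder encoder = WIN_ANSI_APPROX.newEncoder();
        StringBuilder missing = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                missing.appendCodePoint(cp);
            }
        });
        return missing.toString();
    }

    public static void main(String[] args) {
        // The Nordsamisk sample line from the reproduction steps, written as escapes.
        String northernSami = "\u00C1\u00E1 \u010C\u010D \u0110\u0111 \u014A\u014B "
                + "\u0160\u0161 \u0166\u0167 \u017D\u017E";
        // Print the code point of every character that would have no slot in the encoding.
        unsupported(northernSami).codePoints()
                .forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

Under these assumptions, characters such as Čč, Đđ, Ŋŋ and Ŧŧ are reported as unsupported while Áá, Šš and Žž pass, which would be consistent with only some of the Sámi characters going missing. A check like this only narrows down the cause; fixing it would still require a font and encoding that cover the full character set.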

acn-sbuad commented 2 years ago

Fix verified. image.png