Closed (whooooong closed 1 year ago)
Hello, did you check the box "Enable Chinese, Japanese and Korean fonts in PDF generation" in your user control panel under the "PDF options" part?
Yes. I did.
It seems to work in dev and in version 4.2.4. Did you try different PDF readers? I'm using Firefox to open the PDF.
I tried PDF-XChange Viewer, Firefox, Foxit PDF Reader and Adobe Acrobat Reader DC. All of these PDF readers show the characters in the main text as boxes.
The calibre e-book viewer can show the characters correctly.
As the characters in other regions of the same PDF page are shown correctly, it is probably not a problem with the PDF reader.
Check the box for PDF/A so the font is embedded in the PDF. Then it should show nicely everywhere.
I tried that as well, with the same result: the characters are still not shown correctly.
Maybe the main text is not using the correct font.
In the pdf file, the font for the title part is: <</Type /FontDescriptor /FontName /MPDFAA+Sun-ExtA
but the main text part is: <</Type /FontDescriptor /FontName /MPDFAA+DejaVuSansCondensed
But in my example it is in the main part (body), not title. I'm attaching the pdf so you can try to open it, too.
with the dev version: 2022-03-24 - cjk-test.pdf
with 4.2.4: 4.2.4.pdf
Also, you said it works with calibre, so the way I see it the PDF is alright, but the software you use doesn't have access to the correct fonts (is the full operating system in Chinese?). This would not be a problem with PDF/A, as the font is embedded. But maybe https://github.com/elabftw/elabftw/issues/3211 will help with this issue.
How do you see which part has which font? It is true that what you describe would explain the issue, but there is no reason for a different font for title and body, as the font family is applied to the whole document.
I found DejaVuSansCondensed in the file /elabftw/vendor/mpdf/mpdf/src/Config/FontVariables.php. The characters are shown correctly after DejaVuSansCondensed is replaced by Sun-ExtA:
sed -i s/DejaVuSansCondensed.ttf/Sun-ExtA.ttf/g FontVariables.php
sed -i s/DejaVuSansCondensed-Bold.ttf/Sun-ExtA.ttf/g FontVariables.php
sed -i s/DejaVuSansCondensed-Oblique.ttf/Sun-ExtA.ttf/g FontVariables.php
sed -i s/DejaVuSansCondensed-BoldOblique.ttf/Sun-ExtA.ttf/g FontVariables.php
This cannot be reproduced. These changes may cause PDF generation to hang, with the web page showing nothing.
If you open a PDF file with Notepad++, you can see the font information.
Yeah, that's what I did with vim but I guess I mistyped the search string. Now doing it again, I only have one <</Type /FontDescriptor /FontName /MPDFAA+Sun-ExtA
and no DejaVu anywhere.
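The manual check described above (opening the PDF in an editor and searching for /FontName) can be sketched as a small script. This is only an illustration: the function names are made up for this example, and it assumes the font descriptors are stored uncompressed in the file, as in the excerpts quoted in this thread. Descriptors packed into compressed object streams would not be found this way.

```python
import re
from typing import List

# Matches "/FontName /SomeName" entries inside uncompressed PDF objects.
FONT_NAME_RE = re.compile(rb"/FontName\s*/([^\s/<>\[\]()]+)")


def list_font_names(pdf_bytes: bytes) -> List[str]:
    """Return the sorted, de-duplicated /FontName values found in raw PDF bytes."""
    names = {m.group(1).decode("latin-1") for m in FONT_NAME_RE.finditer(pdf_bytes)}
    return sorted(names)


def list_font_names_in_file(path: str) -> List[str]:
    """Hypothetical convenience wrapper: read a PDF from disk and list its fonts."""
    with open(path, "rb") as handle:
        return list_font_names(handle.read())
```

If both a Sun-ExtA and a DejaVuSansCondensed entry show up for the same document, that would be consistent with different parts of the page using different fonts.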
Most probably, the font-family styles from the editor cause this problem. These styles, if any, override the font-family set in the template.
In View → Source code, I found that there were font-family styles. With the font-family style removed, the characters are shown correctly in the PDF. Putting sun-exta in the style also works.
Adding the line
$body = preg_replace('#font-family:[^;]+?;#i', '', $body);
before line 259,
return str_replace('src="app/download.php?f=', 'src="' . dirname(__DIR__, 2) . '/uploads/', $body);
in the file /elabftw/src/services/MakePdf.php gets it working as well.
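To make the effect of that pattern easier to see, here is a Python sketch of the same regular expression applied to a few HTML snippets. It is not the PHP code from MakePdf.php; the function name strip_font_family is made up for this illustration.

```python
import re

# Same pattern as the preg_replace call above: case-insensitive, lazy match
# from "font-family:" up to and including the next semicolon.
FONT_FAMILY_RE = re.compile(r"font-family:[^;]+?;", re.IGNORECASE)


def strip_font_family(body: str) -> str:
    """Remove every 'font-family: ...;' declaration from the given HTML."""
    return FONT_FAMILY_RE.sub("", body)


if __name__ == "__main__":
    html = '<span style="font-family: arial, helvetica; color: red;">试试</span>'
    print(strip_font_family(html))
```

Two things are worth noting about the pattern: it removes every font-family declaration in the body, not only those around CJK text, and it only matches when a semicolon follows the declaration, so a style such as font-family: arial with no semicolon anywhere after it is left untouched.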
I don't quite like this approach, as generating big zip archives, and thus a lot of PDFs, is already a big resource hog, and I'm afraid that adding such a regex will impact performance too much (of course this would need to be tested!).
Another approach would be to disable the Fonts menu in the editor. What are your thoughts on this option?
Adding such a regex is not a good idea, as it removes all font-family settings from the whole body part. CJK users may also want to keep other font-family settings for non-CJK characters.
Another approach would be to add Sun-ExtA to the Fonts menu, or to include it in the font-family style of all lines for CJK users. It works even if it is the last one in the font-family style. A check box could be added for choosing a default font to be included as the last one. For non-CJK users, if sun-exta were the last one in the font-family style, it would not be embedded in the PDF, as the fonts ahead of it are enough to support all the letters.
Without checking PDF/A, the fonts are embedded as well, but only the used subset of each font is embedded. The file size could be kept small with Sun-ExtA, as only a small subset of the characters is frequently used. A check box could be added to let users decide whether to embed a full copy of the entire character set, if that is not already the whole point of PDF/A.
Yes, adding Sun-ExtA as a fallback font could also be a valid approach.
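The fallback idea could look roughly like the following Python sketch, which appends sun-exta to each inline font-family declaration unless it is already listed. This is only an illustration of the approach and not code from eLabFTW: the function name add_fallback_font is hypothetical, the sketch assumes style attributes are delimited by double quotes, and it is not a full CSS parser.

```python
import re

FALLBACK_FONT = "sun-exta"  # font name as used in the style attribute in this thread

# Captures "font-family:" plus its value, stopping at ';' or the closing '"'.
FONT_FAMILY_RE = re.compile(r'(font-family\s*:\s*)([^;"]+)', re.IGNORECASE)


def add_fallback_font(html: str, fallback: str = FALLBACK_FONT) -> str:
    """Append the fallback font as the last entry of every font-family value."""

    def _append(match: "re.Match[str]") -> str:
        prefix, families = match.group(1), match.group(2).rstrip()
        names = [name.strip().strip("'").lower() for name in families.split(",")]
        if fallback.lower() in names:
            return match.group(0)  # already present, leave the declaration as-is
        return f"{prefix}{families}, {fallback}"

    return FONT_FAMILY_RE.sub(_append, html)


if __name__ == "__main__":
    print(add_fallback_font('<span style="font-family: arial, helvetica;">试试</span>'))
```

Unlike stripping the declarations, this keeps the fonts the user picked for non-CJK text and only adds a last-resort entry.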
Sounds good. Many thanks!
I'm going to close this. The solution is simply to not use a custom font for letters that are CJK characters.
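For anyone who wants to check whether a piece of text is affected, a rough test for CJK characters might look like the Python sketch below. The helper name contains_cjk is made up for this illustration, and the list of Unicode blocks is deliberately short and not exhaustive.

```python
# A few common CJK-related Unicode blocks (not an exhaustive list).
CJK_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def contains_cjk(text: str) -> bool:
    """Return True if any character of text falls in one of the listed blocks."""
    return any(low <= ord(ch) <= high for ch in text for low, high in CJK_RANGES)


if __name__ == "__main__":
    print(contains_cjk("试试这几个字能否正确显示"))
```

Text for which this returns True is the kind that should be left without a custom font in the editor.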
Describe the bug
In the generated PDF files, the Chinese characters are printed as boxes (▯) in the main text, while those in the titles and the names of linked items are printed correctly. An example shown in the web page:
The example shown in the pdf:
Steps to reproduce
Just put the following Chinese characters in the title and main text, and then try to make a pdf: 试试这几个字能否正确显示
Information