mgieseki / dvisvgm

A fast DVI, EPS, and PDF to SVG converter
https://dvisvgm.de
GNU General Public License v3.0

Missing text from PDF >> SVG #252

Status: Closed (povpie closed 1 month ago)

povpie commented 8 months ago

Version: 3.1.2
Configuration: dvisvgm --pdf -Oall -fwoff2
Problem: some text is missing from the SVG.

File: test1.pdf


(additional files)
test2.pdf test3.pdf

mgieseki commented 8 months ago

I can't reproduce the problem. What version of dvisvgm and mutool do you use? Please post the output of dvisvgm -V1.

povpie commented 8 months ago

dvisvgm 3.1.2 (x86_64-pc-win64)

brotli: 1.1.0
clipper: 6.2.1
freetype: 2.13.2
Ghostscript: 9.25
MiKTeX: 22.12
mutool: 1.21.0
potrace: 1.16
xxhash: 0.8.2
zlib: 1.3

Just realized that many are outdated. I'll try to update them.

I updated mutool successfully, but Ghostscript and MiKTeX still show the old versions when I run dvisvgm -V1. I already changed the PATH environment variable to point to the correct folders. Am I missing something? Thanks, Martin.

mgieseki commented 8 months ago

Ok, thanks for the additional info. I was able to reproduce the issue now. Unfortunately, it's related to the limited functionality available via mutool. The PDF file contains four different font resources that all have the same internal name PCPYGD+-:

Fonts (4):
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (11 0 R)
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (12 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (8 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (13 0 R)

Therefore, it's not possible to tell the fonts apart by name, which dvisvgm needs in order to work properly. Maybe you can tweak the font embedding options of the application used to create the PDF files so that subsetted fonts get more distinct names.
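
To check other PDFs for the same name collision, the listing above can be scanned for font names that map to more than one font object. This is only a minimal sketch: it assumes the listing comes from `mutool info` and that every font line follows exactly the layout shown above. The names `FONT_LINE` and `ambiguous_font_names` are made up for illustration and are not part of dvisvgm or mutool.

```python
import re
from collections import defaultdict

# Font listing as posted above (assumed to be `mutool info` output).
SAMPLE = """\
Fonts (4):
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (11 0 R)
        1       (7 0 R):        Type0 'PCPYGD+-' Identity-H (12 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (8 0 R)
        1       (7 0 R):        Type1 'PCPYGD+-' WinAnsiEncoding (13 0 R)
"""

# Assumed line layout: page, page object, font type, quoted name,
# encoding, font object. Lines that do not match are skipped.
FONT_LINE = re.compile(
    r"^\s*(\d+)\s+\((\d+) \d+ R\):\s+(\S+)\s+'([^']*)'\s+(\S+)\s+\((\d+) \d+ R\)\s*$"
)


def ambiguous_font_names(listing):
    """Map each font name shared by several font objects to their object IDs."""
    objects_by_name = defaultdict(set)
    for line in listing.splitlines():
        match = FONT_LINE.match(line)
        if match:
            name = match.group(4)
            font_object_id = int(match.group(6))
            objects_by_name[name].add(font_object_id)
    # Keep only names that refer to more than one distinct font object.
    return {
        name: sorted(ids)
        for name, ids in objects_by_name.items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    for name, ids in ambiguous_font_names(SAMPLE).items():
        print(f"{name!r} is shared by font objects {ids}")
```

A non-empty result means the font names alone are ambiguous for that file, which is the situation described here.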

povpie commented 8 months ago

I'll try to change it manually. Is it that mutool doesn't identify the ID following the font name? I found this link (it looks like the same issue): https://github.com/pymupdf/pymupdf/issues/2110#issuecomment-1343318360

Here are the fonts identified in test1.pdf by an online font downloader: [image: Screenshot (2)]

mgieseki commented 8 months ago

Is it that mutool doesn't identify the ID following the font name?

In the PDF file, there are no numbers appended to the font names. They are probably added by your font downloader. As shown above, all four font objects have the name PCPYGD+-. Internally, they can be distinguished by their object IDs, but mutool doesn't provide a way to access those IDs in the backend. Fonts are referenced there only by their names, which can be ambiguous, as in your case.

povpie commented 8 months ago

Ah, I see. If I find a solution, I'll post it here. Unfortunately, Adobe Express has no option to export the fonts in the design in a different way.