Closed MrVauxs closed 1 year ago
Embedding means very little; the map needs to work. So 1000001 = 65 = � if it is not mapped to a character, such as, say, an Arabic character in one font or an accented one in a Baltic font. PDF is painting by numbers, but if the colour range is wrong you get muddy brown-outs.
I am not sure what you are talking about. I was just wondering why Chrome lets me copy the highlighted character as [free-action] while Sumatra returns a �.
It depends on what you're capturing and viewing. Here, without your sample, WordPad can only substitute © for the picture. With the file it would be simpler to see whether there is some underlying [free-action] #tag, since that's what Chrome is showing as Tab [file-name].
Here is a single-page excerpt from the PDF: Treasure Vault-pages-13.pdf
OK, thanks for the sample. This is a MuPDF limitation, perhaps best raised there. @kjk, can SumatraPDF do better in such cases?
Pathfinder-Icons (TrueType (CID); Identity-H; embedded)
ADVANCING RUNE 9+
MAGICALNECROMANCY
Usage etched onto heavy armor
This rune charges up as you defeat your foes, driving you
forward across the battlefield with every victory.
Activate [free-action] command; Requirements Your last action or
...
ADVANCING
RUNE 9+
MAGICAL
NECROMANCY
Usage etched onto heavy armor
This rune charges up as you defeat your foes, driving you
forward across the battlefield with every victory.
Activate � command; Requirements Your last action or
...
So these are custom named characters:
76 0 obj
<<
/Length 626
>>
stream
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo
<< /Registry (Adobe)
/Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
5 beginbfchar
<0013> <005B00740068007200650065002D0061006300740069006F006E0073005D>
<0014> <005B00740077006F002D0061006300740069006F006E0073005D>
<0015> <005B006F006E0065002D0061006300740069006F006E005D>
<0016> <005B0066007200650065002D0061006300740069006F006E005D>
<0017> <005B007200650061006300740069006F006E005D>
endbfchar
endcmap CMapName currentdict /CMap defineresource pop end end
endstream
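To make the mapping above readable, here is a minimal Python sketch that decodes the five `beginbfchar` entries. It assumes only what the stream itself shows: each destination value is a hex string of UTF-16BE code units. The names `BFCHAR_BLOCK`, `decode_bfchar`, and `TO_UNICODE` are made up for this illustration and are not part of any PDF library.

```python
import re

# The five bfchar entries from the ToUnicode CMap above, copied verbatim.
BFCHAR_BLOCK = """
<0013> <005B00740068007200650065002D0061006300740069006F006E0073005D>
<0014> <005B00740077006F002D0061006300740069006F006E0073005D>
<0015> <005B006F006E0065002D0061006300740069006F006E005D>
<0016> <005B0066007200650065002D0061006300740069006F006E005D>
<0017> <005B007200650061006300740069006F006E005D>
"""


def decode_bfchar(block):
    """Return {character code: Unicode string} for each '<src> <dst>' pair."""
    mapping = {}
    for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
        # The destination is a sequence of big-endian UTF-16 code units.
        mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


TO_UNICODE = decode_bfchar(BFCHAR_BLOCK)

for code, text in sorted(TO_UNICODE.items()):
    print(f"<{code:04X}> -> {text} ({len(text)} characters)")
```

Each single glyph code therefore maps to a whole bracketed word of 10 to 15 characters, for example `<0016>` to `[free-action]`, which is the string Chrome copies.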
So yeah, this is MuPDF's fault. @MrVauxs, do you feel like creating a bug at their Bugzilla?
I have a fix for this (attached), which I'll gladly send to the MuPDF bug and maybe they'll use it. Alternatively, @kjk can apply it locally if he feels like it. I don't have a good sense of how often this may be an issue; that needs to be balanced against living with another local divergence from MuPDF upstream later.
Basically, CMaps can map a single character to a string, like here to [free-action], but MuPDF has a length limit of 8 for that string, and longer strings are ignored. Once the limit is raised (I raised it to 32), it still fails due to careless coding: the limit is hardcoded in a few other places instead of being taken from the #define. The patch fixes all that.
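The effect of that limit can be sketched in a few lines of Python. This is only a toy model of the behaviour described in this comment, not MuPDF's code: the function `copy_text`, the dictionary `TO_UNICODE`, and the choice to count the limit in characters are all assumptions made for illustration.

```python
REPLACEMENT = "\ufffd"  # the character that shows up as a question-mark box when copying

# Glyph code -> string, as decoded from the ToUnicode CMap in this PDF.
TO_UNICODE = {
    0x13: "[three-actions]",
    0x14: "[two-actions]",
    0x15: "[one-action]",
    0x16: "[free-action]",
    0x17: "[reaction]",
}


def copy_text(code, limit):
    """Toy lookup: mappings longer than `limit` are treated as if they were absent."""
    text = TO_UNICODE.get(code)
    if text is None or len(text) > limit:
        return REPLACEMENT
    return text


for limit in (8, 32):
    copied = [copy_text(code, limit) for code in sorted(TO_UNICODE)]
    print(f"limit {limit:2d}: {copied}")
```

With a limit of 8, every one of the five icon strings is too long and falls back to the replacement character; with 32, all of them are copied in full, which matches the before and after behaviour reported in this thread.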
First time using Bugzilla, but there you go: https://bugs.ghostscript.com/show_bug.cgi?id=706498
Good news: MuPDF fixed this.
Paizo's Pathfinder 2e booklets use a special font which, when copied in, say, Chrome, returns a string such as [free-action], [reaction], [one-action], etc. Unfortunately, with Sumatra these characters are instead copied as �, �, and �, in that order. The fonts are embedded in the PDF document.