PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

[CTS 22] Font families #94

Closed PhilterPaper closed 1 year ago

PhilterPaper commented 6 years ago

« May 30, 2017, 04:14:46 AM » by sciurius

When moving into the realm of (basic) typesetting, the following challenge arises.

Assume we're typesetting a text and the user has selected to use the Times-Roman font. Now a part of the text must be rendered in bold. We humans (nerds) know that we must use the Times-Bold font for this part, but how can we teach PDF::Builder what fonts to use?

PhilterPaper commented 6 years ago

« May 30, 2017, 01:54:21 PM » by Phil

I'm not sure what the context is here. Are you looking to give font changes (and other information) in a high-level language, such as HTML and CSS, and translate it to low-level calls such as $text->font($fTR, $size); (font Times family Roman)? Do you want to have in-line changes to the font used (say, Times-Roman for most, and the occasional Times-Bold and Times-Italic) while letting PDF::Builder do the heavy lifting of fitting a stream of text onto a line (more generally, paragraph shaping)? A word or phrase of bold text might end up being split over a couple of output lines. In-line changes in text size can mean the line's baseline needs to be moved down to accommodate the taller text. Let's not even think about the headaches of mixing RTL and LTR text!

If it's important to avoid HTML and CSS, I suppose you can always fit the output up to the point where the font changes, see how much space you have left and what the next (bold) word will require, and then output the word/split (hyphenate) the word/go to the next line. Some wrappers to track the current line length and try to fit the next word could be useful, but as you've said before, probably don't belong in PDF::API itself. Are you going someplace PDF::Builder-specific with this topic, or does it belong in the Typography and Typesetting board?

PhilterPaper commented 6 years ago

« May 31, 2017, 04:02:48 AM » by sciurius

The context is several of my PDF::Builder-based programs that typeset text. Even though it is not real typesetting (no paragraph adjusting, line folding and so on), occasionally there's a need to have words in bold and italic. In several of these tools, the user can select the font to use for the text: Times-Roman, Arial, etc. For example, consider this snippet of a (JSON) config file:

"fonts" : {
  "text" : {
    "name": "Times-Roman", "size" : 12
  }
}

The user wants Times-Roman for texts. Assume that in the text there's a part that needs to be italic. What I'm wondering is: Is there a defined and reliable way to get the bold/italic etc. variant of a font, given its (plain) font name?

To make it more complex, the user can also designate the font by file name, in which case inference by name is not possible. So I think that in the end it will be necessary to explicitly specify a collection of fonts, e.g.

"fonts" : {
  "text" : {
    "name": "Times-Roman", "size" : 12
  },
  "text-italic" : {
    "name": "Times-Italic", "size" : 12
  }
}

However, my feeling is that somehow this should be controlled external to the application, similar to the way a text processor picks up the system fonts.
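As a minimal sketch of the explicit-collection approach, the lookup below resolves a role plus an optional style to a configured font and falls back to the plain font when no variant is listed. The helper name `font_for` and the `role-style` key convention are assumptions taken from the config snippet above, not part of PDF::Builder.

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14

# Same shape as the config snippet above; "text-bold" is deliberately absent.
my $config = decode_json(<<'JSON');
{
  "fonts" : {
    "text"        : { "name": "Times-Roman",  "size": 12 },
    "text-italic" : { "name": "Times-Italic", "size": 12 }
  }
}
JSON

# Hypothetical helper: return the font entry for a role ("text") and an
# optional style ("italic"), falling back to the plain role when no
# explicit variant was configured.
sub font_for {
    my ($fonts, $role, $style) = @_;
    my $key = defined $style && length $style ? "$role-$style" : $role;
    return $fonts->{$key} // $fonts->{$role};
}

my $italic = font_for($config->{fonts}, 'text', 'italic');
my $bold   = font_for($config->{fonts}, 'text', 'bold');
print "italic: $italic->{name}\n";   # explicitly configured variant
print "bold:   $bold->{name}\n";     # no "text-bold" entry, so the plain font
```

The silent fallback keeps a document rendering when a variant is missing; an application that prefers to fail loudly could return undef instead.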

PhilterPaper commented 6 years ago

« May 31, 2017, 11:18:52 AM » by Phil

Quote from: sciurius on May 31, 2017, 04:02:48 AM

What I'm wondering is: Is there a defined and reliable way to get the bold/italic etc. variant of a font, given its (plain) font name?

I'm not aware of any algorithm to derive variants of a font from some base name. I think you're just going to have to explicitly list somewhere the related fonts for a typeface, e.g., Times-Roman, Times-Bold, Times-Italic, Times-Bold-Italic, plus any specialty fonts such as small caps, condensed, expanded, etc. Don't forget that "italic" is "slanted" or "oblique" in a number of fonts. Other aspects of names are not standardized.

However, my feeling is that somehow this should be controlled external to the application, similar to the way a text processor picks up the system fonts.

Would it be useful for PDF::Builder to build in a font lookup tailored for each typeface? Something like a $font->variants() returning a hash of 'regular', 'italic', 'smcap', etc. (where available) font names to use? Would that cover everything? PDF::Builder could cover the standard fonts available to everyone, and a user or system administrator would have to manually add additional entries as fonts were added to the system. Consider, for ease of use and minimized confusion, matching attribute names and values used in CSS for font-weight, font-stretch, font-style, font-variant, etc.
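One way such a lookup might look, purely as a sketch: a per-family table keyed by CSS-style weight and style values. The table layout and the `variant` function are hypothetical (no such API is described in this thread); the font names are the standard core font names, where Helvetica uses "Oblique" rather than "Italic".

```perl
use strict;
use warnings;

# Hypothetical lookup table: family => { "font-weight/font-style" => name }.
# Only two core families are listed here for illustration.
my %FAMILY = (
    'Times' => {
        'normal/normal' => 'Times-Roman',
        'bold/normal'   => 'Times-Bold',
        'normal/italic' => 'Times-Italic',
        'bold/italic'   => 'Times-BoldItalic',
    },
    'Helvetica' => {
        'normal/normal' => 'Helvetica',
        'bold/normal'   => 'Helvetica-Bold',
        'normal/italic' => 'Helvetica-Oblique',
        'bold/italic'   => 'Helvetica-BoldOblique',
    },
);

# Resolve a family plus CSS-like properties to a concrete font name.
# Returns undef for an unknown family or an unlisted combination.
sub variant {
    my ($family, %css) = @_;
    my $weight = $css{'font-weight'} // 'normal';
    my $style  = $css{'font-style'}  // 'normal';
    $style = 'italic' if $style eq 'oblique';   # one slot for both spellings
    my $variants = $FAMILY{$family} or return;  # unknown family
    return $variants->{"$weight/$style"};
}

print variant('Times', 'font-weight' => 'bold'), "\n";
print variant('Helvetica', 'font-style' => 'italic'), "\n";
```

Keying on CSS values means a later HTML/CSS front end could pass its properties straight through, and a site-specific file could add entries for locally installed fonts.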

In the examples/ library, there are some programs that show fonts beyond the standard 4 flavors (e.g., small caps). I haven't looked at them in detail, so I can't tell you right now if these variants are included in the font itself, or the code is modifying fonts on the fly (I suspect the former).

PhilterPaper commented 6 years ago

« June 01, 2017, 04:23:41 AM » by sciurius

Quote from: Phil on May 31, 2017, 11:18:52 AM

I'm not aware of any algorithm to derive variants of a font from some base name. I think you're just going to have to explicitly list somewhere the related fonts for a typeface, e.g., Times-Roman, Times-Bold, Times-Italic, Times-Bold-Italic,

Precisely. Except that there are tools/libraries that already do that. I don't think that every word processing / drawing / design tool has private implementations.

Would it be useful for PDF::Builder to build in a font lookup tailored for each typeface?

Consider Mac/OSX. It has a tool FontBook to install and maintain font collections. Once installed, the fonts are available to all applications and I'm pretty sure this includes things like getting a bold variant of a given font. Linux has Fontconfig/Xft. We should (must) use these tools.
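On Linux, a rough sketch of delegating the lookup to Fontconfig could shell out to its `fc-match` command-line tool. The helper names below are assumptions, and the code presumes `fc-match` is installed and on the PATH; nothing here is part of PDF::Builder.

```perl
use strict;
use warnings;

# Build a Fontconfig pattern such as "Times:bold:italic" from a family
# name and a list of style words.
sub fc_pattern {
    my ($family, @styles) = @_;
    return join ':', $family, @styles;
}

# Ask fc-match for the file of the best-matching installed font.
# Returns undef if fc-match cannot be run or prints nothing.
sub fc_font_file {
    my ($family, @styles) = @_;
    my $pattern = fc_pattern($family, @styles);
    open(my $fh, '-|', 'fc-match', '-f', '%{file}', $pattern) or return;
    my $file = <$fh>;
    close $fh;
    return defined $file && length $file ? $file : undef;
}

print fc_pattern('Times', 'bold'), "\n";
# my $file = fc_font_file('Times', 'bold');   # needs Fontconfig installed
```

Note that `fc-match` returns the closest match it can find even when the requested family is not installed, so a caller that cares should also check the family of the result before using the file.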

PhilterPaper commented 6 years ago

« June 01, 2017, 11:10:19 AM » by Phil

These tools are built into Mac and Linux, and there is a ready interface so we could get to it from Perl? How about Windows? I don't think there's anything like that built in, and Windows is too big a market to ignore. There appear to be lots of aftermarket utilities, but they seem to be oriented towards displaying and comparing typefaces/fonts, rather than an API for "I want to use the FlugoFont typeface... what variants are supported?". It would be nice to use Mac and Linux built-in interfaces to get this information, if it can also be done for Windows without great pain. For Windows, we may have to manually support a properties list as a .pm file that a user would have to edit to add or delete fonts.

While on the subject, it's possible that not every typeface glyph is going to be available in every font. I recall you requested something to find and use a fallback font if a desired glyph wasn't available in the desired font. How would such a capability fit in with an API that tells you what variants are available for a given typeface? Furthermore, what about synthesizing bold by overprinting with offset and SMALL CAPS by size change? If the desired typeface doesn't offer it natively, it might be better than nothing. Oblique could also be done by skewing the coordinate system.
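The arithmetic behind the synthesized variants is small. The sketch below only computes the numbers: a skew matrix in PDF's six-element [a b c d e f] form for fake oblique, and a set of horizontal offsets for fake bold by overprinting. The function names, the 12 degree slant, and the 0.3 pt thickness are illustrative assumptions; how the values are fed to PDF::Builder is left open.

```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);

# Matrix [a b c d e f] that slants upright glyphs to the right by $degrees:
# x' = x + y * tan(angle), y' = y, with no translation.
sub oblique_matrix {
    my ($degrees) = @_;
    my $rad = $degrees * PI / 180;
    return (1, 0, sin($rad) / cos($rad), 1, 0, 0);
}

# Horizontal offsets (in points) at which to repaint the same text to fake
# bold; $strength is the total extra thickness, $passes the repaint count.
sub bold_offsets {
    my ($strength, $passes) = @_;
    return map { $strength * $_ / $passes } 1 .. $passes;
}

my @matrix  = oblique_matrix(12);
my @offsets = bold_offsets(0.3, 3);
printf "oblique matrix: %s\n", join ' ', map { sprintf '%.4f', $_ } @matrix;
printf "bold offsets:   %s\n", join ' ', map { sprintf '%.2f', $_ } @offsets;
```

Synthesized faces do not change the font's advance widths, so line fitting would still use the upright or regular metrics.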

Anyway, let's first think about the things we need to do with fonts, and try to come up with something which kills many birds with one stone, rather than doing all sorts of independent projects that don't fit together well.

PhilterPaper commented 6 years ago

« July 22, 2017, 08:02:56 PM » by Phil

https://xmlgraphics.apache.org/fop/1.0/fonts.html is some interesting reading. By the way, regarding core fonts, it states:

Please note that recent versions of Adobe Acrobat Reader replace "Helvetica" with "Arial" and "Times" with "Times New Roman" internally. GhostScript replaces "Helvetica" with "Nimbus Sans L" and "Times" with "Nimbus Roman No9 L". Other document viewers may do similar font substitutions. If you need to make sure that there are no such substitutions, you need to specify an explicit font and embed it in the target document.

Also note that in addition to the 14 required core fonts (Roman/Upright, Italic/Oblique/Slanted, Bold, and Bold-Italic variants of Times, Helvetica, and Courier; along with Zapf Dingbats and Symbol), PDF::Builder includes Georgia, Trebuchet, and Verdana (each in the four variants); Bank Gothic, Webdings and Wingdings. Then there are 5 CJK fonts supplied (or at least, their metrics are): Song, Ming, Myungjo, KozGoPro, and KozMinPro. Finally, individual document creators can supply their own fonts. Operating systems tend to come with a large supply of built-in fonts which may or may not be available to PDF readers.

PhilterPaper commented 6 years ago

« August 23, 2017, 10:29:33 AM » by sciurius

Nitpicking time :) .

PDF::Builder has built-in metrics for all these fonts, but it depends on the PDF viewer which fonts are actually used.

PhilterPaper commented 6 years ago

« August 23, 2017, 11:08:06 AM » by Phil

Well, at least it isn't nose-picking time!

OK, as a clarification, PDF::Builder includes metrics for a number of fonts, but not the font files themselves. That depends on the OS and anything the PC owner has installed.

Regarding substitutions, if, for instance, Adobe [Acrobat] Reader decides to substitute Arial for Helvetica, are the metrics exactly the same, or just "close"? If PDF::Builder (or text processing based on it) has carefully justified a line of type in Helvetica, will the Arial text fit exactly the same (or at least, with no noticeable difference)?

And what happens when I request, say, Bank Gothic and the font itself isn't available on the reader (but the metrics are, so the PDF file generation is successful)? Will the reader try to substitute something else, or just give a lot of errors?

PhilterPaper commented 6 years ago

« August 25, 2017, 02:42:58 AM » by sciurius

It is best to think in terms of producers and consumers. A producer needs font metrics. If it uses a font file, it does so for the metrics that are in the file. PDF::Builder is a producer. A consumer needs fonts glyphs and metrics. Acrobat Reader is a consumer.

Regarding substitutions, the consumer will substitute a font that has similar appearance (e.g., serif versus sans, italic, ...) and near identical metrics. IIRC, when Microsoft developed Arial, they copied the metrics of Adobe Helvetica. The same goes for many other fonts. For example, Nimbus Roman has identical metrics to Adobe Times-Roman.

In general, the substitution will not be noticed.
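To illustrate why matching metrics make the substitution invisible to line fitting: a producer computes a string's width by summing glyph advance widths (in 1/1000 em) and scaling by the font size. In the sketch below the Helvetica widths are the standard AFM values for those glyphs; the `%substitute` table simply copies them to stand in for a metric-compatible font, which is an assumption for illustration, as is the `text_width` helper.

```perl
use strict;
use warnings;

# Glyph advance widths in 1/1000 em for a few Helvetica glyphs.
my %helvetica  = (H => 722, e => 556, l => 222, o => 556, ' ' => 278);
# Stand-in for a metric-compatible substitute font (assumed identical).
my %substitute = %helvetica;

# Width of $text in points at $size; glyphs missing from the table
# count as zero width in this sketch.
sub text_width {
    my ($widths, $text, $size) = @_;
    my $units = 0;
    $units += $widths->{$_} // 0 for split //, $text;
    return $units * $size / 1000;
}

printf "Helvetica:  %.3f pt\n", text_width(\%helvetica,  'Hello', 12);
printf "Substitute: %.3f pt\n", text_width(\%substitute, 'Hello', 12);
```

With identical width tables the two results are equal, so a line justified against one font fits the other exactly; any difference in the tables would show up as a drift that grows with line length.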

See also the attachment (screenshot at 2017-08-25 08-41-59).

PhilterPaper commented 6 years ago

« October 25, 2017, 11:10:10 AM » by Phil

Also look at GitHub issue #56. This is a request for automatically switching fonts when a requested glyph is missing from the specified font. There may be enough commonality between the two that they should be considered and designed together.

By the way, does PDF::Builder have any support for embedding a font in the PDF file? I don't recall ever running across such a thing. TTF support does have a -noembed flag, which needs to be looked at (implies that embedding is automatic). If the code already exists, it might be copied to Type1 support, and possibly core fonts beyond the Standard 14.

Update: Issue RT 123470 (#80) is a request for Embedded Fonts.

PhilterPaper commented 5 years ago

Just a comment on the previous post: PDF::Builder does automatically embed TTF/OTF fonts, and does subset them (provide only the glyphs used). Both actions can be shut off with -noembed and -nosubset options. Adobe suggests providing the full set of glyphs (use -nosubset) if you're producing a document that might need to be updated or edited at the other end (and thus potentially need more glyphs). An alternative to this might be to add code to force at least the basic ASCII character set in the embedded font. Some fonts are not licensed to embed the full thing, just a subset.

PhilterPaper commented 4 years ago

Update: Johan (@sciurius) has released Text::Layout, which includes Pango-style selection of fonts and sizes. It basically looks like a subset of HTML. I need to provide a new back end to it, rather than directly outputting to the PDF file. In the future, PDF::Builder might include other HTML tags (and CSS settings for them) to handle high-level formatting, but for now just font-switching should be quite useful.

PhilterPaper commented 1 year ago

This is basically taken care of by the Font Manager, so I might as well close this issue.