Random notes on German Gap Analysis
Capitalization
It used to be the norm to expand uppercase umlauts: Ae for Ä, Oe for Ö and Ue for Ü. Most place names still follow this convention, Österreich (Austria) being a notable exception. This cannot be achieved with text-transform yet.
Elsewhere, uppercase umlaut letters have the diaeresis dots spaced wider (in Ä and Ö) or narrower (in Ü) and set lower, so that they fit within the cap height. This local character style has no designated OpenType feature, but is sometimes available as a cv## character variant, as part of an ss## stylistic set, or as part of the locl localized glyph shapes. A standardized way to access vertically compressed diacritic marks on uppercase letters (and perhaps lowercase letters with ascenders) would be nice and might benefit other languages as well. (Some word processors have an option to discard diacritic marks on uppercased letters in French, for instance.)
Eszett ß is usually uppercased as SS, sometimes as its newer uppercase variant ẞ and, at least historically, sometimes as SZ. The latter is not usually available as an option.
Hyphenation
In compounds, which are more common in German than in other languages written in the Latin script, the preferred hyphenation point is at the semantic and morphological boundary, e.g. Rinder-braten from Rind + Braten. With short derivational affixes, such boundaries can be difficult for algorithms to detect, and a wrong guess may have comical effects, e.g. Ur-insekt (primeval insect) vs. Urin-sekt (urine sparkling wine).
Compounds that are joined with a hyphen should preferably be broken right after that hyphen, but may be hyphenated elsewhere as well, especially at other semantic boundaries and as far away from the existing hyphen as possible.
Morphological breaks are traditionally preferred as hyphenation opportunities over phonological ones, even if both are accepted, e.g. Ma-gnet vs. Mag-net. It is not yet possible to tell a hyphenation algorithm to favor one approach over the other in general.
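The preference order described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the break opportunities are supplied by hand (the text notes that detecting them automatically is hard), and the names BreakKind, Break and choose_break are hypothetical, not part of any existing hyphenation library or CSS feature. The sketch lets compound boundaries always win, then applies a switchable preference between morphological and phonological breaks, and finally takes the latest break that still fits.

```python
from dataclasses import dataclass
from enum import Enum


class BreakKind(Enum):
    PHONOLOGICAL = "phonological"    # e.g. Mag-net
    MORPHOLOGICAL = "morphological"  # e.g. Ma-gnet
    COMPOUND = "compound"            # e.g. Rinder-braten


@dataclass(frozen=True)
class Break:
    index: int       # the word is split into word[:index] + "-" and word[index:]
    kind: BreakKind


def choose_break(word, breaks, max_chars, prefer=BreakKind.MORPHOLOGICAL):
    """Pick one break so that the first part plus its hyphen fits in max_chars.

    Ranking (an assumption of this sketch): compound boundaries first,
    then breaks of the preferred kind, then the longest first part.
    Returns (first_part_with_hyphen, rest), or None if nothing fits.
    """
    fitting = [
        b for b in breaks
        if 0 < b.index < len(word) and b.index + 1 <= max_chars
    ]
    if not fitting:
        return None
    best = max(
        fitting,
        key=lambda b: (b.kind is BreakKind.COMPOUND, b.kind is prefer, b.index),
    )
    return word[:best.index] + "-", word[best.index:]


# Hand-annotated break opportunities for the examples in the text.
magnet = [Break(2, BreakKind.MORPHOLOGICAL), Break(3, BreakKind.PHONOLOGICAL)]
print(choose_break("Magnet", magnet, max_chars=5))
# ('Ma-', 'gnet')
print(choose_break("Magnet", magnet, max_chars=5, prefer=BreakKind.PHONOLOGICAL))
# ('Mag-', 'net')
```

The prefer parameter stands in for the missing high-level switch: the same annotated word yields Ma-gnet or Mag-net depending on which approach the author favors.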
Linebreaking
In incomplete sentences, as often encountered in headings and bullet points, line breaks are sometimes preferred after punctuation marks such as commas and colons, where they mark logical breaks (though such commas are often hard to distinguish from commas between enumerated items).
Ligatures
When letter-spacing for emphasis (as is common in blackletter texts), some digraphs should not be broken up, even if they do not form proper ligatures, e.g. ch, tz, ſt. Since texts and fonts cannot be relied upon to include the necessary markup or substitution features, a high-level control would be helpful.
Blackletter
For stylistic effect, authors sometimes want to use some blackletter typeface (script code Latf), such as Fraktur or Schwabacher, without specifying a particular one. A generic font family name would help.
Cursive
Besides decimal numbers, Latin letters, Greek letters and Roman numerals, some scholarly authors in the 20th century used German cursive lowercase letters as list counters. It’s unclear whether U+1D4B6 etc. (mathematical script) or U+1D51E etc. (mathematical Fraktur) would be appropriate. Ready-made Counter Styles does not support either of these yet.
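To make the two candidate ranges concrete, here is a minimal Python sketch that builds both symbol sets and applies an alphabetic counter scheme (1 gives a, 26 gives z, 27 gives aa), as in the alphabetic system of CSS Counter Styles. The function names are hypothetical, and whether either range is a suitable stand-in for German cursive remains the open question raised above. One detail to be aware of: the mathematical script range has no code points of its own for small e, g and o, which are encoded in the Letterlike Symbols block instead, while the Fraktur small letters are contiguous.

```python
SCRIPT_SMALL_A = 0x1D4B6   # MATHEMATICAL SCRIPT SMALL A
FRAKTUR_SMALL_A = 0x1D51E  # MATHEMATICAL FRAKTUR SMALL A

# Script small e, g, o live in Letterlike Symbols, not in the 1D4xx range.
SCRIPT_EXCEPTIONS = {"e": "\u212F", "g": "\u210A", "o": "\u2134"}


def symbols(style):
    """Return the 26 lowercase symbols for 'script' or 'fraktur'."""
    result = []
    for offset in range(26):
        letter = chr(ord("a") + offset)
        if style == "script":
            result.append(
                SCRIPT_EXCEPTIONS.get(letter, chr(SCRIPT_SMALL_A + offset))
            )
        elif style == "fraktur":
            result.append(chr(FRAKTUR_SMALL_A + offset))
        else:
            raise ValueError(f"unknown style: {style!r}")
    return result


def alphabetic_counter(value, syms):
    """Render a positive integer with an alphabetic (bijective) numbering."""
    if value < 1:
        raise ValueError("alphabetic counters start at 1")
    digits = []
    while value > 0:
        value -= 1
        digits.append(syms[value % len(syms)])
        value //= len(syms)
    return "".join(reversed(digits))


# Print the first three counters in each candidate style.
for style in ("script", "fraktur"):
    syms = symbols(style)
    print(style, [alphabetic_counter(n, syms) for n in (1, 2, 3)])
```

How these characters render depends on the fonts available; many text fonts do not cover the Mathematical Alphanumeric Symbols block.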