Open GrahamHannington opened 2 years ago
@GrahamHannington thank you for the continued feedback on this issue.
The addition of APL symbols is on the list of future updates but there are still other priorities at the moment. As external contractors we are dependent on the budgets that are allocated to us by IBM. I’m sure you understand.
> we are dependent on the budgets that are allocated to us by IBM
Sure, understood. I like what you've done with that budget; I like IBM Plex!
I feel bad using Noto Sans Mono for some 3270 screens when I'd prefer to be using IBM Plex Mono.
Re:
> The addition of APL symbols
Just so that we're on the same, er, code page :wink:: the term "APL symbols" might mean different things to different people.
Here, I'm specifically requesting the characters in EBCDIC code page 310 (see the link in my original post).
In Unicode terms, some of the characters in that code page are characterized as APL functional symbols, some are not; some are in the "Miscellaneous Technical" Unicode block, some are not.
(I've just noticed issue #176, opened in 2018.)
To highlight the missing characters, I copied the HTML for that table of EBCDIC code page 310 from Wikipedia, and tweaked the CSS for the sample characters to `font-family: "IBM Plex Mono", "Adobe NotDef"`, so that any characters not present in IBM Plex Mono would fall back to the Adobe NotDef glyph.
That's a whole lotta tofu. :wink:
I attached a .zip of the HTML because the tooltips show the Unicode character names and code points.
Mousing over anonymous tofu gets tired real quick, so I tweaked the CSS some more, including this to expose the tooltips:
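```css
/* Show each cell's tooltip text (the title attribute) inline, after the sample character */
td::after {
  font-size: x-small;
  content: attr(title) " ";
}
```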
2-page PDF, attached.
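For anyone who wants to build a similar table without copying it from Wikipedia, here is a minimal Python sketch that produces cells whose `title` attribute carries the code point and Unicode character name. The `cell` helper and the "(no name)" placeholder are my own illustration, not part of the attached HTML.

```python
import html
import unicodedata


def cell(ch: str) -> str:
    """Return an HTML table cell whose tooltip shows the code point and character name."""
    # Characters without a Unicode name (for example, private-use code points)
    # get a placeholder instead of raising ValueError.
    name = unicodedata.name(ch, "(no name)")
    title = f"U+{ord(ch):04X} {name}"
    return f'<td title="{html.escape(title)}">{html.escape(ch)}</td>'


if __name__ == "__main__":
    for sample in ("\u2551", "\u2395"):
        print(cell(sample))
```

With the `td::after` rule from this comment, the `title` text then appears next to each sample character.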
Hi @BoldMonday,
I'm not trying to rub it in (the difference in coverage); I actually thought you might find this useful.
Flipping between the two pages highlights the difference between the tofu in IBM Plex Mono versus the glyphs in Noto Sans Mono.
Hi @BoldMonday,
I'm sorry.
I've previously referred you to a table in the Wikipedia article "Digital encoding of APL symbols".
That table maps 3270 characters (specifically, characters in EBCDIC code page 310) to Unicode characters.
:warning: I've just discovered that some of those mappings are incorrect. At the very least, incorrect in the context of IBM 3270 terminal displays.
Unfortunately, I don't have a direct, correct replacement table to offer you: that is, a table that shows the correct glyphs and corresponding Unicode code points.
Frankly, I'm still digesting this news myself.
Earlier this week, I saw a 3270 screen that contains the character with EBCDIC code page 310 byte value X'81', which IBM characterizes as "Double Vertical, Bar Graphic", GCGID SF630000.
That on-screen glyph is significantly different to the glyph shown in the table in Wikipedia.
The lines in the on-screen glyph are as far apart as possible, whereas the lines in the glyph in the table in Wikipedia are closely spaced.
This prompted me to add a section to the "Talk" page of that Wikipedia article, "Incorrect mapping of EBCDIC code page 310 (APL) to Unicode characters?"
Another Wikipedia user replied:
> IBM actually maps SF630000 (the 0x81, double vertical one) to U+F892 in their corporate Private Use Area scheme, and SF620000 (the 0x82, double horizontal one) to U+F893, also in the Private Use Area (as seen in unicode.nam, included here). In terms of more recent additions to Unicode that the cited sources did not have the benefit of, ... (U+1FB80) in the Symbols for Legacy Computing block is a much closer match to the double horizontal one, but there is still no particularly good match to the double vertical one
That prompted me to do more research.
While I don't have a direct replacement for that table in Wikipedia, I can offer you:
For example, you can see from these tables that EBCDIC code page 310 byte value X'81' ("Double Vertical, Bar Graphic", GCGID SF630000) maps not to U+2551, but to the PUA code point U+F892, and that the lines in the glyph are spaced as far apart as possible, which is significantly different to U+2551.
I don't know.
Certainly, these two EBCDIC code page 310 byte values:
Do other characters in the character set (GCSGID 00963) for EBCDIC code page 310 also map to IBM PUA characters? I don't know. Given the available information, I think it's possible to answer this question, but I acknowledge that I'm currently not thinking clearly enough to work out an efficient method to do that. I need more coffee, or more sleep. 🙂
To properly support 3270 screens, IBM Plex Mono will need to include characters in the IBM PUA.
Mapping to "standard" vs "PUA" characters can significantly affect the appearance, even usability, of a 3270 screen.
Example: EBCDIC code page 310 byte value X'81':
When used as a table column separator, to distinguish, say, non-scrollable columns from scrollable columns, U+2551 gives characters in adjoining table cells some breathing space; U+F892 does not. Arguably, then, U+2551 is usable in this context, but not U+F892.
This is my current best effort at identifying the IBM PUA characters in EBCDIC code page 310:
EBCDIC code page 310 byte value (hex) | GCGID | GCGID name | IBM PUA Unicode code point (U+) |
---|---|---|---|
55 | LN480000 | N Line Below Capital/N Underscore (APL) | F8D7 |
56 | LO480000 | O Line Below Capital/O Underscore (APL) | F8D5 |
57 | LP480000 | P Line Below Capital/P Underscore (APL) | F8D3 |
58 | LQ480000 | Q Line Below Capital/Q Underscore (APL) | F8D1 |
59 | LR480000 | R Line Below Capital/R Underscore (APL) | F8CF |
62 | LS480000 | S Line Below Capital/S Underscore (APL) | F8CD |
63 | LT480000 | T Line Below Capital/T Underscore (APL) | F8CB |
64 | LU480000 | U Line Below Capital/U Underscore (APL) | F8C9 |
65 | LV480000 | V Line Below Capital/V Underscore (APL) | F8C7 |
66 | LW480000 | W Line Below Capital/W Underscore (APL) | F8C5 |
67 | LX480000 | X Line Below Capital/X Underscore (APL) | F8C3 |
68 | LY480000 | Y Line Below Capital/Y Underscore (APL) | F8C1 |
69 | LZ480000 | Z Line Below Capital/Z Underscore (APL) | F8BF |
80 | SL460000 | Tilde (APL) | F88F |
81 | SF630000 | Double Vertical, Bar Graphic | F892 |
82 | SF620000 | Double Horizontal, Bar Graphic | F893 |
85 | SF660000 | Center Vertical, Bar Graphic | F891 |
8A | SL610000 | Up Arrow (APL) | F88B |
8B | SL620000 | Down Arrow (APL) | F88A |
8F | SL600000 | Right Arrow (APL) | F88C |
9D | SL080000 | Circle (APL) | F890 |
9F | SL590000 | Left Arrow (APL) | F88D |
A4 | LN012000 | n Small Subscript | F8D8 |
B7 | SL640000 | Slope (APL) | F889 |
DB | SL580000 | Quote Dot (APL) | F88E |
To create this table, I used Excel to correlate `unicode.nam` with `CP00310.txt` (ftp://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP00310.txt).
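For anyone who would rather work with the table programmatically, here it is transcribed as a Python dictionary, with a small helper that turns code page 310 bytes into the corresponding IBM PUA characters. This is only a sketch: `CP310_TO_IBM_PUA` and `decode_pua` are names I made up, and bytes not listed in the table are replaced with U+FFFD rather than decoded, because this table covers only the PUA subset.

```python
# EBCDIC code page 310 byte value -> IBM PUA code point,
# transcribed from the table in this comment (25 entries).
CP310_TO_IBM_PUA = {
    0x55: 0xF8D7, 0x56: 0xF8D5, 0x57: 0xF8D3, 0x58: 0xF8D1, 0x59: 0xF8CF,
    0x62: 0xF8CD, 0x63: 0xF8CB, 0x64: 0xF8C9, 0x65: 0xF8C7, 0x66: 0xF8C5,
    0x67: 0xF8C3, 0x68: 0xF8C1, 0x69: 0xF8BF,
    0x80: 0xF88F, 0x81: 0xF892, 0x82: 0xF893, 0x85: 0xF891,
    0x8A: 0xF88B, 0x8B: 0xF88A, 0x8F: 0xF88C,
    0x9D: 0xF890, 0x9F: 0xF88D,
    0xA4: 0xF8D8, 0xB7: 0xF889, 0xDB: 0xF88E,
}


def decode_pua(data: bytes, placeholder: str = "\uFFFD") -> str:
    """Map each byte to its IBM PUA character; unlisted bytes become the placeholder."""
    return "".join(
        chr(CP310_TO_IBM_PUA[b]) if b in CP310_TO_IBM_PUA else placeholder
        for b in data
    )


if __name__ == "__main__":
    # X'81' is "Double Vertical, Bar Graphic" (GCGID SF630000).
    print(f"U+{ord(decode_pua(bytes([0x81]))):04X}")
```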
I've yet to see any "Line Below Capital" characters on a 3270 screen. Then again, I've never programmed in APL.
I'm curious to know how we got here: Unicode contains characters for Ancient Greek Musical Notation, but not comprehensive support for all 3270 characters. I can imagine some reasons, but I'd be interested to know the real story.
I wrote:
> I've yet to see any "Line Below Capital" characters on a 3270 screen.
This irked me.
Today, based on code provided to me by a vastly more experienced colleague, I wrote a z/OS REXX exec that dynamically generates a 3270 screen (specifically, an ISPF panel) that shows all of the PUA characters listed in my previous comment.
Here's an image of the screen:
The glyphs in white, in the first column, are provided by a proprietary (non-Unicode) font that is supplied with the terminal emulator. Everything else is set in IBM Plex Mono. The proprietary font doesn't necessarily have the same font metrics (e.g. glyph widths) as IBM Plex Mono, but the method that the emulator uses to position the characters means that this doesn't matter; it doesn't affect the alignment of the screen contents.
I think that some, perhaps even most, of these characters could be mapped to existing equivalent characters in the Unicode standard. I'd like to know why IBM chose to map such characters to PUA code points instead of existing standard code points. Perhaps the answer is in the qualifier "existing"; perhaps IBM made that decision before such characters were in the standard. I'm just guessing. I'd really like to understand the history here. If you have that conversation with IBM, I'd be grateful if you share what you can.
I'm unaware of any Unicode font that includes all of these characters (at these PUA code points).
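One way to check a candidate font is to compare the code points it covers against the 25 IBM PUA code points from my previous comment. A rough Python sketch follows; `missing_code_points` is a hypothetical helper, and how you obtain the set of covered code points is up to you. One option, assuming the third-party fontTools package is installed, is the keys of `TTFont(path).getBestCmap()`.

```python
# The 25 IBM PUA code points listed for EBCDIC code page 310:
# U+F889..U+F893, U+F8D8, and every second code point from U+F8BF to U+F8D7.
IBM_PUA_CP310 = sorted(
    set(range(0xF889, 0xF894)) | {0xF8D8} | set(range(0xF8BF, 0xF8D8, 2))
)


def missing_code_points(covered):
    """Return the IBM PUA code points (ascending) that are not in `covered`."""
    covered = set(covered)
    return [cp for cp in IBM_PUA_CP310 if cp not in covered]


if __name__ == "__main__":
    # Pretend the font covers only U+F892; a real check would pass the
    # font's actual cmap keys here instead.
    for cp in missing_code_points({0xF892}):
        print(f"U+{cp:04X}")
```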
I wonder about:
- Adding IBM PUA characters to an open-source font.

  Then again: if not in IBM Plex, then where? Why use the "IBM" qualifier in the name if you're not going to include IBM PUA characters?

- Proposing to Google Fonts that they include these IBM PUA characters in a variant of the Noto Sans Mono font; perhaps, "Noto Sans Mono IBM" (or "...3270", to avoid trademark issues), to use as a fall-back font for presenting 3270 screens, when Noto Sans Mono lacks the IBM PUA characters. However, from the Google Fonts "Contribute to Noto fonts" topic:

  > If you're proposing design for new codepoints, those need to already exist in the Unicode Standard. Google Fonts does not accept proposals for scripts that are not part of Unicode.

  I'm not really proposing a "script" as such; although, yeah, these code points definitely aren't in the standard, in the sense that they're in the PUA.

- Just how many of these PUA characters truly warrant a PUA code point?

  That table I cited previously in Wikipedia does a pretty good job of mapping to standard Unicode characters. Just X'81' (U+F892) and X'82' (U+F893)? I'll admit, I haven't (yet!) diligently explored the Unicode standard for matching characters for all of these.

- Can I synthesize U+F892 and U+F893 without having those specific glyphs available in a font; say, via CSS border properties, or by superimposing existing characters?

  (Just because I have an idea, doesn't mean I like it. 😉)
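To make the CSS border idea concrete, here is an untested-in-browsers Python sketch that replaces U+F892 and U+F893 in a line of screen text with an empty one-character-wide `<span>` whose borders stand in for the two lines. Everything here is an assumption on my part: the `synthesize_bars` name, the 1px line weight, and especially `height: 1em`, which might not match the emulator's row height.

```python
import html

# Borders at the cell edges approximate "lines as far apart as possible".
_BAR_STYLE = {
    "\uF892": "border-left:1px solid;border-right:1px solid",  # double vertical
    "\uF893": "border-top:1px solid;border-bottom:1px solid",  # double horizontal
}
# One monospace cell: 1ch wide, borders drawn inside the box.
_BASE = "display:inline-block;box-sizing:border-box;width:1ch;height:1em;"


def synthesize_bars(text: str) -> str:
    """Return HTML in which U+F892/U+F893 are drawn with CSS borders, not glyphs."""
    out = []
    for ch in text:
        style = _BAR_STYLE.get(ch)
        if style is None:
            out.append(html.escape(ch))  # ordinary character: keep it, escaped
        else:
            out.append(f'<span style="{_BASE}{style}"></span>')
    return "".join(out)


if __name__ == "__main__":
    print(synthesize_bars("NAME\uF892VALUE"))
```

The obvious drawback is that the result is no longer plain text: copying from the page loses the separator characters.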
Hi @BoldMonday,
In issue #93, you commented:
I gather this means that, when IBM briefed you about requirements and use cases for IBM Plex Mono, they didn’t even mention 3270 terminals.
And yet, IBM are using IBM Plex Mono to present 3270 terminal screen captures in IBM product docs! Apparently, without considering whether the font is fit for purpose; that is, whether it supports all of the required characters.
It doesn't. So, for some 3270 screens, documentation writers must resort to using bitmapped screen captures.
You've recently added box-drawing characters (issue #93). That's great, thanks!
However, as I commented in issue #93, box-drawing characters are only a subset of the APL characters that 3270 terminal screens can display. See EBCDIC code page 310 in the Wikipedia article "Digital encoding of APL symbols".
For more details, I recommend that you contact IBM. They are the 3270 experts, and they made the decision to use IBM Plex Mono to present 3270 screen captures in their product docs.