Closed larsbrinkhoff closed 7 years ago
@lokedhs is working on this.
I have found five places that define the SAIL/ITS character set.
The first three mostly agree with each other. The last two are identical, but different from the first.
I made a table to show where they differ.
OCTAL | ASCII 1963 | CHAR SAIL | SailDart | RFC 698 | SUPDUP | ASCII 1967 |
---|---|---|---|---|---|---|
000 | · | · | ||||
011 | γ | γ | ||||
012 | δ | δ | ||||
013 | ∫ | ↑ | ||||
014 | ± | ± | ||||
015 | ⊕ | ⊕ | ||||
026 | ⊕ | ⊗ | ⊗ | ⊗ | ||
030 | _ | _ | _ | ← | ||
032 | EOF | ~ | ~ | ≠ | ||
033 | ≠ | ≠ | ≠ | ◊ | ||
136 | ↑ | ↑ | ↑ | ↑ | ^ | ^ |
137 | ← | ← | ← | ← | _ | _ |
175 | ALTMODE | ⎇ | ◊ | } | } | |
176 | ESC | } | } | } | ~ | ~ |
177 | DEL | ␈ | ^ | ∫ | DEL |
So I'm guessing the SAIL charset is slightly different from the ITS charset.
Commit 1b17f0d71fbaf934efeb1bc9bcbb8b089e51df61 implements these changes. I also enabled %TOSA1 which causes these characters to be displayed properly.
With the latest version of the Unicode support, most things work, but there are some strange behaviours and I can't figure out the cause. In particular, the `C-x` character is mapped to left-arrow. If I type `C-x` twice, I get the following output: `↑X↑X`. Pressing backspace once deletes this and replaces it with a single `←`.
Note that `C-x` and `_` are both accepted by ITS when used as a redirection character.
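One possible reason `C-x` ends up as a left arrow: under the usual ASCII convention, Control keeps only the low five bits of the letter, which puts `C-x` at octal 030, the code listed as left arrow in the Lisp Machine set quoted in this thread. A minimal sketch of that arithmetic follows; the helper name `control_code` is made up for illustration, and it assumes the terminal sends plain 7-bit ASCII control codes rather than a separate control bit.

```python
def control_code(letter: str) -> int:
    """Return the 7-bit code for Control plus a letter.

    Assumes the plain ASCII convention: keep only the low five bits.
    """
    return ord(letter.upper()) & 0o37


code = control_code("x")
print(f"C-x -> {code:03o}")  # prints "C-x -> 030"
```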
There is some good information here, in particular about the 1963 and 1967 versions of ASCII:
http://worldpowersystems.com/J/codes/
Note that the 1963 version had ↑ and ← instead of ^ and _. Since the PDP-6 was born in 1963, it would have used the old glyphs in its software and hardware peripherals. ITS was first implemented on the PDP-6, and became operational in 1967. I suppose it didn't adapt immediately to the revised ASCII definition, so the old glyphs probably lingered on for a while. As new peripherals using the updated glyphs were attached, it probably became acceptable to view ↑/^ and ←/_ as interchangeable.
One important peripheral was the Knight TV terminal. These were the native and canonical ITS terminals for a long time. I wonder which character set they used?
@ams provided information about the CONS/CADR character set. For the codes 000-176, it's identical to SUPDUP. 177 is NUL, not integral.
Does that mean that with the exception of 177, the current version is correct?
Erm, sorry. That was a cut and paste error. 177 is integral.
See http://bitsavers.trailing-edge.com/pdf/mit/cadr/chinual_6thEd_Jan84/chineualJun84_10_ChrsAndStrs.pdf for the list. Page 7 contains the char. set.
@lokedhs I think you map 173-176 to the SAIL charset which is different from ITS.
> Does that mean that with the exception of 177, the current version is correct?

Yes.
@larsbrinkhoff what should 173-176 be mapped to then?
Same as SUPDUP above: { | } ~. Well, 173 and 174 are already ok, but it kinda looks like they get some special treatment.
For reference, this is the Lisp Machine char. set:
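Code | Character | Code | Character | Code | Character | Code | Character |
---|---|---|---|---|---|---|---|
000 | center-dot | 040 | space | 100 | @ | 140 | \` |
001 | down arrow | 041 | ! | 101 | A | 141 | a |
002 | alpha | 042 | " | 102 | B | 142 | b |
003 | beta | 043 | # | 103 | C | 143 | c |
004 | and-sign | 044 | $ | 104 | D | 144 | d |
005 | not-sign | 045 | % | 105 | E | 145 | e |
006 | epsilon | 046 | & | 106 | F | 146 | f |
007 | pi | 047 | ' | 107 | G | 147 | g |
010 | lambda | 050 | ( | 110 | H | 150 | h |
011 | gamma | 051 | ) | 111 | I | 151 | i |
012 | delta | 052 | * | 112 | J | 152 | j |
013 | uparrow | 053 | + | 113 | K | 153 | k |
014 | plus-minus | 054 | , | 114 | L | 154 | l |
015 | circle-plus | 055 | - | 115 | M | 155 | m |
016 | infinity | 056 | . | 116 | N | 156 | n |
017 | partial delta | 057 | / | 117 | O | 157 | o |
020 | left horseshoe | 060 | 0 | 120 | P | 160 | p |
021 | right horseshoe | 061 | 1 | 121 | Q | 161 | q |
022 | up horseshoe | 062 | 2 | 122 | R | 162 | r |
023 | down horseshoe | 063 | 3 | 123 | S | 163 | s |
024 | universal quantifier | 064 | 4 | 124 | T | 164 | t |
025 | existential quantifier | 065 | 5 | 125 | U | 165 | u |
026 | circle-X | 066 | 6 | 126 | V | 166 | v |
027 | double-arrow | 067 | 7 | 127 | W | 167 | w |
030 | left arrow | 070 | 8 | 130 | X | 170 | x |
031 | right arrow | 071 | 9 | 131 | Y | 171 | y |
032 | not-equals | 072 | : | 132 | Z | 172 | z |
033 | diamond (altmode) | 073 | ; | 133 | [ | 173 | { |
034 | less-or-equal | 074 | < | 134 | \\ | 174 | \| |
035 | greater-or-equal | 075 | = | 135 | ] | 175 | } |
036 | equivalence | 076 | > | 136 | ^ | 176 | ~ |
037 | or | 077 | ? | 137 | _ | 177 | integral |

Code | Character | Code | Character | Code | Character | Code | Character |
---|---|---|---|---|---|---|---|
200 | Null character | 210 | Overstrike | 220 | Stop-output | 230 | Roman-iv |
201 | Break | 211 | Tab | 221 | Abort | 231 | Hand-up |
202 | Clear | 212 | Line | 222 | Resume | 232 | Hand-down |
203 | Call | 213 | Delete | 223 | Status | 233 | Hand-left |
204 | Terminal escape | 214 | Page | 224 | End | 234 | Hand-right |
205 | Macro/backnext | 215 | Return | 225 | Roman-i | 235 | System |
206 | Help | 216 | Quote | 226 | Roman-ii | 236 | Network |
207 | Rubout | 217 | Hold-output | 227 | Roman-iii | | |

237-377 reserved for the future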
The Lisp Machine Character Set
(all numbers in octal)
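
The list above could be turned into a lookup along these lines. This is only a sketch, not what the commit in this issue does: the names `LISPM_GRAPHICS` and `lispm_to_unicode` are hypothetical, and the Unicode code points are my own picks for the glyph names in the list (for instance ◊ for "diamond"). It covers codes 000-177 only and leaves the 200-236 control keys out.

```python
# Glyphs for codes 000-037, in the order given in the list above.
# The Unicode choices are assumptions, not taken from any source file.
LISPM_GRAPHICS = (
    "·↓αβ∧¬επ"   # 000-007
    "λγδ↑±⊕∞∂"   # 010-017
    "⊂⊃∩∪∀∃⊗↔"   # 020-027
    "←→≠◊≤≥≡∨"   # 030-037
)
assert len(LISPM_GRAPHICS) == 0o40


def lispm_to_unicode(code: int) -> str:
    """Map a code in 000-177 (octal) to a Unicode character."""
    if not 0 <= code <= 0o177:
        raise ValueError(f"code {code:o} is outside 000-177")
    if code < 0o40:
        return LISPM_GRAPHICS[code]
    if code == 0o177:
        return "∫"  # integral, per the list
    return chr(code)  # 040-176 match printable ASCII


print(lispm_to_unicode(0o30), lispm_to_unicode(0o175), lispm_to_unicode(0o177))
```

Keeping the 32 low glyphs in one string makes it easy to compare against the list row by row; codes 173-176 fall through to plain ASCII `{ | } ~`.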
The Knight TV source file SYSTEM; TV 132 has the font. And it's identical to the SUPDUP and CADR charset. I'd say that's rather conclusive.
Fixed by #10.
@ams, I don't see many of the mathematical and APL symbols in the above list.
That is right. One would switch the displayed char. set using a special control char. sequence.
Here's a mapping between the SAIL charset and Unicode:
http://www.saildart.org/allow/sail-charset-utf8.html