PDP-10 / supdup

Community maintained SUPDUP client for Unix

Output SAIL charset as Unicode #3

Closed by larsbrinkhoff 7 years ago

larsbrinkhoff commented 7 years ago

Here's a mapping between the SAIL charset and Unicode:
http://www.saildart.org/allow/sail-charset-utf8.html

larsbrinkhoff commented 7 years ago

@lokedhs is working on this.

larsbrinkhoff commented 7 years ago

I have found five places that define the SAIL/ITS character set.

The first three mostly agree with each other. The last two are identical to each other, but differ from the first three.

larsbrinkhoff commented 7 years ago

I made a table to show where they differ.

OCTAL ASCII 1963 CHAR SAIL SailDart RFC 698 SUPDUP ASCII 1967
000 · ·
011 γ γ
012 δ δ
013
014 ± ±
015
026
030 _ _ _
032 EOF ~ ~
033
136 ^ ^
137 _ _
175 ALTMODE } }
176 ESC } } } ~ ~
177 DEL ^ DEL

larsbrinkhoff commented 7 years ago

So I'm guessing the SAIL charset is slightly different from the ITS charset.

lokedhs commented 7 years ago

Commit 1b17f0d71fbaf934efeb1bc9bcbb8b089e51df61 implements these changes. I also enabled %TOSA1 which causes these characters to be displayed properly.

lokedhs commented 7 years ago

With the latest version of the Unicode support, most things work, but there are some strange behaviours and I can't really figure out what the cause is. In particular, the C-x character is mapped to left-arrow. If I type C-x twice, I get the following output: ↑X↑X. Pressing backspace once deletes this and replaces it by a single .

Note that C-x and _ are both accepted by ITS when used as a redirection character.

larsbrinkhoff commented 7 years ago

There is some good information here, in particular about the 1963 and 1967 versions of ASCII:
http://worldpowersystems.com/J/codes/

Note that the 1963 version had ↑ and ← instead of ^ and _. Since the PDP-6 was born in 1963, it would have used the old glyphs in its software and hardware peripherals. ITS was first implemented on the PDP-6, and became operational in 1967. I suppose it didn't adapt immediately to the revised ASCII definition, so the old glyphs probably lingered on for a while. As new peripherals using the updated glyphs were attached, it probably became acceptable to view ↑/^ and ←/_ as interchangeable.
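
The 1963/1967 difference described above can be checked mechanically. The following is a minimal sketch, not code from the supdup repository: the helper name `charset_diff` and the two tiny sample tables are made up for illustration, and only the glyphs at 136 and 137 come from the discussion here.

```python
# Hypothetical helper: list the codes where two charset tables disagree.
def charset_diff(a, b):
    """Return (octal code, glyph in a, glyph in b) for each differing code."""
    rows = []
    for code in sorted(a.keys() | b.keys()):
        glyph_a = a.get(code, "")
        glyph_b = b.get(code, "")
        if glyph_a != glyph_b:
            rows.append((f"{code:03o}", glyph_a, glyph_b))
    return rows


# Sample data only: three codes, not the full tables.
ASCII_1963 = {0o101: "A", 0o136: "↑", 0o137: "←"}
ASCII_1967 = {0o101: "A", 0o136: "^", 0o137: "_"}

for octal, old, new in charset_diff(ASCII_1963, ASCII_1967):
    print(octal, old, new)
```

Codes that agree in both tables (such as 101) are skipped, so only the rows that need attention are printed.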

One important peripheral was the Knight TV terminal. These were the native and canonical ITS terminals for a long time. I wonder which character set they used?

larsbrinkhoff commented 7 years ago

@ams provided information about the CONS/CADR character set. For the codes 000-176, it's identical to SUPDUP. 177 is NUL, not integral.

lokedhs commented 7 years ago

Does that mean that with the exception of 177, the current version is correct?

ams commented 7 years ago

Erm, sorry. That was a cut and paste error. 177 is integral.

See http://bitsavers.trailing-edge.com/pdf/mit/cadr/chinual_6thEd_Jan84/chineualJun84_10_ChrsAndStrs.pdf for the list. Page 7 contains the char. set.

larsbrinkhoff commented 7 years ago

@lokedhs I think you map 173-176 to the SAIL charset, which is different from ITS.

ams commented 7 years ago

> Does that mean that with the exception of 177, the current version is correct?

Yes.

lokedhs commented 7 years ago

@larsbrinkhoff what should 173-176 be mapped to then?

larsbrinkhoff commented 7 years ago

Same as SUPDUP above: { | } ~. Well, 173 and 174 are already ok, but it kinda looks like they get some special treatment.

ams commented 7 years ago

For reference, this is the Lisp Machine char. set:

000 center-dot                  040 space       100 @           140 `
001 down arrow                  041 !           101 A           141 a
002 alpha                       042 "           102 B           142 b
003 beta                        043 #           103 C           143 c
004 and-sign                    044 $           104 D           144 d
005 not-sign                    045 %           105 E           145 e
006 epsilon                     046 &           106 F           146 f
007 pi                          047 '           107 G           147 g
010 lambda                      050 (           110 H           150 h
011 gamma                       051 )           111 I           151 i
012 delta                       052 *           112 J           152 j
013 uparrow                     053 +           113 K           153 k
014 plus-minus                  054 ,           114 L           154 l
015 circle-plus                 055 -           115 M           155 m
016 infinity                    056 .           116 N           156 n
017 partial delta               057 /           117 O           157 o
020 left horseshoe              060 0           120 P           160 p
021 right horseshoe             061 1           121 Q           161 q
022 up horseshoe                062 2           122 R           162 r
023 down horseshoe              063 3           123 S           163 s
024 universal quantifier        064 4           124 T           164 t
025 existential quantifier      065 5           125 U           165 u
026 circle-X                    066 6           126 V           166 v
027 double-arrow                067 7           127 W           167 w
030 left arrow                  070 8           130 X           170 x
031 right arrow                 071 9           131 Y           171 y
032 not-equals                  072 :           132 Z           172 z
033 diamond (altmode)           073 ;           133 [           173 {
034 less-or-equal               074 <           134 \           174 |
035 greater-or-equal            075 =           135 ]           175 }
036 equivalence                 076 >           136 ^           176 ~
037 or                          077 ?           137 _           177 integral
200 Null character     210 Overstrike    220 Stop-output   230 Roman-iv
201 Break              211 Tab           221 Abort         231 Hand-up
202 Clear              212 Line          222 Resume        232 Hand-down
203 Call               213 Delete        223 Status        233 Hand-left
204 Terminal escape    214 Page          224 End           234 Hand-right
205 Macro/backnext     215 Return        225 Roman-i       235 System
206 Help               216 Quote         226 Roman-ii      236 Network
207 Rubout             217 Hold-output   227 Roman-iii
237-377 reserved for the future

                    The Lisp Machine Character Set
            (all numbers in octal)
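
For illustration, the graphic codes in the table above (000-037 and 177) could be translated to Unicode roughly as follows. This is a sketch under assumptions, not the mapping used by the supdup client: the names `LISPM_GRAPHICS`, `lispm_to_unicode` and `decode` are hypothetical, and the choice of Unicode character for each glyph name (for example ◊ for the altmode diamond) is my own guess. Codes 200 and above are not handled.

```python
# Assumed Unicode equivalents for the graphic codes 000-037 and 177 (octal).
LISPM_GRAPHICS = {
    0o000: "·",  # center-dot
    0o001: "↓",  # down arrow
    0o002: "α",  # alpha
    0o003: "β",  # beta
    0o004: "∧",  # and-sign
    0o005: "¬",  # not-sign
    0o006: "ε",  # epsilon
    0o007: "π",  # pi
    0o010: "λ",  # lambda
    0o011: "γ",  # gamma
    0o012: "δ",  # delta
    0o013: "↑",  # uparrow
    0o014: "±",  # plus-minus
    0o015: "⊕",  # circle-plus
    0o016: "∞",  # infinity
    0o017: "∂",  # partial delta
    0o020: "⊂",  # left horseshoe
    0o021: "⊃",  # right horseshoe
    0o022: "∩",  # up horseshoe
    0o023: "∪",  # down horseshoe
    0o024: "∀",  # universal quantifier
    0o025: "∃",  # existential quantifier
    0o026: "⊗",  # circle-X
    0o027: "↔",  # double-arrow
    0o030: "←",  # left arrow
    0o031: "→",  # right arrow
    0o032: "≠",  # not-equals
    0o033: "◊",  # diamond (altmode)
    0o034: "≤",  # less-or-equal
    0o035: "≥",  # greater-or-equal
    0o036: "≡",  # equivalence
    0o037: "∨",  # or
    0o177: "∫",  # integral
}


def lispm_to_unicode(code):
    """Translate one code in the range 000-177 (octal) to a Unicode string."""
    if not 0 <= code <= 0o177:
        raise ValueError(f"code {code:#o} is outside 000-177")
    # Codes 040-176 are plain ASCII and pass through unchanged.
    return LISPM_GRAPHICS.get(code, chr(code))


def decode(data):
    """Translate a bytes object of 7-bit codes to a Unicode string."""
    return "".join(lispm_to_unicode(b) for b in data)


print(decode(bytes([0o002, 0o075, 0o003])))
```

Keeping the graphics in a dictionary and letting 040-176 fall through to `chr` means the table only has to list the 33 codes that differ from ASCII.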

larsbrinkhoff commented 7 years ago

The Knight TV source file SYSTEM; TV 132 has the font. And it's identical to the SUPDUP and CADR charset. I'd say that's rather conclusive.

larsbrinkhoff commented 7 years ago

Fixed by #10.

larsbrinkhoff commented 5 years ago

@ams, I don't see many of the mathematical and APL symbols in the above list.

ams commented 5 years ago

That is right. One would switch the displayed char. set using a special control char. sequence.