cnvogelg / amitools

Various tools for using AmigaOS programs on other platforms

vamos does not interpret AmigaOS escape sequences #166

Open reinauer opened 2 years ago

reinauer commented 2 years ago

Escape sequences like 9B 4B or 9B 4A are entirely ignored in the output, leading to interesting artifacts:

vamos -c /home/reinauer/project/vamosrc -- hx68 -A magic.asm -oobj/magic.o -iinclude:
�F�KAssembled magic.asm to obj/magic.o.

cnvogelg commented 2 years ago

What's happening here is that vamos sends the raw output of the program to stdout and from there to your terminal emulator. That is where the interpretation happens: in a modern UTF-8 terminal the byte sequences are interpreted differently than in the Latin-1 encoding the Amiga uses.

So a workaround here is to change the charset in your terminal to latin-1.
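
To illustrate the difference, here is a small Python sketch (not part of vamos) showing how the single byte 0x9b decodes under each encoding. The variable names are made up for this example.

```python
# Sketch only: how the Amiga's 8-bit CSI byte decodes under two encodings.
CSI_8BIT = b"\x9b"

# In Latin-1 every byte maps to the code point of the same value,
# so 0x9b becomes U+009B, the C1 control character CSI.
latin1_char = CSI_8BIT.decode("latin-1")

# In UTF-8 a lone 0x9b is a continuation byte without a lead byte,
# which is invalid; a lenient decoder substitutes U+FFFD.
utf8_char = CSI_8BIT.decode("utf-8", errors="replace")

print(f"latin-1: U+{ord(latin1_char):04X}")
print(f"utf-8:   U+{ord(utf8_char):04X}")
```

U+FFFD is the replacement character, which would be consistent with the artifacts in the report above.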

Another, more convenient solution is to have vamos convert the 8-bit ANSI control chars to their 7-bit ESC notation, which modern UTF-8 terminals understand as well. However, this needs some infrastructure in vamos, namely a distinct console-handler-like output path, because the conversion should only be applied on this path (and not, e.g., for file output). It's on my feature list. I'll take this ticket to track the feature.
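
A minimal sketch of such a conversion in Python, assuming the data is plain 8-bit (Latin-1) output. The function name `c1_to_7bit` is hypothetical and this is not how vamos implements it. It relies on the standard equivalence between an 8-bit C1 control byte (0x80 to 0x9f) and ESC followed by that byte minus 0x40, so 0x9b becomes ESC [.

```python
ESC = 0x1B


def c1_to_7bit(data: bytes) -> bytes:
    """Rewrite each 8-bit C1 control byte as ESC plus its 7-bit final byte.

    Assumes `data` is Latin-1 style output; applying this to UTF-8 text
    would corrupt multi-byte characters, since those also use 0x80-0x9f.
    """
    out = bytearray()
    for byte in data:
        if 0x80 <= byte <= 0x9F:
            out.append(ESC)
            out.append(byte - 0x40)  # e.g. 0x9b -> 0x5b, which is "["
        else:
            out.append(byte)
    return bytes(out)


converted = c1_to_7bit(b"\x9b31mhello\x9b0m world\n")
print(converted)
```

As noted above, a conversion like this only makes sense on the console path, since file output should keep the original bytes.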

wepl commented 2 years ago

Some time ago I tried to find a terminal emulation which understands the Amiga escape sequences, but didn't succeed. A solution for this problem would be welcome.

reinauer commented 2 years ago

I can confirm that this is not a latin-1 issue. Well, I should say, switching both LANG and the GNOME terminal's encoding to iso8859-1 did not make a difference for these control characters.

I had tried something like this, but it didn't make a difference either:

diff --git a/amitools/vamos/filter.py b/amitools/vamos/filter.py
new file mode 100644
index 0000000..550790f
--- /dev/null
+++ b/amitools/vamos/filter.py
@@ -0,0 +1,20 @@
+class AmigaIOFilter(object):
+       def __init__(self, stream):
+               self.strings_to_filter = [ '\x9bK', '\x9bL', '\x9bF' ]
+               self.stream = stream
+
+       def __getattr__(self, attr_name):
+               return getattr(self.stream, attr_name)
+
+       def write(self, data):
+               for string in self.strings_to_filter:
+                       data = data.replace(string, "**")
+                       data += "***"
+                       # data = re.sub(r'\b{0}\b'.format(string), '*' * len(string), data)
+
+               self.stream.write(data)
+               self.stream.flush()
+
+       def flush(self):
+               self.stream.flush()
+
diff --git a/amitools/vamos/main.py b/amitools/vamos/main.py
index 3eed36b..a8ff9af 100644
--- a/amitools/vamos/main.py
+++ b/amitools/vamos/main.py
@@ -1,7 +1,9 @@
 import cProfile
 import io
 import pstats
+import sys

+from .filter import AmigaIOFilter
 from .cfg import VamosMainParser
 from .machine import Machine, MemoryMap
 from .machine.regs import REG_D0
@@ -29,6 +31,10 @@ def main(cfg_files=None, args=None, cfg_dict=None, profile=False):
     if an internal error occurred then return:
       RET_CODE_CONFIG_ERROR (1000): config error
     """
+
+    sys.stdout = AmigaIOFilter(sys.stdout)
+    sys.stderr = AmigaIOFilter(sys.stderr)
+
     # --- parse config ---
     mp = VamosMainParser()
     if not mp.parse(cfg_files, args, cfg_dict):

cnvogelg commented 2 years ago

The above approach wouldn't work since sys.stdout is the text stream with encoding (that defaults to utf-8). You have to adapt the underlying raw stream.
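
A hedged sketch of what filtering at the byte level could look like. The class name `ByteFilter` is hypothetical, the demonstration writes to an in-memory stream, and whether vamos output actually passes through `sys.stdout` is not established in this thread.

```python
import io

# A text stream encodes str on the way out: U+009B leaves a UTF-8 stream
# as two bytes, so a str-level filter never emits the raw byte 0x9b.
assert "\x9b".encode("utf-8") == b"\xc2\x9b"


class ByteFilter:
    """Hypothetical wrapper that rewrites bytes before a binary stream sees them."""

    def __init__(self, raw):
        self.raw = raw

    def write(self, data: bytes) -> int:
        # Replace the 8-bit CSI with its 7-bit form, ESC [.
        self.raw.write(data.replace(b"\x9b", b"\x1b["))
        return len(data)

    def flush(self) -> None:
        self.raw.flush()


# Demonstration against an in-memory binary stream; a real program would
# wrap sys.stdout.buffer instead (assumption, not tested with vamos).
sink = io.BytesIO()
out = ByteFilter(sink)
out.write(b"\x9b31mhello\x9b0m world\n")
out.flush()
print(sink.getvalue())
```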

But to validate the initial assumption, let's try this:

Create a short sample text with both the normal ESC [ codes and the 8-bit CSI (0x9b) codes:

$ echo -e '\x1b[31mhello\x1b[0m world' > color1.txt
$ echo -e '\x9b31mhello\x9b0m world' > color2.txt
$ xxd color1.txt
00000000: 1b5b 3331 6d68 656c 6c6f 1b5b 306d 2077  .[31mhello.[0m w
00000010: 6f72 6c64 0a                             orld.
$ xxd color2.txt
00000000: 9b33 316d 6865 6c6c 6f9b 306d 2077 6f72  .31mhello.0m wor
00000010: 6c64 0a                                  ld.

Passing these through vamos produces unaltered output (as expected):

$ vamos type color1.txt > vamos1.txt
$ xxd vamos1.txt
00000000: 1b5b 3331 6d68 656c 6c6f 1b5b 306d 2077  .[31mhello.[0m w
00000010: 6f72 6c64 0a                             orld.
$ vamos type color2.txt > vamos2.txt
$ xxd vamos2.txt
00000000: 9b33 316d 6865 6c6c 6f9b 306d 2077 6f72  .31mhello.0m wor
00000010: 6c64 0a                                  ld.

Now, what happens when these codes are printed to a terminal? I have iTerm2 running on macOS here and set the terminal encoding to Latin-1 (Preferences -> Profiles -> Terminal -> Character Encoding: Western ISO (Latin 1)). Note: I did not alter any LANG or LC_CTYPE variables. They do not influence this raw output.

And voilà, it looks good to me:

[Screenshot 2022-01-04 at 09 53 42]

cnvogelg commented 2 years ago

Same experiment on Debian 11 using gnome-terminal, with the encoding set via Edit -> Preferences -> Profiles -> Compatibility -> Encoding -> Obsolete Encoding -> Western ISO 8859-1.

Result:

[Screenshot 2022-01-04 at 10 08 33]

reinauer commented 2 years ago

Bizarre. I do the same thing on Ubuntu 21.10 and color1.txt works fine while color2.txt does not. BTW, the behavior is the same here regardless of the encoding.

gnome-terminal --version
# Locale not supported by C library.
#   Using the fallback 'C' locale.
# GNOME Terminal 3.38.1 using VTE 0.64.2 +BIDI +GNUTLS +ICU +SYSTEMD

I tried xterm which behaves like you suggest, and konsole, which also behaves that way after switching to ISO8859-1. So I assume I have to go debug my gnome-terminal instead of filing bugs here. Sorry for the noise ;)

webbasan commented 2 years ago

Back in the day I created some termcap/terminfo entries for the Amiga to get better remote shell support. I guess that would be the right approach. I still have to check it myself, but if anyone wants to try it out, I am attaching the terminfo variant rlamiga.gz.

reinauer commented 2 years ago

> Back in the day I created some termcap/terminfo entries for the Amiga to get better remote shell support. I guess that would be the right approach. I still have to check it myself, but if anyone wants to try it out, I am attaching the terminfo variant rlamiga.gz.

With this one, and after adding the following line, I no longer get weird characters in my output: clr_eol=\EK, ll=\EF,