Open tajmone opened 5 years ago
Thanks for the feedback. Not an ALAN user so this is very helpful.
I've edited the original Issue to add a request for converting ALAN sources to ISO-8859-1 encoding when exporting projects to ALAN.
I would like to add a few considerations here.
In the pre-Unicode era, when code pages and ISO encodings were still the norm, most editors and IDEs had problems handling Unicode/UTF-8 sources. Today the contrary is true, with most editors assuming UTF-8 as the base encoding and switching to it whenever a multi-byte character is pasted into a source.
Setting a source file to ISO-8859-1 encoding is a manual operation, since there's no BOM-like marker to signal that a source file uses a particular encoding other than ASCII. When an out-of-range character (> $FF) is present in the clipboard during paste operations, the editor will have to switch to UTF-8 in order to accommodate the character (smart editors might prevent the paste operation). If the clipboard contains a valid ISO character stored as a two-byte UTF-8 sequence (i.e. ISO chars $80-$FF), the editor will either switch the source file to UTF-8 encoding or convert the clipboard on the fly to preserve the correct encoding.
The problematic ISO characters are those in the 128-255/$80-$FF range, which include a few commonly used symbols and letters present in many European languages (see ISO-8859-1 code page layout on Wikipedia):
£ ¥ © ¿
À Á Â Ã Ä Å È É Ê Ë Ì Í Î Ï Ò Ó Ô Õ Ö Ù Ú Û Ü
à á â ã ä å è é ê ë ì í î ï ò ó ô õ ö ù ú û ü
Æ Ç Ð Ñ Ø Ý Þ ß
æ ç ð ñ ø ý þ ÿ
As the above table shows, many of these characters are fundamental for writing adventures in many languages other than English, and need to be correctly handled when exporting Trizbort maps to source code for any IF system that does not support UTF-8 sources.
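To make the point about the $80-$FF range concrete, here is a small Python sketch (not Trizbort code; the function names are made up for illustration). It shows that such a character takes one byte in ISO-8859-1 but two in UTF-8, and what a tool that assumes ISO-8859-1 ends up reading if it is handed the UTF-8 bytes instead.

```python
def byte_lengths(ch: str):
    """Return (bytes in ISO-8859-1, bytes in UTF-8) for a single character."""
    return len(ch.encode("iso-8859-1")), len(ch.encode("utf-8"))


def misread_as_latin1(text: str) -> str:
    """Decode the UTF-8 bytes of `text` as if they were ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


print(byte_lengths("A"))               # plain ASCII: same size in both encodings
print(byte_lengths("\u00e9"))          # e-acute: one byte vs two bytes
print(misread_as_latin1("caf\u00e9"))  # the two UTF-8 bytes show up as two characters
```

Pure ASCII text is byte-identical in both encodings, which is why the problem only surfaces once an accented letter or symbol from the table above is used.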
A quick alternative to adding an internal encoding converter to Trizbort could be to let end users attach filters to code export operations for specific IF systems, i.e. specify a tool of their choice through which the generated source is piped (before saving) at export time. On OSs that natively ship with the iconv tool, this could be the default filter preset. Other OSs would require end users to install and configure a tool of their choice.
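As a rough illustration of the filter idea, here is a Python sketch of piping generated source through an external command before saving. Trizbort is not written in Python and has no such hook today; run_export_filter, ICONV_PRESET and PYTHON_PRESET are hypothetical names, and the only real tool interface assumed is iconv's -f/-t options.

```python
import subprocess
import sys
from typing import Sequence

# Hypothetical default preset for systems that ship iconv
# (-f and -t select the source and target encodings).
ICONV_PRESET = ["iconv", "-f", "UTF-8", "-t", "ISO-8859-1"]

# Portable stand-in filter, handy for trying this out where iconv is missing.
PYTHON_PRESET = [
    sys.executable,
    "-c",
    "import sys; data = sys.stdin.buffer.read(); "
    "sys.stdout.buffer.write(data.decode('utf-8').encode('iso-8859-1'))",
]


def run_export_filter(source_text: str, command: Sequence[str]) -> bytes:
    """Pipe the generated source (as UTF-8) through `command`; return its output bytes."""
    result = subprocess.run(
        list(command),
        input=source_text.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the filter exits with an error
    )
    return result.stdout
```

The caller would write the returned bytes to disk unchanged, so the choice of target encoding stays entirely with the user-configured tool.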
There are a few problems when exporting maps to Alan (i.e. Alan 3, the latest incarnation of Alan) which need to be addressed; each is described in its own section below.
Hopefully, these will improve Trizbort support for Alan IF authoring.
Don't Add BOM to Source File
The exported ALAN sources contain a BOM, which prevents compiling the generated map-code — indeed, even pasting parts of it into an Alan source could break its encoding in many editors (i.e. cause the editor to switch to UTF-8 encoding).
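For reference, the UTF-8 BOM is the three-byte sequence EF BB BF at the start of the file. The Python sketch below (illustrative only; strip_utf8_bom is a made-up helper, not part of Trizbort) shows how a writer can end up emitting it and how it can be detected and dropped.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = "The kitchen".encode("utf-8-sig")  # the "utf-8-sig" codec prepends the BOM
without_bom = "The kitchen".encode("utf-8")   # plain "utf-8" does not

print(with_bom[:3] == codecs.BOM_UTF8)          # True
print(strip_utf8_bom(with_bom) == without_bom)  # True
```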
Export as ISO-8859-1
Alan source files must be in ISO-8859-1 (there are some other possible encodings, but they are all single-byte encodings). Exporting the file as UTF-8 without BOM should generally be safe enough if the user didn't use any non-ASCII characters in the project (Alan users are aware of the problem).
If Trizbort could enforce conversion to ISO-8859-1 at export time, it would be safer, because it would correctly represent those valid ISO-8859-1 characters which are encoded as two-byte sequences in UTF-8, i.e. ISO chars in the 128-255 ($80-$FF) range, which include some currency symbols, and vowels and consonants with accents, umlauts, other diacritical marks or ligatures used in some European alphabets.
Adding ISO conversion would introduce the complication of how to handle out-of-range characters (i.e. chars beyond 255/$FF). The problem here is that users might use Unicode characters in the Trizbort project to correctly represent text in the map image/PDF file, but will need to omit them from the generated Alan sources, where they are not supported.
Alan users should just stick to using only valid ISO-8859-1 characters in their Trizbort projects, to avoid problems when exporting to Alan source.
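One possible way to deal with out-of-range characters, sketched in Python under the assumption that replacing them with a placeholder and reporting them to the user is acceptable. That policy is my assumption rather than something decided in this issue, and to_latin1 is a hypothetical name.

```python
def to_latin1(text: str, replacement: str = "?"):
    """Encode `text` as ISO-8859-1.

    Characters above $FF are swapped for `replacement`; the distinct
    offenders are returned so the exporter can warn the user about them.
    """
    unsupported = sorted({ch for ch in text if ord(ch) > 0xFF})
    cleaned = "".join(ch if ord(ch) <= 0xFF else replacement for ch in text)
    return cleaned.encode("iso-8859-1"), unsupported


data, bad = to_latin1("Caf\u00e9 \u2192 Bar")  # U+2192 (right arrow) is beyond $FF
print(data)  # the e-acute becomes the single byte E9, the arrow becomes "?"
print(bad)   # list containing the arrow character
```

Aborting the export or skipping the affected text would be equally valid policies; the sketch only shows that the offending characters are easy to detect at export time.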
Identifiers Within Single Quotes
Room, object and exit identifiers in exported Alan maps should also always be enclosed in single quotes, like Trizbort does with NAME. Currently, an exported map location looks like this:
whereas it should be:
This would be a safe approach, even with single-word identifiers (where quoting might not be required), because enclosing an ID in single quotes strops Alan keywords so that they can be used as identifiers, e.g.:
Where The and Exit are Alan keywords, but can be safely used in the room and exit IDs since they are stropped by single quotes.
Escaping Quotes in Identifiers
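The always-quote rule from the previous section and the apostrophe doubling this section asks for can be combined in a single helper. The following is a minimal Python sketch, not Trizbort's actual exporter code; quote_alan_id is a hypothetical name.

```python
def quote_alan_id(identifier: str) -> str:
    """Wrap an identifier in single quotes, doubling any embedded apostrophes."""
    return "'" + identifier.replace("'", "''") + "'"


print(quote_alan_id("The Exit"))    # 'The Exit'
print(quote_alan_id("Bob's Room"))  # 'Bob''s Room'
```

Quoting unconditionally keeps the exporter simple, since it never has to check an identifier against the list of Alan keywords.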
If an identifier contains an apostrophe (i.e. a single quote character, ') it must be escaped by doubling it:
Unconditional Exits Syntax
When an Exit has no Check clause it should not be followed by an "End exit." statement:
should be: