Closed Mega-Tom closed 7 years ago
I think this is a great idea, and it's something I've been thinking about for a while. I like this plan, but it overcomplicates some things. It means we'd have 4 different (possibly contradicting) flags just for the various IO modes. Also, what should happen if the -u and -a flags ever contradict each other?
I have a proposal that would simplify the whole thing though. What if we dropped the idea of "UTF-8 IO" entirely, and then changed "ASCII IO" to Unicode IO? I think this has many advantages. For example:
I like this approach much more, but I'd like to hear what @Wheatwizard thinks before going and implementing it, especially since I remember them being opposed to this idea before.
I prefer the different ASCII and UTF-8 flags, but I do agree that the flags as they are now are problematic. I would propose a new 6-flag system (4 flags if we decide to collapse UTF-8 and ASCII):
-a sets ASCII input but does not change output
-A sets ASCII output but does not change input
-d sets decimal input but does not change output
-D sets decimal output but does not change input
-u sets UTF-8 input but does not change output
-U sets UTF-8 output but does not change input
These flags can be combined to make all of the possible input and output schemes.
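As a rough sketch of how the six flags above could combine, the Python below maps each flag to the one side it changes. It is an illustration, not the interpreter's code: the names FLAGS and apply_flags are hypothetical, and the rule that a later flag overrides an earlier one on the same side is an assumption the thread does not settle.

```python
# Hypothetical table for the proposed 6-flag system:
# each flag changes exactly one side (input or output).
FLAGS = {
    "-a": ("input", "ascii"),
    "-A": ("output", "ascii"),
    "-d": ("input", "decimal"),
    "-D": ("output", "decimal"),
    "-u": ("input", "utf8"),
    "-U": ("output", "utf8"),
}


def apply_flags(flags):
    """Return (input_mode, output_mode) after applying flags left to right."""
    # Decimal is the default on both sides, as stated in the thread.
    modes = {"input": "decimal", "output": "decimal"}
    for flag in flags:
        side, mode = FLAGS[flag]
        # Assumption: when two flags touch the same side, the later one wins.
        modes[side] = mode
    return modes["input"], modes["output"]


print(apply_flags(["-u", "-D"]))  # ('utf8', 'decimal')
print(apply_flags(["-a", "-u"]))  # ('utf8', 'decimal')
```

Under that assumption two input flags never contradict each other; they are simply applied in order.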
The problem is that flags are binary while I/O formats are trinary. Codewise, your problem is that booleans contradict each other. You need an enumeration with three states: inputMode = 1 for decimal, 2 for ASCII or 3 for UTF-8.
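A minimal sketch of that enumeration idea, written in Python for illustration only. The numeric values follow the comment above; the names IOMode and IOSettings, and the separate output_mode field, are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class IOMode(Enum):
    # Values taken from the comment: 1 decimal, 2 ASCII, 3 UTF-8.
    DECIMAL = 1
    ASCII = 2
    UTF8 = 3


@dataclass
class IOSettings:
    # Hypothetical container: one three-state value per side instead of
    # several booleans that can disagree with each other.
    input_mode: IOMode = IOMode.DECIMAL
    output_mode: IOMode = IOMode.DECIMAL


settings = IOSettings(input_mode=IOMode.UTF8)
print(settings.input_mode.name, settings.output_mode.name)  # UTF8 DECIMAL
```

Because each side holds exactly one value, a state such as "ASCII and UTF-8 input at once" cannot be represented.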
@Wheatwizard That seems too complicated to me. One, because decimal is the default, so there's no reason to specify it, and two, because I don't see why it's necessary to have a distinction between ASCII and UTF-8. I'd like to do it like this:
-a sets ASCII/UTF8 input but does not change output
-A sets ASCII/UTF8 output but does not change input
-c sets ASCII/UTF8 input and output.
Obviously, if we did this we'd have to change ASCII's "Wrap at 127" to "Wrap at 65,536". Do you have any objections?
@DJMcMayhem I think your idea for the flags is the best one proposed so far, but I think the distinction between ASCII output and Unicode output is important (I totally agree that we can do away with ASCII input though). My proposal would be to add either one or two new flags to handle Unicode output. The modifications to existing I/O flags:
-a sets ASCII/UTF8 input but does not change output
-A sets ASCII output but does not change input
-c sets ASCII output and ASCII/UTF-8 input
The new flag:
-U sets UTF-8 output but does not change input
We might want to add the following flag as well, to shave an extra byte off programs that take ASCII/UTF-8 input and produce UTF-8 output:
-C sets UTF-8 output and ASCII/UTF-8 input
Overall, this only increases the number of flags by one. The one thing that irks me about this scheme is that it lacks a method for returning to decimal output from ASCII output. I also think we should keep the -u flag around for backwards compatibility but either mark it as deprecated or drop it from our documentation.
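The gap mentioned in that comment can be checked mechanically. The sketch below writes the five flags as a table and looks for any flag that sets decimal output. SCHEME and flags_setting_output are hypothetical names, "text" stands in for the merged ASCII/UTF-8 input mode, and None means the flag leaves that side unchanged.

```python
# Hypothetical encoding of the scheme: flag -> (input change, output change).
SCHEME = {
    "-a": ("text", None),
    "-A": (None, "ascii"),
    "-c": ("text", "ascii"),
    "-U": (None, "utf8"),
    "-C": ("text", "utf8"),
}


def flags_setting_output(mode):
    """List the flags that switch output to the given mode."""
    return sorted(flag for flag, (_, out) in SCHEME.items() if out == mode)


print(flags_setting_output("ascii"))    # ['-A', '-c']
print(flags_setting_output("decimal"))  # []
```

The empty list for decimal reflects the point raised above: once an earlier flag has selected ASCII output, no flag in this scheme switches it back.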
Hey, what's going on with this issue? I don't care much which option we choose as long as it is not the current one. Hopefully we can get this fixed soon.
This is done in version 1.2! Closing now
Your interpreter has no way to input UTF-8 and output decimal. The print_stack command will print in UTF-8 only if the -u flag is set and the -A flag is not. The interpreter takes UTF-8 input only if both -u and -a are set. I suggest making a -U flag for UTF-8 output, and limiting -u to input.
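The behaviour described above can be modelled in a few lines to confirm the missing combination. This is a sketch of the rules as stated in this comment, not the interpreter's actual code. The function name current_modes is hypothetical, and two details are assumptions: that -a without -u gives ASCII input, and that -A always forces ASCII output.

```python
from itertools import product


def current_modes(u, a, A):
    """Model the described flag behaviour; each argument is True if that flag is set."""
    # Input: UTF-8 only when both -u and -a are set (stated in the comment).
    if u and a:
        inp = "utf8"
    elif a:
        inp = "ascii"  # assumption: -a alone means ASCII input
    else:
        inp = "decimal"
    # Output: UTF-8 only when -u is set and -A is not (stated in the comment).
    if A:
        out = "ascii"  # assumption: -A forces ASCII output
    elif u:
        out = "utf8"
    else:
        out = "decimal"
    return inp, out


# Try every combination of the three flags and collect the reachable pairs.
reachable = {current_modes(u, a, A) for u, a, A in product([False, True], repeat=3)}
print(("utf8", "decimal") in reachable)  # False
```

Under these assumptions UTF-8 input always requires -u, and -u without -A forces UTF-8 output, so the UTF-8 in, decimal out pair never appears.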