Unicode to decimal - Githubissues

Mega-Tom commented 8 years ago

Your interpreter has no way to input UTF-8 and output decimal. The print_stack command will print in UTF-8 only if the -u flag is set and the -A flag is not. The interpreter takes UTF-8 input only if both -u and -a are set.
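
To make the gap concrete, here is a minimal Python sketch of the flag logic as described above. Only the two UTF-8 rules come from this description; the function name `io_modes` and the fallbacks (that -a and -A on their own select ASCII, and that decimal is the default) are assumptions for illustration, not the interpreter's actual code.

```python
from itertools import product


def io_modes(u: bool, a: bool, A: bool) -> tuple[str, str]:
    """Return (input_mode, output_mode) for one combination of -u, -a, -A."""
    if u and a:
        input_mode = "utf8"      # UTF-8 input needs both -u and -a
    elif a:
        input_mode = "ascii"     # assumed: -a alone means ASCII input
    else:
        input_mode = "decimal"   # assumed default

    if u and not A:
        output_mode = "utf8"     # UTF-8 output needs -u without -A
    elif A:
        output_mode = "ascii"    # assumed: -A means ASCII output
    else:
        output_mode = "decimal"  # assumed default
    return input_mode, output_mode


# Enumerate all eight flag combinations and collect the reachable pairs.
reachable = {io_modes(u, a, A) for u, a, A in product([False, True], repeat=3)}
print(("utf8", "decimal") in reachable)
```

Under these assumptions the pair (UTF-8 in, decimal out) never appears: UTF-8 input forces -u, and with -u set the output is either UTF-8 or, with -A, ASCII.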

I suggest making a -U flag for UTF-8 output, and limiting the -u to input.

DJMcMayhem commented 8 years ago

I think this is a great idea, and it's something I've been thinking about for a while. I like this plan, but it overcomplicates some things. This means we'd have 4 different (possibly contradicting) flags just for various IO modes. And also, what should happen if the -u and -a flags ever contradict each other?

I have a proposal that would simplify the whole thing though. What if we dropped the idea of "UTF-8 IO" entirely, and then changed "ASCII IO" to Unicode IO? I think this has many advantages. For example:

Fewer flags to remember.
Backwards compatible, other than the fact that ASCII wraps, which to my knowledge, no programs are currently taking advantage of.
More intuitive. Rather than caring about various encodings, you can just think of IO as either "Decimal" or "Character code-points".
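
As a rough illustration of the two modes, the same stack values could be printed either as decimal numbers or as the characters at those code points. This is a Python sketch, not the interpreter's code, and all three helper names are hypothetical.

```python
def format_decimal(stack: list[int]) -> str:
    """Decimal mode: print each value as a base-10 number, one per line."""
    return "\n".join(str(value) for value in stack)


def format_code_points(stack: list[int]) -> str:
    """Character mode: treat each value as a Unicode code point."""
    return "".join(chr(value) for value in stack)


def parse_code_points(text: str) -> list[int]:
    """Character-mode input: one stack value per character."""
    return [ord(ch) for ch in text]


stack = [72, 105, 8364]            # 'H', 'i', and the euro sign U+20AC
print(format_decimal(stack))
print(format_code_points(stack))   # Hi€
print(parse_code_points("Hi€") == stack)
```

Nothing here mentions an encoding: the program only ever sees integers, and turning code points into bytes is left to the terminal or file layer.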

I like this approach much more, but I'd like to hear what @Wheatwizard thinks before going and implementing it, especially since I remember them being opposed to this idea before.

Wheatwizard commented 8 years ago

I prefer the different ASCII and UTF-8 flags, but I do agree that the flags as they are now are problematic. I would propose a new six-flag system (four flags if we decide to collapse UTF-8 and ASCII):

-a sets ASCII input but does not change output
-A sets ASCII output but does not change input
-d sets decimal input but does not change output
-D sets decimal output but does not change input
-u sets UTF-8 input but does not change output
-U sets UTF-8 output but does not change input

These flags can be combined to make all of the possible input and output schemes.

Mega-Tom commented 8 years ago

The problem is that flags are binary while I/O formats are trinary. Codewise, your problem is that booleans contradict each other. You need an enumeration with three states. inputMode = 1 for decimal, 2 for ASCII or 3 for UTF-8.
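
A three-state enumeration along these lines might look like the following Python sketch. The names `IOMode` and `resolve_modes` are hypothetical, the flag table follows the six-flag proposal above, and the rule that a later flag overrides an earlier one is an assumption about how contradictions could be settled.

```python
from enum import Enum


class IOMode(Enum):
    DECIMAL = 1
    ASCII = 2
    UTF8 = 3


# Lowercase flags set the input mode, uppercase flags set the output mode.
INPUT_FLAGS = {"-d": IOMode.DECIMAL, "-a": IOMode.ASCII, "-u": IOMode.UTF8}
OUTPUT_FLAGS = {"-D": IOMode.DECIMAL, "-A": IOMode.ASCII, "-U": IOMode.UTF8}


def resolve_modes(flags: list[str]) -> tuple[IOMode, IOMode]:
    """Fold a flag list into exactly one input mode and one output mode.

    Assumptions: decimal is the default, and a later flag overrides
    an earlier one for the same direction.
    """
    input_mode = output_mode = IOMode.DECIMAL
    for flag in flags:
        if flag in INPUT_FLAGS:
            input_mode = INPUT_FLAGS[flag]
        elif flag in OUTPUT_FLAGS:
            output_mode = OUTPUT_FLAGS[flag]
    return input_mode, output_mode


print(resolve_modes(["-u"]))
print(resolve_modes(["-a", "-u", "-A"]))
```

Because each direction holds a single enum value, two flags can never be "both on" at once; the only remaining question is which one wins, and that is decided in one place.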

DJMcMayhem commented 8 years ago

@Wheatwizard That seems too complicated to me. One, because decimal is the default, so there's no reason to specify it, and two, because I don't see why it's necessary to have a distinction between ASCII and UTF-8. I'd like to do it like this:

-a sets ASCII/UTF8 input but does not change output
-A sets ASCII/UTF8 output but does not change input
-c sets ASCII/UTF8 input and output.

Obviously, if we did this we'd have to change ASCII's "Wrap at 127" to "Wrap at 65,536". Do you have any objections?
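
A minimal Python sketch of what changing the wrap limit would mean for output. It assumes that "wrap" means reducing the value modulo the limit and that "wrap at 127" corresponds to a modulus of 128; `to_char` and both constants are hypothetical names, not taken from the interpreter.

```python
ASCII_LIMIT = 128      # assumed: "wrap at 127" means values are reduced modulo 128
WIDE_LIMIT = 65_536    # the proposed limit, covering U+0000 through U+FFFF


def to_char(value: int, limit: int) -> str:
    """Reduce a stack value into range and return the character at that code point."""
    return chr(value % limit)


print(to_char(65, ASCII_LIMIT))        # A
print(to_char(65 + 128, ASCII_LIMIT))  # A again, because 193 % 128 == 65
print(to_char(8364, WIDE_LIMIT))       # € survives under the wider limit
print(to_char(8364, ASCII_LIMIT))      # 8364 % 128 == 44, so this prints ","
```

Note that a limit of 65,536 covers only the Basic Multilingual Plane: Unicode code points run up to U+10FFFF, and U+D800 through U+DFFF are surrogates that cannot be encoded as UTF-8.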

1000000000 commented 8 years ago

@DJMcMayhem I think your idea for the flags is the best one proposed so far, but I think the distinction between ASCII output and Unicode output is important (I totally agree that we can do away with ASCII input though). My proposal would be to add either one or two new flags to handle Unicode output. The modifications to existing I/O flags:

-a sets ASCII/UTF8 input but does not change output
-A sets ASCII output but does not change input
-c sets ASCII output and ASCII/UTF-8 input

The new flag

-U sets UTF-8 output but does not change input

We might want to add the following flag as well, to shave an extra byte off programs that take ASCII/UTF-8 input and produce UTF-8 output

-C sets UTF-8 output and ASCII/UTF-8 input

Overall, this only increases the number of flags by one. The one thing that irks me about this scheme is that it lacks a method for returning to decimal output from ASCII output. I also think we should keep the -u flag around for backwards compatibility but either mark it as deprecated or drop it from our documentation.

Wheatwizard commented 7 years ago

Hey, what's going on with this issue? I don't care much which option we choose as long as it is not the current one. Hopefully we can get this fixed soon.

DJMcMayhem commented 7 years ago

This is done in version 1.2! Closing now.

DJMcMayhem / Brain-Flak

Unicode to decimal #53