UniString? and related changes

jckarter / clay

The Clay programming language

http://claylabs.com/clay

Closed: stepancheg closed this 11 years ago

stepancheg commented 11 years ago

This pull request contains three commits:

Simplification of String? protocols (String? is no longer overloadable; a string is a Sequence? of Char and nothing more). A lot of code already assumes that String? is a sequence (has an iterator).
Addition of the UniString? protocol. UniString? is like String?, but over UniChar.
Overload of printTo for UniString?: a sequence of UniChar is now printed encoded as UTF-8, instead of as a comma-separated sequence of characters.
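
To make the third commit concrete for readers unfamiliar with the encoding step, here is a minimal Python sketch of turning a sequence of Unicode code points into UTF-8 bytes. It is not the Clay code from the patch; the function name `encode_utf8` is hypothetical and exists only for illustration.

```python
def encode_utf8(code_points):
    """Encode an iterable of Unicode code points (ints) as UTF-8 bytes.

    Illustrative only: `encode_utf8` is a hypothetical name, not part of Clay.
    """
    out = bytearray()
    for cp in code_points:
        # Reject values outside the Unicode range and UTF-16 surrogates.
        if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"not a Unicode scalar value: {cp:#x}")
        if cp < 0x80:
            # 1 byte: 0xxxxxxx
            out.append(cp)
        elif cp < 0x800:
            # 2 bytes: 110xxxxx 10xxxxxx
            out.append(0xC0 | (cp >> 6))
            out.append(0x80 | (cp & 0x3F))
        elif cp < 0x10000:
            # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.append(0xE0 | (cp >> 12))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
        else:
            # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.append(0xF0 | (cp >> 18))
            out.append(0x80 | ((cp >> 12) & 0x3F))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
    return bytes(out)


# A sequence of code points covering all four encoded lengths.
sample = [0x68, 0xE9, 0x20AC, 0x1F600]
encoded = encode_utf8(sample)
```

Writing `encoded` to a byte stream is roughly what printing a sequence of UniChar as UTF-8 amounts to, assuming each UniChar holds one code point.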

jckarter commented 11 years ago

This looks good. What do you think about making Unicode characters and UTF-8 strings the default, so that:

Current UniChar becomes Char
Current UniString? becomes String?
String becomes an alias to UTF8[Vector[Byte]], StringLiteralRef becomes an alias to UTF8[ByteStringLiteralRef], etc.

You'd lose O(1) random access and size, but the world runs on UTF-8 anyway, and it'd be nice for strings to be Unicode out of the box.
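
The trade-off mentioned here can be sketched in Python (not Clay). With UTF-8 storage, the byte length is known immediately, but counting characters or fetching the n-th one takes a scan over the bytes. The helper names `code_point_count` and `code_point_at` are hypothetical, and the sketch assumes the input is valid UTF-8.

```python
def code_point_count(data: bytes) -> int:
    """Count code points in valid UTF-8 by scanning every byte: O(n)."""
    # Continuation bytes look like 10xxxxxx; every other byte starts a code point.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def code_point_at(data: bytes, index: int) -> str:
    """Return the index-th code point of valid UTF-8 by scanning from the start: O(n)."""
    if index < 0:
        raise IndexError("negative index")
    seen = -1      # index of the most recently seen code point
    start = None   # byte offset where the requested code point begins
    for pos, b in enumerate(data):
        if (b & 0xC0) != 0x80:  # lead byte: a new code point starts here
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                # The next code point begins, so the requested one ends here.
                return data[start:pos].decode("utf-8")
    if start is not None:
        # The requested code point is the last one in the buffer.
        return data[start:].decode("utf-8")
    raise IndexError("code point index out of range")


data = "h\u00e9\u20ac\U0001F600".encode("utf-8")
byte_length = len(data)            # O(1), but counts bytes, not characters
char_count = code_point_count(data)
third = code_point_at(data, 2)
```

A fixed-width representation would answer both queries in constant time, at the cost of more memory for mostly-ASCII text.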

stepancheg commented 11 years ago

@jckarter what should the current String? be renamed to?

jckarter commented 11 years ago

Maybe ByteString (and ByteChar)?

stepancheg commented 11 years ago

@jckarter I need to think about it. May I push these patches in the meantime?

jckarter commented 11 years ago

That seems reasonable. I didn't mean to suggest that you migrate everything to Unicode now; that's obviously going to need more work and design.