gavinking closed this issue 9 years ago
Here is a Java program that scrapes Java for the Unicode general categories of all characters, along with its output. I think we should include these ranges in the language module. Something similar here.
While we're at it, we should scrape java.lang.Character.isDefined() to get a list of legal codepoints, in order to fix the impl of Integer.character and let me add a Boolean codePoint attribute to Integer.
I can certainly port that Java program to JS. Just please, don't ever again complain about the size of the JS language module, since stuff like this will inevitably bloat it big time.
:-(
@chochos What I suggest we do here is separate out these lists of characters to a separate file (or even files) and load it (them) lazily when needed. WDYT?
That should work. Are there any other resources in the language module already, or is this going to be the first one?
Fixed by ceylon/ceylon.language@32dd5d22fa24a4cd5bc86489f9e94644f437b62a
Excellent, thanks. How big is this file? Does it get fetched eagerly or lazily?
46.7KB, and I don't think source files can be lazily loaded. Maybe the info could be stored in a resource file, like we do for ceylon.locale?
Yeah, it's a source file. The only way to lazily load this is as a separate module (basically include the original module in CommonJS format).
I suppose a resource file could work, but I've no idea how to transform this into plain text and I really don't want to use eval.
Couldn't we make it JSON and use JSON.parse()?
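To make the JSON idea concrete, here is a rough sketch of how it might look. Everything in it is an assumption for illustration: the names (RESOURCE_TEXT, loadCategories, inRanges, isLetter) are hypothetical, the data is a tiny Basic Latin subset rather than the real scraped ranges, and the resource text is inlined so the snippet runs on its own (in the module it would presumably be read from a separate file on first use). It also only checks Lu and Ll, not the full set of letter categories.

```javascript
// Hypothetical resource contents: general category -> sorted, non-overlapping,
// inclusive [start, end] code point ranges. Illustrative subset only
// (Basic Latin letters), NOT the real scraped data.
const RESOURCE_TEXT = '{"Lu": [[65, 90]], "Ll": [[97, 122]]}';

// Parsed table, filled in on first use so the cost is only paid when needed.
let categories = null;

function loadCategories() {
  if (categories === null) {
    // JSON.parse only accepts data, so no eval is involved.
    categories = JSON.parse(RESOURCE_TEXT);
  }
  return categories;
}

// Binary search for codePoint in a sorted list of inclusive ranges.
function inRanges(ranges, codePoint) {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [start, end] = ranges[mid];
    if (codePoint < start) {
      hi = mid - 1;
    } else if (codePoint > end) {
      lo = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}

// Simplified letter check: only the Lu and Ll categories from the sketch data.
function isLetter(codePoint) {
  const cats = loadCategories();
  return ["Lu", "Ll"].some((name) => inRanges(cats[name], codePoint));
}
```

With range pairs instead of one entry per character the file should stay fairly compact, and nothing gets parsed until the first call that actually needs the tables.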
Wow, the implementation of Character.letter for JS is totally utterly broken. We need to fix it urgently.