Branch name: update_script_samples
Purpose of code changes on this branch: generate script samples based on
exemplar data
When reviewing my code changes, please focus on:
- code
- data changes
- text samples changes
The main code is a tool that takes the exemplar data and merges it to come up
with a single set of exemplar characters. There's a bunch of
currently-disabled code there that I used to look at the exemplar data and try
different ways to do things. I'm not sure whether this should be removed, but I
figured it should at least be part of the file history in the first checkin.
This code copies and modifies some code from the tool used to generate the
website. That code should be unified and moved somewhere else, but that's for
a later changelist I think.
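For reviewers who want a mental model of the merge step, here is a minimal sketch, not the code on this branch. It assumes the exemplar data is available as a dict from locale tag to a set of exemplar strings for one script; the name merge_exemplars and the ordering rule (most widely shared first) are hypothetical choices made for illustration.

```python
from collections import Counter


def merge_exemplars(exemplars_by_locale):
    """Merge per-locale exemplar sets into one ordered list.

    exemplars_by_locale is a hypothetical dict mapping a locale tag to the
    set of exemplar strings for that locale (all in the same script).
    """
    counts = Counter()
    for chars in exemplars_by_locale.values():
        # Count each exemplar once per locale that uses it.
        counts.update(set(chars))
    # Most widely used exemplars first; ties broken by string order so the
    # result is deterministic.
    return sorted(counts, key=lambda ch: (-counts[ch], ch))


merged = merge_exemplars({
    "en": set("abc"),
    "fr": set("abé"),
    "de": set("abü"),
})
```

Here the exemplars shared by all three locales sort ahead of the ones specific to a single locale.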
The data changes are small changes to extra_locale_data and unicode_data. The
additions to extra_locale_data take the ranges that were used to generate some
of the original samples, for the scripts that don't have exemplar data, and
encode them as UnicodeSets in the exemplar data format. This lets the new code
generate (approximately) the old samples where we don't have any new data. The
other data change provides an API to perform a simple lower-to-upper-case
mapping based on the unicodedata.
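The two data changes can be illustrated roughly as follows. This is a sketch under assumptions, not the code in extra_locale_data or unicode_data: the function names are hypothetical, the UnicodeSet is written with \uXXXX escapes rather than whatever form the branch actually stores, and the case mapping is read from field 12 (Simple_Uppercase_Mapping) of lines in UnicodeData.txt format.

```python
def ranges_to_unicodeset(ranges):
    """Encode inclusive (start, end) code point ranges as a UnicodeSet string."""
    def esc(cp):
        # Escaped form avoids clashes with UnicodeSet syntax characters.
        return "\\u%04X" % cp if cp <= 0xFFFF else "\\U%08X" % cp

    parts = []
    for start, end in ranges:
        parts.append(esc(start) if start == end else esc(start) + "-" + esc(end))
    return "[" + "".join(parts) + "]"


def parse_simple_upper(lines):
    """Build a lower-to-upper code point map from UnicodeData.txt-style lines."""
    mapping = {}
    for line in lines:
        fields = line.strip().split(";")
        # Field 0 is the code point, field 12 the simple uppercase mapping.
        if len(fields) < 15 or not fields[12]:
            continue
        mapping[int(fields[0], 16)] = int(fields[12], 16)
    return mapping


def to_upper(cp, mapping):
    """Return the simple uppercase of cp, or cp itself if there is none."""
    return mapping.get(cp, cp)


sample_lines = [
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    "00DF;LATIN SMALL LETTER SHARP S;Ll;0;L;;;;;N;;;;;",
]
upper_map = parse_simple_upper(sample_lines)
khmer_set = ranges_to_unicodeset([(0x1780, 0x17A2), (0x17D4, 0x17D4)])
```

A simple mapping is always one code point to one code point, so a character such as U+00DF, which has no single-character uppercase, maps to itself here.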
The text samples are generated straight from the tool. In some cases they are
the same. Some are new. Most are changed somewhat because of the attempt to
include some characters specific to each language that uses the script. This
puts a lot more accented Latin characters in the Latin sample, for
example. The idea here is to give some idea of the breadth of coverage.
Unfortunately, scripts used by many languages tend to have lots of characters,
so we cannot show all of them and need to choose what to show.
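One way to picture the trade-off is a fixed-size sample that reserves a few slots for language-specific characters. The sketch below is only an illustration of that idea and may not match what the tool does; pick_sample, its parameters, and the round-robin policy are all assumptions.

```python
from itertools import zip_longest


def pick_sample(common, specific_by_lang, limit, reserve):
    """Pick up to `limit` characters, keeping `reserve` slots for
    language-specific characters (hypothetical selection policy)."""
    # Start with the characters shared across languages.
    sample = list(common[:max(limit - reserve, 0)])
    seen = set(sample)
    # Then take one character per language in turn, so that no single
    # language uses up all of the remaining slots.
    for group in zip_longest(*specific_by_lang.values()):
        for ch in group:
            if len(sample) >= limit:
                return sample
            if ch is not None and ch not in seen:
                seen.add(ch)
                sample.append(ch)
    return sample


latin_sample = pick_sample(
    list("abcdef"),
    {"fr": ["é", "è"], "de": ["ü", "ö"], "es": ["ñ"]},
    limit=8,
    reserve=3,
)
```

With these inputs each of the three languages contributes one character before the limit is reached.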
rt897bbc9d8243 - update script samples (Roozbeh, please check)
r6f34eaa1192f - update tool
rd410aa2b2d11 - tool
rfcb5b96b0401 - data
After the review, I'll merge this branch into:
/master
Original issue reported on code.google.com by dougf...@google.com on 6 May 2015 at 10:07