Closed simoncozens closed 10 months ago
Could we separate restructuring without changing the subset:codepoints mappings from altering what's actually in the subsets? The former is fine; the latter we want to be somewhat careful with, as it will impact user traffic. It's hard to understand which mappings changed when it's all mixed together.
> Could we separate restructuring

Isn't that in the commit structure already?
I mean separate PRs, as the natural way to review is per PR, not per commit.
OK, I split this into 2 PRs, #3 which is the do-nothing reorganisation, and then this one which applies on top of it.
Ty. If you set the base branch for this PR, we'll see the right diff by default. GH has become good about updating it on merge (e.g. merge #3, and this one will then target main).
I merged the other PR. Could you rebase this one? Then I'll review.
This PR:
Lib/gfsubsets/data directory, to be used by a soon-to-come Python library.

The only codepoints removed from nam files are those which absolutely shouldn't be there. For example, when we had sample texts in transliterated Sanskrit, we tried adding things like DEVANAGARI SIGN AVAGRAHA to various subsets to try to make them work. That wasn't a good idea, and they should be removed. Everything else is a superset of what was there before.
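
For reviewers who want to check the superset claim rather than read the diff, here is a minimal sketch of one way to do it. It is not part of this PR. It assumes each codepoint line in a nam file starts with `0x` followed by hex digits and that every other line can be ignored; the helper names `parse_nam` and `removed_codepoints` and the sample data are hypothetical.

```python
import re
from typing import Dict, Set

# Assumption: a codepoint line starts with "0x" + hex digits; other lines are skipped.
_CODEPOINT = re.compile(r"^\s*0x([0-9A-Fa-f]+)")


def parse_nam(text: str) -> Set[int]:
    """Return the set of codepoints listed in the text of one nam file."""
    codepoints = set()
    for line in text.splitlines():
        match = _CODEPOINT.match(line)
        if match:
            codepoints.add(int(match.group(1), 16))
    return codepoints


def removed_codepoints(
    old: Dict[str, Set[int]], new: Dict[str, Set[int]]
) -> Dict[str, Set[int]]:
    """Map each subset name to the codepoints present in old but missing from new."""
    removed = {}
    for name, old_cps in old.items():
        missing = old_cps - new.get(name, set())
        if missing:
            removed[name] = missing
    return removed


# Hypothetical sample data standing in for the nam files before and after the change.
old = {"latin": parse_nam("0x0041 LATIN CAPITAL LETTER A\n0x093D DEVANAGARI SIGN AVAGRAHA\n")}
new = {"latin": parse_nam("# comment\n0x0041 LATIN CAPITAL LETTER A\n0x0042 LATIN CAPITAL LETTER B\n")}

report = removed_codepoints(old, new)
for name, cps in sorted(report.items()):
    print(name, [f"U+{cp:04X}" for cp in sorted(cps)])
```

An empty report would mean every subset is a superset of its previous contents; any entries that do show up are the removals to review by hand.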