Open samcv opened 7 years ago
Can this be somehow generated instead of done manually?
To generate it I would probably have to write a script that reads the UNIDATA files. It may make sense to generate them only after we already have some tests in place; otherwise, how would we test the generator?
More importantly, a large number of these are derived properties, whose values depend on how a number of other properties are set, so generating them would be quite hard.
Once we have tests for all of them, it may be a good idea to put some generation in place for the properties that are easy to generate (those that don't rely on a large number of other properties).
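For the easy, non-derived properties, a generator could look roughly like the sketch below. It assumes the usual `range ; property # comment` layout of the Unicode data files. The sample lines are illustrative, the helper names are hypothetical, and the emitted test-line format is only a guess at what would fit uniprop.t.

```python
from typing import Iterator, List, Tuple

# Illustrative sample in the "range ; property # comment" layout.
# A real generator would read the data file from disk instead.
SAMPLE = """\
# comment-only line
0023          ; Emoji            # number sign
0030..0039    ; Emoji            # digit zero..digit nine
1F3FB..1F3FF  ; Emoji_Modifier   # skin tone modifiers
"""


def parse_ranges(text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (first, last, property) for every data line in the text."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single codepoint: no ".."
        yield start, end, fields[1]


def boundary_tests(text: str) -> List[str]:
    """Emit one test line per range boundary (hypothetical output format)."""
    lines = []
    for start, end, prop in parse_ranges(text):
        for cp in sorted({start, end}):  # a single codepoint is emitted once
            lines.append(
                f"is-deeply 0x{cp:04X}.uniprop('{prop}'), True, "
                f"\"U+{cp:04X} has {prop}\";"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(boundary_tests(SAMPLE)))
```

Testing only the boundaries of each range keeps the generated file small. Derived properties would still need hand-written tests, for the reason given above.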
I'd like to help out with this; can I just start adding tests for properties which haven't been checked off yet?
Yep! See S15-unicode-information/uniprop.t for the properties.
Great, thanks! Already looking at it. :-)
Okay, I've now made PR #222. I only changed five properties, so you can make sure I'm on the right track.
Good, thank you :)
Emoji:
See http://unicode.org/reports/tr51/#Data_Files for how they are determined (these are Boolean properties).
The values are stored here: http://unicode.org/Public/emoji/latest/emoji-data.txt
If you haven't seen this before, it is a really great resource for checking which symbols have a given property:
http://unicode.org/cldr/utility/properties.html
Emoji_Zwj_Sequences is a property of a sequence of codepoints rather than of a single codepoint, so we may need a separate routine to query it.
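To show what such a routine might need, here is a minimal sketch that keys the lookup on a tuple of codepoints instead of a single codepoint. The sample lines, the `Emoji_ZWJ_Sequence` type label, and the function names are assumptions made for the example. They are not an existing API.

```python
from typing import Dict, Tuple

# Illustrative lines in a "codepoints ; type ; description" layout.
SAMPLE = """\
1F468 200D 2764 FE0F 200D 1F468 ; Emoji_ZWJ_Sequence ; couple with heart: man, man
1F441 FE0F 200D 1F5E8 FE0F      ; Emoji_ZWJ_Sequence ; eye in speech bubble
"""


def load_sequences(text: str) -> Dict[Tuple[int, ...], str]:
    """Map each codepoint sequence (as a tuple of ints) to its type label."""
    table: Dict[Tuple[int, ...], str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        key = tuple(int(cp, 16) for cp in fields[0].split())
        table[key] = fields[1]
    return table


def seq_has_prop(table: Dict[Tuple[int, ...], str], s: str, prop: str) -> bool:
    """True if the whole string, taken as one sequence, carries the property."""
    return table.get(tuple(ord(ch) for ch in s)) == prop


if __name__ == "__main__":
    table = load_sequences(SAMPLE)
    couple = "\U0001F468\u200D\u2764\uFE0F\u200D\U0001F468"
    print(seq_has_prop(table, couple, "Emoji_ZWJ_Sequence"))
    print(seq_has_prop(table, "\U0001F468", "Emoji_ZWJ_Sequence"))
```

The point of the tuple key is that the lookup is all-or-nothing: a single codepoint taken from a listed sequence does not have the property by itself.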
Numeric Properties
String Properties
Miscellaneous Properties
Name_Alias and Script_Extensions can hold multiple values. How we will access them once they are added to a backend has not yet been determined.
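One possible shape for the access, offered only as a sketch: return every value of a multi-valued property as a tuple. The sample lines are illustrative and not tied to a particular Unicode version, and the function names are hypothetical.

```python
from typing import Dict, Tuple

# Illustrative lines in a "range ; value value ... # comment" layout.
SAMPLE = """\
1CD0        ; Beng Deva Gran Knda  # illustrative values
1CD5..1CD6  ; Beng Deva            # illustrative values
"""


def load_multi(text: str) -> Dict[int, Tuple[str, ...]]:
    """Map each codepoint to the tuple of values listed for it."""
    table: Dict[int, Tuple[str, ...]] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        cps, values = (f.strip() for f in line.split(";", 1))
        first, _, last = cps.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        vals = tuple(values.split())
        for cp in range(start, end + 1):  # ranges are inclusive
            table[cp] = vals
    return table


def multi_prop(table: Dict[int, Tuple[str, ...]], cp: int) -> Tuple[str, ...]:
    """Return all listed values, or an empty tuple if the codepoint is unlisted.

    Simplification: the real Script_Extensions default for an unlisted
    codepoint is its Script value, which this sketch does not model.
    """
    return table.get(cp, ())


if __name__ == "__main__":
    table = load_multi(SAMPLE)
    print(multi_prop(table, 0x1CD0))
    print(multi_prop(table, 0x0041))
```

Returning a tuple leaves it to the caller to pick one value, join them, or test membership, which keeps the single-valued interface unchanged.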
Catalog Properties
Enumerated Properties
Binary Properties
Total: 118 + 6 Emoji
Implementation specific properties
These are not official Unicode properties and should not have tests written for them. They are listed here for completeness.